Bio.ExPASy.cellosaurus module
Parser for the cellosaurus.txt file from ExPASy.
See https://web.expasy.org/cellosaurus/
Tested with the release of Version 18 (July 2016).
- Functions:
read - Reads a file containing one cell line entry
parse - Reads a file containing multiple cell line entries
- Classes:
Record - Holds cell line data.
Examples
This example downloads the Cellosaurus database and parses it. Note that urlopen returns a stream of bytes, while the parser expects a stream of text, so we use TextIOWrapper to decode the bytes as UTF-8. This is not needed if you download the cellosaurus.txt file in advance and open it in text mode (see the comment below).
>>> from urllib.request import urlopen
>>> from io import TextIOWrapper
>>> from Bio.ExPASy import cellosaurus
>>> url = "ftp://ftp.expasy.org/databases/cellosaurus/cellosaurus.txt"
>>> bytestream = urlopen(url)
>>> textstream = TextIOWrapper(bytestream, "UTF-8")
>>> # alternatively, use
>>> # textstream = open("cellosaurus.txt")
>>> # if you downloaded the cellosaurus.txt file in advance.
>>> records = cellosaurus.parse(textstream)
>>> for record in records:
...     if 'Homo sapiens' in record['OX'][0]:
...         print(record['ID'])
...
#15310-LN
#W7079
(L)PC6
0.5alpha
...
- Bio.ExPASy.cellosaurus.parse(handle)
Parse cell line records.
This function is for parsing cell line files containing multiple records.
- Arguments:
handle - handle to the file.
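Any text-mode handle can be passed to parse, not only a downloaded file. The following is a minimal sketch, assuming Biopython is installed; the two entries are hand-written, abbreviated placeholders in the cellosaurus.txt line-code layout (not data from a real release), and the names sample and human_ids are hypothetical.

```python
from io import StringIO

from Bio.ExPASy import cellosaurus

# Hand-written placeholder entries (illustrative only, not real release data).
# Each line is a two-letter code, three spaces, then the value.
sample = (
    "ID   Example-1\n"
    "AC   CVCL_0000\n"
    "OX   NCBI_TaxID=9606; ! Homo sapiens\n"
    "CA   Cancer cell line\n"
    "//\n"
    "ID   Example-2\n"
    "AC   CVCL_0001\n"
    "OX   NCBI_TaxID=10090; ! Mus musculus\n"
    "CA   Spontaneously immortalized cell line\n"
    "//\n"
)

# parse() yields one Record per entry, so it can be consumed lazily.
human_ids = [
    record["ID"]
    for record in cellosaurus.parse(StringIO(sample))
    if "Homo sapiens" in record["OX"][0]
]
print(human_ids)
```

Because the records are produced one at a time, the same loop should scale to the full database without holding every entry in memory.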
- Bio.ExPASy.cellosaurus.read(handle)
Read one cell line record.
This function is for parsing cell line files containing exactly one record.
- Arguments:
handle - handle to the file.
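As a minimal sketch of read, assuming Biopython is installed, the snippet below parses a single hand-written placeholder entry from an in-memory handle. The entry text and the name entry are hypothetical and do not come from a real release.

```python
from io import StringIO

from Bio.ExPASy import cellosaurus

# One hand-written placeholder entry (illustrative only).
entry = (
    "ID   Example-1\n"
    "AC   CVCL_0000\n"
    "OX   NCBI_TaxID=9606; ! Homo sapiens\n"
    "CA   Cancer cell line\n"
    "//\n"
)

# read() returns a single Record instead of an iterator of Records.
record = cellosaurus.read(StringIO(entry))
print(record["ID"], record["AC"])
```

For a file that may hold more than one entry, parse is the safer choice.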
- class Bio.ExPASy.cellosaurus.Record
Bases: dict
Holds information from an ExPASy Cellosaurus record as a Python dictionary.
Each record contains the following keys:
---------  ------------------------------  -----------------------
Line code  Content                         Occurrence in an entry
---------  ------------------------------  -----------------------
ID         Identifier (cell line name)     Once; starts an entry
AC         Accession (CVCL_xxxx)           Once
AS         Secondary accession number(s)   Optional; once
SY         Synonyms                        Optional; once
DR         Cross-references                Optional; once or more
RX         References identifiers          Optional; once or more
WW         Web pages                       Optional; once or more
CC         Comments                        Optional; once or more
ST         STR profile data                Optional; twice or more
DI         Diseases                        Optional; once or more
OX         Species of origin               Once or more
HI         Hierarchy                       Optional; once or more
OI         Originate from same individual  Optional; once or more
SX         Sex of cell                     Optional; once
AG         Age of donor at sampling        Optional; once
CA         Category                        Once
DT         Date (entry history)            Once
//         Terminator                      Once; ends an entry
---------  ------------------------------  -----------------------
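The occurrence column affects how values are accessed. The sketch below assumes that the parser stores once-only fields as a single string and repeatable fields as a list with one item per line, which is consistent with the record['OX'][0] access in the module example. It also assumes Biopython is installed; the entry is a hand-written placeholder, not real release data.

```python
from io import StringIO

from Bio.ExPASy import cellosaurus

# Hand-written placeholder entry with a repeated CC line (illustrative only).
entry = (
    "ID   Example-1\n"
    "AC   CVCL_0000\n"
    "CC   First placeholder comment.\n"
    "CC   Second placeholder comment.\n"
    "OX   NCBI_TaxID=9606; ! Homo sapiens\n"
    "SX   Female\n"
    "CA   Cancer cell line\n"
    "//\n"
)

record = cellosaurus.read(StringIO(entry))

# Assumed behavior: a once-only field (SX) is stored as a plain string.
print(record["SX"])

# Assumed behavior: a repeatable field (CC) is a list, one item per line.
for comment in record["CC"]:
    print(comment)
```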
- __init__()
Initialize the class.
- __repr__()
Return the canonical string representation of the Record object.
- __str__()
Return a readable string representation of the Record object.