Bio.SwissProt package

Module contents

Code to work with the sprotXX.dat file from SwissProt.

  • Record Holds SwissProt data.

  • Reference Holds reference data from a SwissProt record.

  • read Read one SwissProt record.

  • parse Read multiple SwissProt records.

class Bio.SwissProt.Record

Bases: object

Holds information from a SwissProt record.

  • entry_name Name of this entry, e.g. RL1_ECOLI.

  • data_class Either ‘STANDARD’ or ‘PRELIMINARY’.

  • molecule_type Type of molecule, e.g. ‘PRT’.

  • sequence_length Number of residues.

  • accessions List of the accession numbers, e.g. [‘P00321’].

  • created A tuple of (date, release).

  • sequence_update A tuple of (date, release).

  • annotation_update A tuple of (date, release).

  • description Free-format description.

  • gene_name Gene name. See userman.txt for description.

  • organism The source of the sequence.

  • organelle The origin of the sequence.

  • organism_classification The taxonomy classification. List of strings.

  • taxonomy_id A list of NCBI taxonomy IDs.

  • host_organism A list of names of the hosts of a virus, if any.

  • host_taxonomy_id A list of NCBI taxonomy IDs of the hosts, if any.

  • references List of Reference objects.

  • comments List of strings.

  • cross_references List of tuples (db, id1[, id2][, id3]). See the docs.

  • keywords List of the keywords.

  • features List of tuples (key name, from, to, description). from and to can be integers for the residue numbers, ‘<’, ‘>’, or ‘?’.

  • protein_existence Numerical value describing the evidence for the existence of the protein.

  • seqinfo Tuple of (length, molecular weight, CRC32 value).

  • sequence The sequence.


>>> import Bio.SwissProt as sp
>>> example_filename = "SwissProt/sp008"
>>> with open(example_filename) as handle:
...     records = sp.parse(handle)
...     for record in records:
...         print(record.entry_name)
...         print(",".join(record.accessions))
...         print(record.keywords)
...         print(repr(record.organism))
...         print(record.sequence[:20] + "...")
...
['MHC I', 'Transmembrane', 'Glycoprotein', 'Signal', 'Polymorphism', '3D-structure']
'Homo sapiens (Human).'
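The tuple shapes documented for features and cross_references can be unpacked with ordinary Python. The following is a minimal sketch and not part of Bio.SwissProt: feature_span and split_cross_reference are hypothetical helpers, the sample tuples are made up to match the documented shapes, and the span calculation assumes that from and to are inclusive residue numbers.

```python
def feature_span(feature):
    """Return the number of residues a feature covers, or None if unknown.

    Assumes the documented shape (key name, from, to, description) and
    that from/to are inclusive residue numbers when they are integers.
    """
    _key, start, end, _description = feature
    if isinstance(start, int) and isinstance(end, int):
        return end - start + 1
    # A non-integer position ('<', '>' or '?') means the span is not exact.
    return None


def split_cross_reference(xref):
    """Split a (db, id1[, id2][, id3]) tuple into (db, list of ids)."""
    db, *ids = xref
    return db, ids


# Made-up sample data in the documented shapes (not from a real entry).
features = [
    ("SIGNAL", 1, 24, "Example signal peptide."),
    ("CHAIN", 25, "?", "Example chain with unknown end."),
]
cross_references = [
    ("EMBL", "X00001", "CAA00001.1"),
    ("PROSITE", "PS00001"),
]

for feature in features:
    print(feature[0], feature_span(feature))
for xref in cross_references:
    print(split_cross_reference(xref))
```

With a real record, the same helpers would be applied to record.features and record.cross_references.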

__init__()

Initialize the class.

class Bio.SwissProt.Reference

Bases: object

Holds information from one reference in a SwissProt entry.

  • number Number of reference in an entry.

  • evidence Evidence code. List of strings.

  • positions Describes extent of work. List of strings.

  • comments Comments. List of (token, text).

  • references References. List of (dbname, identifier).

  • authors The authors of the work.

  • title Title of the work.

  • location A citation for the work.
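As an illustration of how these attributes fit together, the sketch below formats one reference and indexes its database identifiers. It is not part of Bio.SwissProt: reference_ids and short_citation are hypothetical helpers, and the SimpleNamespace object is a stand-in with made-up values that mimics only the documented attributes.

```python
from types import SimpleNamespace


def reference_ids(reference):
    """Map each database name in reference.references to its identifier.

    If a database name occurs more than once, the last identifier wins.
    """
    return {dbname: identifier for dbname, identifier in reference.references}


def short_citation(reference):
    """Join number, authors, title and location into a single line."""
    return (
        f"[{reference.number}] {reference.authors} "
        f"{reference.title} {reference.location}"
    )


# Stand-in for a Reference object; every value here is invented.
demo_reference = SimpleNamespace(
    number=1,
    authors="Doe J., Roe R.",
    title="An example title.",
    location="Example J. 1:1-10(2000).",
    references=[("PubMed", "0000000"), ("DOI", "10.0000/example")],
)

print(short_citation(demo_reference))
print(reference_ids(demo_reference).get("PubMed"))
```

For a parsed record, the same functions would be called on each item of record.references.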


__init__()

Initialize the class.


Bio.SwissProt.parse(handle)

Read multiple SwissProt records from file handle.

Returns a generator object that yields Bio.SwissProt.Record objects.
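Because the result is a generator, records are produced one at a time and only as far as the caller iterates. The sketch below shows one way to take just the first few entries. It assumes nothing beyond an iterator of objects with an entry_name attribute: first_entry_names and fake_records are hypothetical names, and the stand-in generator is used so the example runs without a SwissProt file.

```python
from itertools import islice
from types import SimpleNamespace


def first_entry_names(records, limit):
    """Return entry_name for at most `limit` records.

    islice stops pulling from the iterator once `limit` items are taken,
    so the remaining records are never produced.
    """
    return [record.entry_name for record in islice(records, limit)]


def fake_records():
    """Stand-in generator with made-up entry names."""
    for name in ("AAAA_HUMAN", "BBBB_ECOLI", "CCCC_YEAST"):
        yield SimpleNamespace(entry_name=name)


print(first_entry_names(fake_records(), 2))

# With Biopython installed, the same helper would be used like this.
# The call stays inside the with block because the generator reads
# from the handle lazily:
#
#     from Bio import SwissProt
#     with open("SwissProt/sp008") as handle:
#         names = first_entry_names(SwissProt.parse(handle), 10)
```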

Bio.SwissProt.read(handle)

Read one SwissProt record from file handle.

Returns a Record object.
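A minimal sketch of reading a single-record file and summarizing it follows. The names summarize and load_and_summarize are hypothetical helpers, not part of Bio.SwissProt, and the demo object is a stand-in with made-up values. The Biopython import is kept inside load_and_summarize so that the rest of the sketch runs without Biopython installed.

```python
from types import SimpleNamespace


def summarize(record):
    """Build a one-line summary from documented Record attributes."""
    length, molecular_weight, _crc = record.seqinfo
    primary = record.accessions[0] if record.accessions else "n/a"
    return f"{record.entry_name} ({primary}): {length} aa, {molecular_weight} Da"


def load_and_summarize(path):
    """Read a file expected to hold one record and summarize it.

    Requires Biopython; `path` is supplied by the caller.
    """
    from Bio import SwissProt

    with open(path) as handle:
        return summarize(SwissProt.read(handle))


# Stand-in for a Record object; every value here is invented.
demo = SimpleNamespace(
    entry_name="DEMO_HUMAN",
    accessions=["P00000"],
    seqinfo=(120, 13500, "0000"),
)
print(summarize(demo))
```

For files that may hold several records, parse is the appropriate function instead of read.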