Bio.SCOP package

Module contents

SCOP: Structural Classification of Proteins.

The SCOP database aims to provide a manually constructed classification of all known protein structures into a hierarchy, the main levels of which are family, superfamily and fold.

The Scop object in this module represents the entire SCOP classification. It can be built from the three SCOP parsable files, modified if so desired, and converted back to the same file formats. A single SCOP domain (represented by the Domain class) can be obtained from Scop using the domain’s SCOP identifier (sid).

  • nodeCodeDict – A mapping between known 2 letter node codes and a longer description. The known node types are ‘cl’ (class), ‘cf’ (fold), ‘sf’ (superfamily), ‘fa’ (family), ‘dm’ (protein), ‘sp’ (species), ‘px’ (domain). Additional node types may be added in the future.

This module also provides code to access SCOP over the WWW.

  • search – Access the main CGI script.

  • _open – Internally used function.

Bio.SCOP.cmp_sccs(sccs1, sccs2)

Order SCOP concise classification strings (sccs).

a.4.5.1 < a.4.5.11 < b.1.1.1

A sccs (e.g. a.4.5.11) compactly represents a domain’s classification. The letter represents the class, and the numbers are the fold, superfamily, and family, respectively.
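The ordering rule can be illustrated with a small standalone sketch. It does not call Bio.SCOP; `sccs_key` and `compare_sccs` are hypothetical helper names, and the sketch assumes every sccs is a class letter followed by dot-separated integers. The numeric fields are compared as integers, so a.4.5.2 sorts before a.4.5.11 even though a plain string comparison would put it after.

```python
def sccs_key(sccs):
    """Split an sccs such as 'a.4.5.11' into a sortable key."""
    class_letter, *numbers = sccs.split(".")
    # Compare fold, superfamily and family as integers, not as text.
    return (class_letter, tuple(int(n) for n in numbers))


def compare_sccs(sccs1, sccs2):
    """Return a negative, zero or positive value, like a classic cmp function."""
    key1, key2 = sccs_key(sccs1), sccs_key(sccs2)
    return (key1 > key2) - (key1 < key2)


ordered = sorted(["b.1.1.1", "a.4.5.11", "a.4.5.2", "a.4.5.1"], key=sccs_key)
print(ordered)  # ['a.4.5.1', 'a.4.5.2', 'a.4.5.11', 'b.1.1.1']
```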


Convert an ASTRAL header string into a Scop domain.

An ASTRAL header contains a concise description of a SCOP domain. A very similar format is used when a Domain object is converted into a string. The Domain returned by this method contains most of the SCOP information, but it will not be located within the SCOP hierarchy (i.e. the parent node will be None). The description is composed of the SCOP protein and species descriptions.

A typical ASTRAL header looks like:

>d1tpt_1 a.46.2.1 (1-70) Thymidine phosphorylase {Escherichia coli}
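The layout of such a header can be shown with a standalone sketch that splits it into its parts using a regular expression. This is not the parser used by Bio.SCOP; the pattern, the field names and `split_astral_header` are assumptions made for illustration, based only on the example header above.

```python
import re

# Pattern for headers such as:
#   >d1tpt_1 a.46.2.1 (1-70) Thymidine phosphorylase {Escherichia coli}
ASTRAL_HEADER = re.compile(
    r">?(?P<sid>\S+)\s+(?P<sccs>\S+)\s+\((?P<residues>[^)]*)\)\s+(?P<description>.*)"
)


def split_astral_header(header):
    """Return the sid, sccs, residues and description fields as a dict."""
    match = ASTRAL_HEADER.match(header.strip())
    if match is None:
        raise ValueError(f"Not an ASTRAL-style header: {header!r}")
    return match.groupdict()


fields = split_astral_header(
    ">d1tpt_1 a.46.2.1 (1-70) Thymidine phosphorylase {Escherichia coli}"
)
print(fields["sid"], fields["sccs"], fields["residues"])
print(fields["description"])
```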

class Bio.SCOP.Scop(cla_handle=None, des_handle=None, hie_handle=None, dir_path=None, db_handle=None, version=None)

Bases: object

The entire SCOP hierarchy.

root – The root node of the hierarchy

__init__(self, cla_handle=None, des_handle=None, hie_handle=None, dir_path=None, db_handle=None, version=None)

Build the SCOP hierarchy from the SCOP parsable files, or a sql backend.

If no file handles are given, then a Scop object with a single empty root node is returned.

If a directory and version are given (with dir_path=…, version=…), or file handles for each file, the whole SCOP tree will be built in memory.

If a MySQLdb database handle is given, the tree will be built as needed, minimising construction times. To build the SQL database, use the write_xxx_sql methods to create the tables.


Get root node.

getDomainBySid(self, sid)

Return a domain from its sid.

getNodeBySunid(self, sunid)

Return a node from its sunid.


Return an ordered tuple of all SCOP Domains.

write_hie(self, handle)

Build an HIE SCOP parsable file from this object.

write_des(self, handle)

Build a DES SCOP parsable file from this object.

write_cla(self, handle)

Build a CLA SCOP parsable file from this object.

getDomainFromSQL(self, sunid=None, sid=None)

Load a node from the SQL backend using sunid or sid.

getAscendentFromSQL(self, node, type)

Get ascendents using SQL backend.

getDescendentsFromSQL(self, node, type)

Get descendents of a node using the database backend.

This avoids repeated iteration of SQL calls and is therefore much quicker than repeatedly calling node.getChildren().

write_hie_sql(self, handle)

Write HIE data to SQL database.

write_cla_sql(self, handle)

Write CLA data to SQL database.

write_des_sql(self, handle)

Write DES data to SQL database.

class Bio.SCOP.Node(scop=None)

Bases: object

A node in the Scop hierarchy.

  • sunid – SCOP unique identifier. e.g. ‘14986’

  • parent – The parent node

  • children – A list of child nodes

  • sccs – SCOP concise classification string. e.g. ‘a.1.1.2’

  • type – A 2 letter node type code. e.g. ‘px’ for domains

  • description – Description text.

__init__(self, scop=None)

Initialize a Node in the scop hierarchy.

If a Scop instance is provided to the constructor, it will be used to look up related references using the SQL methods. If no instance is provided, it is assumed the whole tree exists and is connected.


Represent the node as a string.


Return an Hie.Record.


Return a Des.Record.


Return a list of children of this Node.


Return the parent of this Node.

getDescendents(self, node_type)

Return a list of all descendant nodes of the given type.

Node type can be a two letter code or longer description, e.g. ‘fa’ or ‘family’.

getAscendent(self, node_type)

Return the ancestor node of the given type, or None.

Node type can be a two letter code or longer description, e.g. ‘fa’ or ‘family’.
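The two lookups can be pictured with a toy tree. The sketch below is a standalone illustration, not the Bio.SCOP.Node class: `ToyNode`, its method names and the descriptions are hypothetical, it accepts only two letter codes, and it skips the ‘dm’ and ‘sp’ levels for brevity. Looking up an ancestor walks the parent links; collecting descendants walks the child lists depth first.

```python
class ToyNode:
    """Minimal stand-in for a hierarchy node; not the Bio.SCOP.Node class."""

    def __init__(self, node_type, description, parent=None):
        self.type = node_type
        self.description = description
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def ascendent(self, node_type):
        """Walk up the parent links; return the first match, or None."""
        node = self.parent
        while node is not None:
            if node.type == node_type:
                return node
            node = node.parent
        return None

    def descendents(self, node_type):
        """Collect every node of the given type below this one, depth first."""
        found = []
        for child in self.children:
            if child.type == node_type:
                found.append(child)
            found.extend(child.descendents(node_type))
        return found


top = ToyNode("cl", "toy class")
fold = ToyNode("cf", "toy fold", parent=top)
superfamily = ToyNode("sf", "toy superfamily", parent=fold)
family = ToyNode("fa", "toy family", parent=superfamily)
domain_1 = ToyNode("px", "toy domain 1", parent=family)
domain_2 = ToyNode("px", "toy domain 2", parent=family)

print(domain_1.ascendent("fa").description)
print([node.description for node in top.descendents("px")])
```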

class Bio.SCOP.Domain(scop=None)

Bases: Bio.SCOP.Node

A SCOP domain. A leaf node in the Scop hierarchy.

  • sid - The SCOP domain identifier. e.g. "d5hbib_"

  • residues - A Residue object. It defines the collection of PDB atoms that make up this domain.

__init__(self, scop=None)

Initialize a SCOP Domain object.


Represent the SCOP Domain as a string.


Return a Des.Record.


Return a Cla.Record.

class Bio.SCOP.Astral(dir_path=None, version=None, scop=None, astral_file=None, db_handle=None)

Bases: object

Representation of the ASTRAL database.

Abstraction of the ASTRAL database, which has sequences for all the SCOP domains, as well as clusterings by percent id or evalue.

__init__(self, dir_path=None, version=None, scop=None, astral_file=None, db_handle=None)

Initialize the astral database.

You must provide either a directory of SCOP files:
  • dir_path - string, the path to the location of the scopseq-x.xx directory (not the directory itself), and

  • version - a version number.

or, a FASTA file:
  • astral_file - string, a path to a FASTA file (which will be loaded in memory)

or, a MySQL database:
  • db_handle - a database handle for a MySQL database containing a table ‘astral’ with the ASTRAL data in it. This can be created using writeToSQL.
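As a rough picture of what loading a FASTA file in memory can mean, the standalone sketch below indexes FASTA text by sid using only the standard library. It is not how the Astral class is implemented; `index_fasta_by_sid` is a hypothetical helper, the sequence letters are placeholders rather than real data, and the sketch assumes the sid is the first token of every header line.

```python
import io


def index_fasta_by_sid(handle):
    """Read FASTA text and return a dict mapping each sid to its sequence."""
    sequences = {}
    sid = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Assumption: the sid is the first whitespace-delimited token.
            sid = line[1:].split()[0]
            sequences[sid] = []
        elif sid is not None:
            sequences[sid].append(line)
    return {name: "".join(parts) for name, parts in sequences.items()}


# Placeholder sequence lines under an ASTRAL-style header.
example = io.StringIO(
    ">d1tpt_1 a.46.2.1 (1-70) Thymidine phosphorylase {Escherichia coli}\n"
    "acdefghik\n"
    "lmnpqrstvwy\n"
)
index = index_fasta_by_sid(example)
print(index["d1tpt_1"])
```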

domainsClusteredByEv(self, id)

Get domains clustered by evalue.

domainsClusteredById(self, id)

Get domains clustered by percentage identity.

getAstralDomainsFromFile(self, filename=None, file_handle=None)

Get the scop domains from a file containing a list of sids.

getAstralDomainsFromSQL(self, column)

Load ASTRAL domains from the MySQL database.

Load a set of ASTRAL domains from a column in the astral table of a MySQL database (which can be created with writeToSQL(…)).

getSeqBySid(self, domain)

Get the seq record of a given domain from its sid.

getSeq(self, domain)

Return seq associated with domain.

hashedDomainsById(self, id)

Get domains clustered by sequence identity in a dict.

hashedDomainsByEv(self, id)

Get domains clustered by evalue in a dict.

isDomainInId(self, dom, id)

Return true if the domain is in the astral clusters for percent ID.

isDomainInEv(self, dom, id)

Return true if the domain is in the ASTRAL clusters for evalues.

writeToSQL(self, db_handle)

Write the ASTRAL database to a MySQL database.

Bio.SCOP.search(…, key=None, sid=None, disp=None, dir=None, loc=None, cgi='', **keywds)

Access SCOP search and return a handle to the results.

Access search.cgi and return a handle to the results. See the online help file for an explanation of the parameters.

Raises an IOError if there’s a network error.
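How optional parameters like these might be turned into a CGI query string can be sketched with the standard library. This is an assumption-laden illustration, not the code behind Bio.SCOP.search: `build_search_query` is a hypothetical helper, it covers only the parameter names visible in the signature above, and it makes no network request.

```python
from urllib.parse import urlencode


def build_search_query(key=None, sid=None, disp=None, dir=None, loc=None, **keywds):
    """Encode the supplied search parameters as a URL query string."""
    params = {"key": key, "sid": sid, "disp": disp, "dir": dir, "loc": loc}
    params.update(keywds)
    # Leave out anything the caller did not supply.
    supplied = {name: value for name, value in params.items() if value is not None}
    return urlencode(supplied)


print(build_search_query(sid="d1tpt_1"))  # sid=d1tpt_1
```

In Python 3, IOError is an alias of OSError, so code that wraps a network call of this kind can catch either name.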