Module Reduced

Reduced alphabets which lump together several amino-acids into one letter.

Reduced (redundant or simplified) alphabets represent protein sequences using an alternative alphabet that lumps several amino acids into one letter, based on physico-chemical traits. For example, the aliphatics (I, L, V) are usually quite interchangeable, so many sequence studies group them into one letter.
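
The grouping idea can be sketched in plain Python. The `groups` argument and the `build_table` helper below are hypothetical illustrations, not part of this module; only the aliphatic group (I, L, V) is taken from the text above, and using 'L' as its representative letter is an assumption.

```python
# Hypothetical sketch: build a reduction table from residue groups.
# Only the aliphatic group (I, L, V) comes from the text; the choice of
# 'L' as the representative letter is an assumption for illustration.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes


def build_table(groups):
    """Map every amino acid to its group letter, or to itself if ungrouped."""
    table = {aa: aa for aa in AMINO_ACIDS}
    for representative, members in groups.items():
        for aa in members:
            table[aa] = representative
    return table


aliphatic_table = build_table({"L": "ILV"})
reduced = "".join(aliphatic_table[aa] for aa in "MVLIAG")
print(reduced)  # MLLLAG
```

With this single group, the 20-letter alphabet shrinks to 18 letters, because I and V are both folded into L.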

Examples of reduced alphabets are available in:

http://viscose.herokuapp.com/html/alphabets.html

The Murphy tables are from here:

Murphy L.R., Wallqvist A., Levy R.M. (2000) Simplified amino acid alphabets for protein fold recognition and implications for folding. Protein Eng. 13(3):149-152.

These alphabets were used with Bio.utils.reduce_sequence, which has been removed from Biopython. You can use these alphabets and tables like this:

>>> from Bio.Seq import Seq
>>> from Bio import Alphabet
>>> from Bio.Alphabet import Reduced
>>> my_protein = Seq('MAGSKEWKRFCELTINEA', Alphabet.ProteinAlphabet())

Now, we convert this sequence into a sequence which only recognizes polar (P) or hydrophobic (H) residues:

>>> new_protein = Seq('', Alphabet.Reduced.HPModel())
>>> for aa in my_protein:
...     new_protein += Alphabet.Reduced.hp_model_tab[aa]
>>> new_protein
Seq('HPPPPPHPPHHPHPHPPP', HPModel())
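
Because Bio.utils.reduce_sequence is no longer available, the loop above can be wrapped in a small helper. This is a minimal sketch on plain Python strings that does not require Biopython. `reduce_sequence` here is a hypothetical stand-in, not the removed function, and `hp_excerpt` holds only the eight `hp_model_tab` entries visible in the listing on this page.

```python
# Hypothetical helper; not the removed Bio.utils.reduce_sequence.
def reduce_sequence(sequence, table):
    """Translate each residue through a reduction table.

    Raises KeyError for residues missing from the table, so that an
    incomplete table fails loudly instead of silently dropping letters.
    """
    return "".join(table[aa] for aa in sequence)


# Only the hp_model_tab entries shown on this page (keys A to I);
# the remaining entries are elided in the listing and not guessed here.
hp_excerpt = {"A": "P", "C": "H", "D": "P", "E": "P",
              "F": "H", "G": "P", "H": "P", "I": "H"}

print(reduce_sequence("FACEDIG", hp_excerpt))  # HPHPPHP
```

Raising on unknown residues is a design choice for this sketch; a caller could instead pass a complete table or handle the KeyError explicitly.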

The following Alphabet classes are available:

Classes
  Murphy15
Reduced protein alphabet with 15 letters.
  Murphy10
Reduced protein alphabet with 10 letters.
  Murphy8
Reduced protein alphabet with 8 letters.
  Murphy4
Reduced protein alphabet with 4 letters.
  HPModel
Reduced protein alphabet with only two letters, for polar or hydrophobic.
  PC5
Reduced protein alphabet with 5 letters for physico-chemical properties.
Variables
  murphy_15_tab = {'A': 'A', 'C': 'C', 'D': 'D', 'E': 'E', 'F': ...
  murphy_15 = Murphy15()
  murphy_10_tab = {'A': 'A', 'C': 'C', 'D': 'E', 'E': 'E', 'F': ...
  murphy_10 = Murphy10()
  murphy_8_tab = {'A': 'A', 'C': 'L', 'D': 'E', 'E': 'E', 'F': '...
  murphy_8 = Murphy8()
  murphy_4_tab = {'A': 'A', 'C': 'L', 'D': 'E', 'E': 'E', 'F': '...
  murphy_4 = Murphy4()
  hp_model_tab = {'A': 'P', 'C': 'H', 'D': 'P', 'E': 'P', 'F': '...
  hp_model = HPModel()
  pc_5_table = {'A': 'T', 'C': 'T', 'D': 'C', 'E': 'C', 'F': 'R'...
  pc5 = PC5()
  __package__ = 'Bio.Alphabet'
Variables Details

murphy_15_tab

Value:
{'A': 'A',
 'C': 'C',
 'D': 'D',
 'E': 'E',
 'F': 'F',
 'G': 'G',
 'H': 'H',
 'I': 'L',
...

murphy_10_tab

Value:
{'A': 'A',
 'C': 'C',
 'D': 'E',
 'E': 'E',
 'F': 'F',
 'G': 'G',
 'H': 'H',
 'I': 'L',
...

murphy_8_tab

Value:
{'A': 'A',
 'C': 'L',
 'D': 'E',
 'E': 'E',
 'F': 'F',
 'G': 'A',
 'H': 'H',
 'I': 'L',
...

murphy_4_tab

Value:
{'A': 'A',
 'C': 'L',
 'D': 'E',
 'E': 'E',
 'F': 'F',
 'G': 'A',
 'H': 'E',
 'I': 'L',
...

hp_model_tab

Value:
{'A': 'P',
 'C': 'H',
 'D': 'P',
 'E': 'P',
 'F': 'H',
 'G': 'P',
 'H': 'P',
 'I': 'H',
...

pc_5_table

Value:
{'A': 'T',
 'C': 'T',
 'D': 'C',
 'E': 'C',
 'F': 'R',
 'G': 'T',
 'H': 'R',
 'I': 'A',
...
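
One way to compare the Murphy tables is to count how many distinct letters each one produces. The sketch below uses only the eight entries per table shown in the excerpts above (keys A to I), so the counts describe these excerpts rather than the full alphabets. The `*_excerpt` names and the `distinct_letters` helper are hypothetical.

```python
# Excerpts copied from the truncated values above (keys A to I only).
murphy_15_excerpt = {"A": "A", "C": "C", "D": "D", "E": "E",
                     "F": "F", "G": "G", "H": "H", "I": "L"}
murphy_10_excerpt = {"A": "A", "C": "C", "D": "E", "E": "E",
                     "F": "F", "G": "G", "H": "H", "I": "L"}
murphy_8_excerpt = {"A": "A", "C": "L", "D": "E", "E": "E",
                    "F": "F", "G": "A", "H": "H", "I": "L"}
murphy_4_excerpt = {"A": "A", "C": "L", "D": "E", "E": "E",
                    "F": "F", "G": "A", "H": "E", "I": "L"}


def distinct_letters(table):
    """Return the sorted list of output letters used by a table."""
    return sorted(set(table.values()))


excerpts = [
    ("murphy_15", murphy_15_excerpt),
    ("murphy_10", murphy_10_excerpt),
    ("murphy_8", murphy_8_excerpt),
    ("murphy_4", murphy_4_excerpt),
]
for name, table in excerpts:
    letters = distinct_letters(table)
    print(name, len(letters), "".join(letters))
```

For these eight residues the excerpts use 8, 7, 5 and 4 distinct letters respectively, which shows each table merging more residues than the one before it.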