Bio.Alphabet.IUPAC module
Standard nucleotide and protein alphabets defined by IUPAC.

class Bio.Alphabet.IUPAC.ExtendedIUPACProtein
Bases: Bio.Alphabet.ProteinAlphabet
Extended uppercase IUPAC protein single letter alphabet including X etc.
In addition to the standard 20 single letter protein codes, this includes:
B = “Asx”; Aspartic acid (D) or Asparagine (N)
X = “Xxx”; Unknown or ‘other’ amino acid
Z = “Glx”; Glutamic acid (E) or Glutamine (Q)
J = “Xle”; Leucine (L) or Isoleucine (I), used in mass-spec (NMR)
U = “Sec”; Selenocysteine
O = “Pyl”; Pyrrolysine
This alphabet is not intended to be used with X for Selenocysteine (an ad-hoc standard prior to the IUPAC adoption of U instead).
letters = 'ACDEFGHIKLMNPQRSTVWYBXZJUO'

class Bio.Alphabet.IUPAC.IUPACProtein
Bases: Bio.Alphabet.IUPAC.ExtendedIUPACProtein
IUPAC protein alphabet of the 20 standard amino acids.
Uppercase and single letter.
letters = 'ACDEFGHIKLMNPQRSTVWY'
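The two protein alphabets differ only in the six extra letters, so a sequence can be classified by plain set membership. The following is a minimal sketch that copies the documented `letters` strings into local constants rather than importing the module; `narrowest_protein_alphabet` is a hypothetical helper for illustration, not part of Bio.Alphabet.

```python
# Letters copied from the `letters` attributes documented above.
IUPAC_PROTEIN = "ACDEFGHIKLMNPQRSTVWY"
EXTENDED_IUPAC_PROTEIN = "ACDEFGHIKLMNPQRSTVWYBXZJUO"


def narrowest_protein_alphabet(sequence):
    """Return the name of the smallest alphabet covering `sequence`, or None.

    Hypothetical helper for illustration; not part of Bio.Alphabet.
    """
    residues = set(sequence)
    if residues <= set(IUPAC_PROTEIN):
        return "IUPACProtein"
    if residues <= set(EXTENDED_IUPAC_PROTEIN):
        return "ExtendedIUPACProtein"
    # Lowercase letters, gaps, stop symbols, or anything else.
    return None
```

Both alphabets are uppercase only, so a lowercase sequence matches neither unless it is upper-cased first.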

class Bio.Alphabet.IUPAC.IUPACAmbiguousDNA
Bases: Bio.Alphabet.DNAAlphabet
Uppercase IUPAC ambiguous DNA.
letters = 'GATCRYWSMKHBVDN'
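The letters beyond GATC each stand for a set of possible bases. The page above does not list those sets. The sketch below supplies the standard IUPAC nucleotide ambiguity table as a local dictionary (an addition for illustration) and uses it to enumerate the concrete sequences an ambiguous one could represent. `expand_ambiguous_dna` is a hypothetical helper, not part of Bio.Alphabet.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (DNA form). This table is an
# assumption added for illustration; it is not taken from the page above.
AMBIGUOUS_DNA_VALUES = {
    "G": "G", "A": "A", "T": "T", "C": "C",
    "R": "AG", "Y": "CT", "W": "AT", "S": "CG", "M": "AC", "K": "GT",
    "H": "ACT", "B": "CGT", "V": "ACG", "D": "AGT", "N": "ACGT",
}


def expand_ambiguous_dna(sequence):
    """List every unambiguous sequence matching `sequence`.

    Raises KeyError for letters outside the ambiguous DNA alphabet.
    """
    choices = (AMBIGUOUS_DNA_VALUES[base] for base in sequence)
    return ["".join(combo) for combo in product(*choices)]
```

The number of expansions grows multiplicatively with each ambiguous position, so this approach only suits short sequences.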

class Bio.Alphabet.IUPAC.IUPACUnambiguousDNA
Bases: Bio.Alphabet.IUPAC.IUPACAmbiguousDNA
Uppercase IUPAC unambiguous DNA (letters GATC only).
letters = 'GATC'

class Bio.Alphabet.IUPAC.ExtendedIUPACDNA
Bases: Bio.Alphabet.DNAAlphabet
Extended IUPAC DNA alphabet.
In addition to the standard letter codes GATC, this includes:
B = 5-bromouridine
D = 5,6-dihydrouridine
S = thiouridine
W = wyosine
letters = 'GATCBDSW'
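All four extra letters here (B, D, S, W) also occur in the IUPACAmbiguousDNA `letters` string, so a bare sequence string does not reveal which reading is intended. The sketch below checks that overlap and reports modified-base positions when a sequence is read as ExtendedIUPACDNA. The constants are copied from this page; `modified_positions` is a hypothetical helper, not part of Bio.Alphabet.

```python
# Letters copied from the `letters` attributes documented on this page.
IUPAC_AMBIGUOUS_DNA = "GATCRYWSMKHBVDN"
EXTENDED_IUPAC_DNA = "GATCBDSW"

# Meanings of the extra letters, as listed for ExtendedIUPACDNA.
MODIFIED_BASES = {
    "B": "5-bromouridine",
    "D": "5,6-dihydrouridine",
    "S": "thiouridine",
    "W": "wyosine",
}

# Letters beyond GATC that appear in both alphabets.
reused_letters = sorted(
    (set(EXTENDED_IUPAC_DNA) & set(IUPAC_AMBIGUOUS_DNA)) - set("GATC")
)


def modified_positions(sequence):
    """Return (index, letter, name) for each modified base.

    Assumes `sequence` is meant to be read as ExtendedIUPACDNA.
    """
    return [
        (index, base, MODIFIED_BASES[base])
        for index, base in enumerate(sequence)
        if base in MODIFIED_BASES
    ]
```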

class Bio.Alphabet.IUPAC.IUPACAmbiguousRNA
Bases: Bio.Alphabet.RNAAlphabet
Uppercase IUPAC ambiguous RNA.
letters = 'GAUCRYWSMKHBVDN'

class Bio.Alphabet.IUPAC.IUPACUnambiguousRNA
Bases: Bio.Alphabet.IUPAC.IUPACAmbiguousRNA
Uppercase IUPAC unambiguous RNA (letters GAUC only).
letters = 'GAUC'
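The RNA `letters` strings match their DNA counterparts with T replaced by U. A minimal sketch of that relationship follows, with a membership check against the unambiguous RNA letters. The constants are copied from this page; both functions are hypothetical helpers, not part of Bio.Alphabet.

```python
# Letters copied from the `letters` attributes documented on this page.
IUPAC_AMBIGUOUS_DNA = "GATCRYWSMKHBVDN"
IUPAC_AMBIGUOUS_RNA = "GAUCRYWSMKHBVDN"
IUPAC_UNAMBIGUOUS_RNA = "GAUC"


def dna_letters_to_rna(letters):
    """Map DNA letters to RNA letters by swapping T for U."""
    return letters.replace("T", "U")


def is_unambiguous_rna(sequence):
    """True if every symbol in `sequence` is one of G, A, U, C."""
    return set(sequence) <= set(IUPAC_UNAMBIGUOUS_RNA)
```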