Bio.Alphabet package

Module contents

Alphabets used in Seq objects etc. to declare sequence type and letters (OBSOLETE).

This is used by sequences which contain a finite number of similar words.

The design of Bio.Alphabet included a number of historic design choices which, with the benefit of hindsight, were regrettable. While the details remain to be agreed, we intend to remove or replace Bio.Alphabet in 2020. Please avoid using this module explicitly in your code. See also:

class Bio.Alphabet.Alphabet

Bases: object

Generic alphabet base class.

This class is used as a base class for other types of alphabets.

  • letters - list-like object containing the letters of the alphabet. Usually it is a string when letters are single characters.

  • size - size of the alphabet’s letters (e.g. 1 when letters are single characters).

size = None
letters = None
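To illustrate how the letters and size attributes relate, here is a minimal sketch that splits a string into chunks of size characters and checks each chunk against letters. The helpers split_into_letters and uses_only are hypothetical names made up for this illustration and are not part of Bio.Alphabet; the three-letter list is a shortened example.

```python
def split_into_letters(text, size):
    """Split text into consecutive chunks of `size` characters (hypothetical helper)."""
    if len(text) % size != 0:
        raise ValueError("text length is not a multiple of the letter size")
    return [text[i:i + size] for i in range(0, len(text), size)]


def uses_only(text, letters, size):
    """Return True if every chunk of text appears in letters (hypothetical helper)."""
    return all(chunk in letters for chunk in split_into_letters(text, size))


# Single-character letters: letters is a plain string and size is 1.
single = uses_only("HHTC", "HSTC", 1)

# Three-character letters: letters is a list and size is 3.
triple = uses_only("MetAlaGly", ["Ala", "Gly", "Met"], 3)
unknown = uses_only("MetFoo", ["Ala", "Gly", "Met"], 3)
```

Under these assumptions, single and triple are True and unknown is False, because 'Foo' is not among the listed letters.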

Represent the alphabet class as a string for debugging.

contains(self, other)

Test if the other alphabet is contained in this one (OBSOLETE?).

Returns a boolean. This relies on the Alphabet subclassing hierarchy only, and does not check the letters property. This isn’t ideal, and doesn’t seem to work as intended with the AlphabetEncoder classes.
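A hierarchy-only check of this kind can be sketched with stand-in classes. The code below does not import Bio.Alphabet; the class names mirror the ones documented here, and the isinstance-based body of contains is an assumption chosen to match the description above, not a copy of the library source.

```python
class Alphabet:
    size = None
    letters = None

    def contains(self, other):
        # Assumed logic: only the class hierarchy is consulted,
        # the letters attribute is never inspected.
        return isinstance(other, self.__class__)


class SingleLetterAlphabet(Alphabet):
    size = 1


class NucleotideAlphabet(SingleLetterAlphabet):
    pass


class DNAAlphabet(NucleotideAlphabet):
    pass


class ProteinAlphabet(SingleLetterAlphabet):
    pass


generic = NucleotideAlphabet()
dna = DNAAlphabet()
protein = ProteinAlphabet()

generic_contains_dna = generic.contains(dna)          # DNA is a subclass
dna_contains_generic = dna.contains(generic)          # the reverse does not hold
generic_contains_protein = generic.contains(protein)  # unrelated branch
```

In this sketch the generic nucleotide alphabet contains the DNA alphabet, but not the other way round, and the result never depends on which letters either alphabet declares.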

class Bio.Alphabet.SingleLetterAlphabet

Bases: Bio.Alphabet.Alphabet

Generic alphabet with letters of size one.

size = 1
letters = None

class Bio.Alphabet.ProteinAlphabet

Bases: Bio.Alphabet.SingleLetterAlphabet

Generic single letter protein alphabet.

class Bio.Alphabet.NucleotideAlphabet

Bases: Bio.Alphabet.SingleLetterAlphabet

Generic single letter nucleotide alphabet.

class Bio.Alphabet.DNAAlphabet

Bases: Bio.Alphabet.NucleotideAlphabet

Generic single letter DNA alphabet.

class Bio.Alphabet.RNAAlphabet

Bases: Bio.Alphabet.NucleotideAlphabet

Generic single letter RNA alphabet.

class Bio.Alphabet.SecondaryStructure

Bases: Bio.Alphabet.SingleLetterAlphabet

Alphabet used to describe secondary structure.

Letters are ‘H’ (helix), ‘S’ (strand), ‘T’ (turn) and ‘C’ (coil).

letters = 'HSTC'

class Bio.Alphabet.ThreeLetterProtein

Bases: Bio.Alphabet.Alphabet

Three letter protein alphabet.

size = 3
letters = ['Ala', 'Asx', 'Cys', 'Asp', 'Glu', 'Phe', 'Gly', 'His', 'Ile', 'Lys', 'Leu', 'Met', 'Asn', 'Pro', 'Gln', 'Arg', 'Ser', 'Thr', 'Sec', 'Val', 'Trp', 'Xaa', 'Tyr', 'Glx']

class Bio.Alphabet.AlphabetEncoder(alphabet, new_letters)

Bases: object

A class to construct a new, extended alphabet from an existing one.

__init__(self, alphabet, new_letters)

Initialize the class.

__getattr__(self, key)

Proxy method for accessing attributes of the wrapped alphabet.
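The proxying pattern can be sketched as follows. EncoderSketch and ProteinLike are hypothetical stand-ins, not the library classes. Appending new_letters to the wrapped alphabet's letters and refusing to proxy double-underscore names are both assumptions made for this illustration.

```python
class ProteinLike:
    """Stand-in for a single letter alphabet (hypothetical)."""

    size = 1
    letters = "ACDEFGHIKLMNPQRSTVWY"


class EncoderSketch:
    """Wrap an alphabet and extend its letters (illustrative only)."""

    def __init__(self, alphabet, new_letters):
        self.alphabet = alphabet
        self.new_letters = new_letters
        # Assumption: the extended letters are the old ones plus the new ones.
        if alphabet.letters is not None:
            self.letters = alphabet.letters + new_letters
        else:
            self.letters = None

    def __getattr__(self, key):
        # Python calls __getattr__ only when normal lookup fails, so
        # attributes set in __init__ are never routed through here.
        if key.startswith("__") and key.endswith("__"):
            raise AttributeError(key)
        return getattr(self.alphabet, key)


encoded = EncoderSketch(ProteinLike(), "*")
proxied_size = encoded.size        # not set on the wrapper, read from ProteinLike
extended_letters = encoded.letters  # set on the wrapper itself
```

Here size is not defined on the wrapper, so the lookup falls through to the wrapped alphabet, while letters is answered by the wrapper directly.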


Represent the alphabet encoder class as a string for debugging.

contains(self, other)

Test if the other alphabet is contained in this one (OBSOLETE?).

This isn’t implemented for the base AlphabetEncoder, which will always return 0 (False).

class Bio.Alphabet.Gapped(alphabet, gap_char='-')

Bases: Bio.Alphabet.AlphabetEncoder

Alphabets which contain a gap character.

__init__(self, alphabet, gap_char='-')

Initialize the class.

contains(self, other)

Test if the other alphabet is contained in this one (OBSOLETE?).

Returns a boolean. This relies on the Alphabet subclassing hierarchy, and attempts to check the gap character. This fails if the other alphabet does not have a gap character!
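The failure mode described above can be reproduced with stand-in classes. GappedSketch and PlainAlphabet are hypothetical, the body of contains is an assumed reading of the description, and the failure is assumed to surface as an AttributeError. safe_contains is a hypothetical guard, not a library function.

```python
class PlainAlphabet:
    """Stand-in alphabet without a gap character (hypothetical)."""

    def contains(self, other):
        return isinstance(other, self.__class__)


class GappedSketch:
    """Stand-in for a gapped alphabet wrapper (illustrative only)."""

    def __init__(self, alphabet, gap_char="-"):
        self.alphabet = alphabet
        self.gap_char = gap_char

    def contains(self, other):
        # Assumed logic: compare gap characters first, then defer to
        # the wrapped alphabets.
        return (other.gap_char == self.gap_char
                and self.alphabet.contains(other.alphabet))


def safe_contains(gapped, other):
    """Hypothetical guard: a missing gap character means 'not contained'."""
    if not hasattr(other, "gap_char"):
        return False
    return gapped.contains(other)


gapped = GappedSketch(PlainAlphabet(), "-")
same_gap = gapped.contains(GappedSketch(PlainAlphabet(), "-"))
other_gap = gapped.contains(GappedSketch(PlainAlphabet(), "."))

# An ungapped alphabet has no gap_char attribute, so the check fails.
try:
    gapped.contains(PlainAlphabet())
    failed = False
except AttributeError:
    failed = True

guarded = safe_contains(gapped, PlainAlphabet())
```

In this sketch the comparison works between two gapped alphabets, raises for an ungapped one, and the guard turns that failure into a plain False.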

class Bio.Alphabet.HasStopCodon(alphabet, stop_symbol='*')

Bases: Bio.Alphabet.AlphabetEncoder

Alphabets which contain a stop symbol.

__init__(self, alphabet, stop_symbol='*')

Initialize the class.

contains(self, other)

Test if the other alphabet is contained in this one (OBSOLETE?).

Returns a boolean. This relies on the Alphabet subclassing hierarchy, and attempts to check the stop symbol. This fails if the other alphabet does not have a stop symbol!