Bio.codonalign.codonseq module

Code for dealing with coding sequence.

CodonSeq class is inherited from Seq class. This is the core class to deal with sequences in CodonAlignment in biopython.

class Bio.codonalign.codonseq.CodonSeq(data='', alphabet=CodonAlphabet(Standard), gap_char='-', rf_table=None)

Bases: Bio.Seq.Seq

CodonSeq is designed to be within the SeqRecords of a CodonAlignment class.

CodonSeq is useful as it allows the user to specify reading frame when translate CodonSeq

CodonSeq also accepts codon style slice by calling get_codon() method.

Important: Ungapped CodonSeq can be any length if you specify the rf_table. Gapped CodonSeq should be a multiple of three.

>>> codonseq = CodonSeq("AAATTTGGGCCAAATTT", rf_table=(0,3,6,8,11,14))
>>> print(codonseq.translate())
KFGAKF

test get_full_rf_table method

>>> p = CodonSeq('AAATTTCCCGG-TGGGTTTAA', rf_table=(0, 3, 6, 9, 11, 14, 17))
>>> full_rf_table = p.get_full_rf_table()
>>> print(full_rf_table)
[0, 3, 6, 9, 12, 15, 18]
>>> print(p.translate(rf_table=full_rf_table, ungap_seq=False))
KFPPWV*
>>> p = CodonSeq('AAATTTCCCGGGAA-TTTTAA', rf_table=(0, 3, 6, 9, 14, 17))
>>> print(p.get_full_rf_table())
[0, 3, 6, 9, 12.0, 15, 18]
>>> p = CodonSeq('AAA------------TAA', rf_table=(0, 3))
>>> print(p.get_full_rf_table())
[0, 3.0, 6.0, 9.0, 12.0, 15]
__init__(self, data='', alphabet=CodonAlphabet(Standard), gap_char='-', rf_table=None)

Initialize the class.

__getitem__(self, index)

Return a subsequence of single letter, use my_seq[index].

>>> my_seq = Seq('ACTCGACGTCG')
>>> my_seq[5]
'A'
get_codon(self, index)

Get the index codon from the sequence.

get_codon_num(self)

Return the number of codons in the CodonSeq.

translate(self, codon_table=NCBICodonTable(id=1, names=['Standard', 'SGC0'], ...), stop_symbol='*', rf_table=None, ungap_seq=True)

Translate the CodonSeq based on the reading frame in rf_table.

It is possible for the user to specify a rf_table at this point. If you want to include gaps in the translated sequence, this is the only way. ungap_seq should be set to true for this purpose.

toSeq(self, alphabet=DNAAlphabet())

Convert DNA to seq object.

get_full_rf_table(self)

Return full rf_table of the CodonSeq records.

A full rf_table is different from a normal rf_table in that it translate gaps in CodonSeq. It is helpful to construct alignment containing frameshift.

full_translate(self, codon_table=NCBICodonTable(id=1, names=['Standard', 'SGC0'], ...), stop_symbol='*')

Apply full translation with gaps considered.

ungap(self, gap=None)

Return a copy of the sequence without the gap character(s).

classmethod from_seq(seq, alphabet=CodonAlphabet(Standard), rf_table=None)

Get codon sequence from sequence data.

Bio.codonalign.codonseq.cal_dn_ds(codon_seq1, codon_seq2, method='NG86', codon_table=NCBICodonTable(id=1, names=['Standard', 'SGC0'], ...), k=1, cfreq=None)

Calculate dN and dS of the given two sequences.

Available methods:
Arguments:
  • codon_seq1 - CodonSeq or or SeqRecord that contains a CodonSeq

  • codon_seq2 - CodonSeq or or SeqRecord that contains a CodonSeq

  • w - transition/transversion ratio

  • cfreq - Current codon frequency vector can only be specified when you are using ML method. Possible ways of getting cfreq are: F1x4, F3x4 and F61.