Bio.codonalign.codonseq module¶

Code for dealing with coding sequence.

CodonSeq class is inherited from Seq class. This is the core class to deal with sequences in CodonAlignment in biopython.

class Bio.codonalign.codonseq.CodonSeq(data='', alphabet=CodonAlphabet(Standard), gap_char='-', rf_table=None)¶

Bases: Bio.Seq.Seq

CodonSeq is designed to be within the SeqRecords of a CodonAlignment class.

CodonSeq is useful as it allows the user to specify reading frame when translate CodonSeq

CodonSeq also accepts codon style slice by calling get_codon() method.

Important: Ungapped CodonSeq can be any length if you specify the rf_table. Gapped CodonSeq should be a multiple of three.

>>> codonseq = CodonSeq("AAATTTGGGCCAAATTT", rf_table=(0,3,6,8,11,14))
>>> print(codonseq.translate())
KFGAKF

test get_full_rf_table method

>>> p = CodonSeq('AAATTTCCCGG-TGGGTTTAA', rf_table=(0, 3, 6, 9, 11, 14, 17))
>>> full_rf_table = p.get_full_rf_table()
>>> print(full_rf_table)
[0, 3, 6, 9, 12, 15, 18]
>>> print(p.translate(rf_table=full_rf_table, ungap_seq=False))
KFPPWV*
>>> p = CodonSeq('AAATTTCCCGGGAA-TTTTAA', rf_table=(0, 3, 6, 9, 14, 17))
>>> print(p.get_full_rf_table())
[0, 3, 6, 9, 12.0, 15, 18]
>>> p = CodonSeq('AAA------------TAA', rf_table=(0, 3))
>>> print(p.get_full_rf_table())
[0, 3.0, 6.0, 9.0, 12.0, 15]

__init__(self, data='', alphabet=CodonAlphabet(Standard), gap_char='-', rf_table=None)¶: Initialize the class.

__getitem__(self, index)¶

Return a subsequence of single letter, use my_seq[index].

>>> my_seq = Seq('ACTCGACGTCG')
>>> my_seq[5]
'A'

get_codon(self, index)¶: Get the index codon from the sequence.

get_codon_num(self)¶: Return the number of codons in the CodonSeq.

translate(self, codon_table=NCBICodonTable(id=1, names=['Standard', 'SGC0'], ...), stop_symbol='*', rf_table=None, ungap_seq=True)¶

Translate the CodonSeq based on the reading frame in rf_table.

It is possible for the user to specify a rf_table at this point. If you want to include gaps in the translated sequence, this is the only way. ungap_seq should be set to true for this purpose.

toSeq(self, alphabet=DNAAlphabet())¶: Convert DNA to seq object.

get_full_rf_table(self)¶

Return full rf_table of the CodonSeq records.

A full rf_table is different from a normal rf_table in that it translate gaps in CodonSeq. It is helpful to construct alignment containing frameshift.

full_translate(self, codon_table=NCBICodonTable(id=1, names=['Standard', 'SGC0'], ...), stop_symbol='*')¶: Apply full translation with gaps considered.

ungap(self, gap=None)¶: Return a copy of the sequence without the gap character(s).

classmethod from_seq(seq, alphabet=CodonAlphabet(Standard), rf_table=None)¶: Get codon sequence from sequence data.

Bio.codonalign.codonseq.cal_dn_ds(codon_seq1, codon_seq2, method='NG86', codon_table=NCBICodonTable(id=1, names=['Standard', 'SGC0'], ...), k=1, cfreq=None)¶

Calculate dN and dS of the given two sequences.

Available methods:

NG86 - Nei and Gojobori (1986) (PMID 3444411).
LWL85 - Li et al. (1985) (PMID 3916709).
ML - Goldman and Yang (1994) (PMID 7968486).
YN00 - Yang and Nielsen (2000) (PMID 10666704).

Arguments:

codon_seq1 - CodonSeq or or SeqRecord that contains a CodonSeq
codon_seq2 - CodonSeq or or SeqRecord that contains a CodonSeq
w - transition/transversion ratio
cfreq - Current codon frequency vector can only be specified when you are using ML method. Possible ways of getting cfreq are: F1x4, F3x4 and F61.