Bio.codonalign.codonseq module

Code for dealing with coding sequence.

The CodonSeq class inherits from the Seq class. It is the core class for handling sequences in a CodonAlignment in Biopython.

class Bio.codonalign.codonseq.CodonSeq(data='', gap_char='-', rf_table=None)

Bases: Seq

CodonSeq is designed to be within the SeqRecords of a CodonAlignment class.

CodonSeq is useful because it lets the user specify the reading frame when translating a CodonSeq.

CodonSeq also supports codon-style slicing through the get_codon() method.

Important: An ungapped CodonSeq can be any length if you specify the rf_table. The length of a gapped CodonSeq should be a multiple of three.

>>> codonseq = CodonSeq("AAATTTGGGCCAAATTT", rf_table=(0,3,6,8,11,14))
>>> print(codonseq.translate())
KFGAKF
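
To make the role of rf_table concrete, here is a minimal pure-Python sketch. It is not Biopython's implementation: it assumes that each rf_table entry is the start index of a codon in the ungapped sequence, and it builds the standard genetic code by hand. The names STANDARD_TABLE and translate_with_rf_table are hypothetical.

```python
from itertools import product

# Standard genetic code (NCBI table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_with_rf_table(seq, rf_table):
    """Translate seq by reading one codon at each start index in rf_table.

    Hypothetical helper: assumes an ungapped sequence and valid codons.
    """
    return "".join(STANDARD_TABLE[seq[start:start + 3]] for start in rf_table)


# Same input as the example above: the codon starts jump from 6 to 8,
# so the third and fourth codons share one base (a frameshift).
print(translate_with_rf_table("AAATTTGGGCCAAATTT", (0, 3, 6, 8, 11, 14)))
# prints KFGAKF
```

Because the reading frame is given explicitly, the sequence length (17 here) does not need to be a multiple of three.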

Examples of the get_full_rf_table method:

>>> p = CodonSeq('AAATTTCCCGG-TGGGTTTAA', rf_table=(0, 3, 6, 9, 11, 14, 17))
>>> full_rf_table = p.get_full_rf_table()
>>> print(full_rf_table)
[0, 3, 6, 9, 12, 15, 18]
>>> print(p.translate(rf_table=full_rf_table, ungap_seq=False))
KFPPWV*
>>> p = CodonSeq('AAATTTCCCGGGAA-TTTTAA', rf_table=(0, 3, 6, 9, 14, 17))
>>> print(p.get_full_rf_table())
[0, 3, 6, 9, 12.0, 15, 18]
>>> p = CodonSeq('AAA------------TAA', rf_table=(0, 3))
>>> print(p.get_full_rf_table())
[0, 3.0, 6.0, 9.0, 12.0, 15]

__init__(data='', gap_char='-', rf_table=None): Initialize the class.

get_codon(index): Get the index codon from the sequence.

get_codon_num(): Return the number of codons in the CodonSeq.

translate(codon_table=None, stop_symbol='*', rf_table=None, ungap_seq=True)

Translate the CodonSeq based on the reading frame in rf_table.

It is possible for the user to specify an rf_table at this point. This is the only way to include gaps in the translated sequence. For this purpose ungap_seq should be set to False, as in the class example that passes a full rf_table together with ungap_seq=False.

toSeq(): Convert DNA to seq object.

get_full_rf_table()

Return the full rf_table of the CodonSeq record.

A full rf_table differs from a normal rf_table in that it also covers the gaps in the CodonSeq, so that gaps are translated. It is helpful when constructing an alignment that contains frameshifts.
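
In the class examples, float entries in a full rf_table appear at the positions of gap codons, while integer entries are codon starts in the gapped sequence. The sketch below reproduces that convention for the simplest case only: gaps that span whole codons. It is a simplified illustration, not the library's algorithm (which also handles frameshifts), and the names simple_full_rf_table and translate_full are hypothetical.

```python
def simple_full_rf_table(gapped_seq, gap_char="-"):
    """Build a full rf_table for a sequence whose gaps span whole codons."""
    if len(gapped_seq) % 3:
        raise ValueError("gapped sequence length must be a multiple of three")
    table = []
    for start in range(0, len(gapped_seq), 3):
        codon = gapped_seq[start:start + 3]
        if codon == gap_char * 3:
            table.append(float(start))  # float marks a gap codon
        elif gap_char in codon:
            raise ValueError("partial gaps (frameshifts) are not handled here")
        else:
            table.append(start)  # int marks the start of a real codon
    return table


def translate_full(gapped_seq, full_rf_table, codon_to_aa):
    """Translate with a full rf_table, emitting '-' for each gap codon."""
    protein = []
    for pos in full_rf_table:
        if isinstance(pos, float):
            protein.append("-")
        else:
            protein.append(codon_to_aa[gapped_seq[pos:pos + 3]])
    return "".join(protein)


seq = "AAA" + "-" * 12 + "TAA"
table = simple_full_rf_table(seq)
print(table)  # prints [0, 3.0, 6.0, 9.0, 12.0, 15]
# A two-entry codon table is enough for this illustration.
print(translate_full(seq, table, {"AAA": "K", "TAA": "*"}))  # prints K----*
```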

full_translate(codon_table=None, stop_symbol='*'): Apply full translation with gaps considered.

ungap(gap='-'): Return a copy of the sequence without the gap character(s).

classmethod from_seq(seq, rf_table=None): Get codon sequence from sequence data.

__abstractmethods__ = frozenset({})

__annotations__ = {'_data': 'bytes | SequenceDataAbstractBaseClass'}

Bio.codonalign.codonseq.cal_dn_ds(codon_seq1, codon_seq2, method='NG86', codon_table=None, k=1, cfreq=None)

Calculate dN and dS of the given two sequences.

Available methods:

NG86 - Nei and Gojobori (1986) (PMID 3444411).
LWL85 - Li et al. (1985) (PMID 3916709).
ML - Goldman and Yang (1994) (PMID 7968486).
YN00 - Yang and Nielsen (2000) (PMID 10666704).
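
dN and dS are the rates of nonsynonymous and synonymous substitution. As a rough illustration of that distinction only, and not of any of the methods listed above, the sketch below classifies codon differences between two aligned, gap-free coding sequences. It is deliberately simplified: it assumes that differing codons differ at exactly one position, and it does no site counting or multiple-hit correction. All names are hypothetical.

```python
from itertools import product

# Standard genetic code (NCBI table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_differences(seq1, seq2):
    """Return (synonymous, nonsynonymous) counts of single-base codon changes."""
    if len(seq1) != len(seq2) or len(seq1) % 3:
        raise ValueError("sequences must be aligned and a multiple of three long")
    synonymous = nonsynonymous = 0
    for i in range(0, len(seq1), 3):
        codon1, codon2 = seq1[i:i + 3], seq2[i:i + 3]
        n_diff = sum(a != b for a, b in zip(codon1, codon2))
        if n_diff == 0:
            continue
        if n_diff > 1:
            raise ValueError("multi-base codon changes are outside this sketch")
        if CODON_TABLE[codon1] == CODON_TABLE[codon2]:
            synonymous += 1      # same amino acid: silent change
        else:
            nonsynonymous += 1   # amino acid changes
    return synonymous, nonsynonymous


# AAA -> AAG keeps lysine; TTT -> TTA changes Phe to Leu; GGG is unchanged.
print(classify_differences("AAATTTGGG", "AAGTTAGGG"))  # prints (1, 1)
```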

Arguments:

codon_seq1 - CodonSeq or SeqRecord that contains a CodonSeq
codon_seq2 - CodonSeq or SeqRecord that contains a CodonSeq
k - transition/transversion ratio
cfreq - Codon frequency vector. It can only be specified when you are using the ML method. Possible ways of getting cfreq are F1x4, F3x4 and F61.
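
The transition/transversion ratio weighs two classes of nucleotide substitution: transitions (purine to purine, or pyrimidine to pyrimidine) and transversions (purine to pyrimidine, or the reverse). A small sketch of that classification follows. The names are hypothetical, and the code says nothing about how cal_dn_ds uses the ratio internally.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_type(base1, base2):
    """Classify a pair of DNA bases as 'none', 'transition' or 'transversion'."""
    for base in (base1, base2):
        if base not in PURINES | PYRIMIDINES:
            raise ValueError(f"unexpected base: {base!r}")
    if base1 == base2:
        return "none"
    pair = {base1, base2}
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


def count_ts_tv(seq1, seq2):
    """Return (transitions, transversions) between two aligned DNA strings."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must have the same length")
    kinds = [substitution_type(a, b) for a, b in zip(seq1, seq2)]
    return kinds.count("transition"), kinds.count("transversion")


# A->G is a transition; G->C and T->A are transversions; A->A is no change.
print(count_ts_tv("AAGT", "GACA"))  # prints (1, 2)
```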