Bio.codonalign.codonseq module
Code for dealing with coding sequences.
The CodonSeq class inherits from the Seq class. It is the core class for handling sequences in a CodonAlignment in Biopython.
- class Bio.codonalign.codonseq.CodonSeq(data='', gap_char='-', rf_table=None)
Bases: Seq
CodonSeq is designed to be within the SeqRecords of a CodonAlignment class.
CodonSeq is useful because it lets the user specify the reading frame when translating the sequence.
CodonSeq also supports codon-style slicing through the get_codon() method.
Important: An ungapped CodonSeq can be any length if you specify the rf_table. The length of a gapped CodonSeq should be a multiple of three.
>>> codonseq = CodonSeq("AAATTTGGGCCAAATTT", rf_table=(0,3,6,8,11,14))
>>> print(codonseq.translate())
KFGAKF
Test the get_full_rf_table method:
>>> p = CodonSeq('AAATTTCCCGG-TGGGTTTAA', rf_table=(0, 3, 6, 9, 11, 14, 17))
>>> full_rf_table = p.get_full_rf_table()
>>> print(full_rf_table)
[0, 3, 6, 9, 12, 15, 18]
>>> print(p.translate(rf_table=full_rf_table, ungap_seq=False))
KFPPWV*
>>> p = CodonSeq('AAATTTCCCGGGAA-TTTTAA', rf_table=(0, 3, 6, 9, 14, 17))
>>> print(p.get_full_rf_table())
[0, 3, 6, 9, 12.0, 15, 18]
>>> p = CodonSeq('AAA------------TAA', rf_table=(0, 3))
>>> print(p.get_full_rf_table())
[0, 3.0, 6.0, 9.0, 12.0, 15]
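To illustrate how an rf_table drives translation, here is a minimal pure-Python sketch. It is not Biopython's implementation. It assumes each rf_table entry is the 0-based start of a codon in an ungapped sequence. The helper name translate_with_rf_table and the partial codon table are hypothetical and cover only the codons in the first example above.

```python
# Partial standard codon table: only the codons that occur in this example.
CODON_TO_AA = {"AAA": "K", "TTT": "F", "GGG": "G", "GCC": "A"}


def translate_with_rf_table(seq, rf_table):
    """Translate seq by reading one codon at each start position in rf_table.

    Hypothetical helper for illustration; assumes an ungapped sequence.
    """
    return "".join(CODON_TO_AA[seq[start:start + 3]] for start in rf_table)


# The entry 8 (instead of 9) makes the fourth codon overlap the third by one base.
protein = translate_with_rf_table("AAATTTGGGCCAAATTT", (0, 3, 6, 8, 11, 14))
print(protein)  # KFGAKF
```

The overlap at position 8 is what lets a 17-base sequence, whose length is not a multiple of three, still yield six codons.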
- __init__(data='', gap_char='-', rf_table=None)
Initialize the class.
- get_codon(index)
Get the codon at the given index from the sequence.
- get_codon_num()
Return the number of codons in the CodonSeq.
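For an in-frame sequence without frameshifts, codon-style indexing amounts to slicing in steps of three. The sketch below illustrates that idea in plain Python. The names codon_at and codon_count are hypothetical helpers, not the Biopython methods, and they ignore rf_table entirely.

```python
def codon_at(seq, index):
    """Return the codon at 0-based codon position `index` (hypothetical helper)."""
    return seq[index * 3:(index + 1) * 3]


def codon_count(seq):
    """Return the number of complete codons in an in-frame sequence."""
    return len(seq) // 3


seq = "AAATTTGGG"
codons = [codon_at(seq, i) for i in range(codon_count(seq))]
print(codons)  # ['AAA', 'TTT', 'GGG']
```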
- translate(codon_table=None, stop_symbol='*', rf_table=None, ungap_seq=True)
Translate the CodonSeq based on the reading frame in rf_table.
The user can pass an rf_table here to override the one stored on the object. This is the only way to include gaps in the translated sequence; the class-level example above does so by passing the full rf_table together with ungap_seq=False.
- toSeq()
Convert the DNA sequence to a Seq object.
- get_full_rf_table()
Return the full rf_table of the CodonSeq record.
A full rf_table differs from a normal rf_table in that it also translates gaps in the CodonSeq. It is helpful when constructing alignments that contain frameshifts.
- full_translate(codon_table=None, stop_symbol='*')
Apply full translation with gaps considered.
- ungap(gap='-')
Return a copy of the sequence without the gap character(s).
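As a rough illustration of what removing gaps does to the coordinates, the sketch below strips the gap character from the gapped sequence used in the class-level example. It assumes, based on that example, that the rf_table positions refer to the ungapped sequence. The name remove_gaps is a hypothetical stand-in, not the Biopython method.

```python
def remove_gaps(seq, gap="-"):
    """Return seq with every occurrence of the gap character removed."""
    return seq.replace(gap, "")


gapped = "AAATTTCCCGG-TGGGTTTAA"
rf_table = (0, 3, 6, 9, 11, 14, 17)

ungapped = remove_gaps(gapped)
last_codon_end = rf_table[-1] + 3  # end of the final codon in the rf_table

print(len(gapped), len(ungapped), last_codon_end)  # 21 20 20
```

The final codon ends exactly at the ungapped length, which is consistent with the rf_table indexing the sequence after gaps are removed.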
- classmethod from_seq(seq, rf_table=None)
Get codon sequence from sequence data.
- Bio.codonalign.codonseq.cal_dn_ds(codon_seq1, codon_seq2, method='NG86', codon_table=None, k=1, cfreq=None)
Calculate dN and dS of the given two sequences.
- Available methods:
NG86 - Nei and Gojobori (1986) (PMID 3444411).
LWL85 - Li et al. (1985) (PMID 3916709).
ML - Goldman and Yang (1994) (PMID 7968486).
YN00 - Yang and Nielsen (2000) (PMID 10666704).
- Arguments:
codon_seq1 - CodonSeq or SeqRecord that contains a CodonSeq
codon_seq2 - CodonSeq or SeqRecord that contains a CodonSeq
k - transition/transversion ratio
cfreq - Codon frequency vector; can only be specified when using the ML method. Possible ways of getting cfreq are: F1x4, F3x4 and F61.
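The distinction behind dN and dS is whether a codon difference changes the encoded amino acid. The sketch below only counts differing codon pairs as synonymous or nonsynonymous between two aligned, ungapped, in-frame sequences. It is not an implementation of NG86, LWL85, ML or YN00: it does not normalize by the number of sites or correct for multiple substitutions. The function count_codon_changes is hypothetical, and the table is the standard genetic code written out by hand.

```python
# Standard genetic code, with bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def count_codon_changes(seq1, seq2):
    """Count differing codon pairs as (synonymous, nonsynonymous).

    Hypothetical helper: assumes equal-length, ungapped, in-frame sequences.
    """
    if len(seq1) != len(seq2) or len(seq1) % 3 != 0:
        raise ValueError("sequences must be equal length and a multiple of three")
    synonymous = nonsynonymous = 0
    for start in range(0, len(seq1), 3):
        codon1, codon2 = seq1[start:start + 3], seq2[start:start + 3]
        if codon1 == codon2:
            continue  # identical codons carry no change
        if STANDARD_TABLE[codon1] == STANDARD_TABLE[codon2]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# AAA -> AAG keeps lysine (synonymous); GGG -> GCG turns glycine into alanine.
counts = count_codon_changes("AAATTTGGG", "AAGTTTGCG")
print(counts)  # (1, 1)
```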