Bio.codonalign package


Module contents

Code for dealing with Codon Alignments.

Bio.codonalign.build(pro_align, nucl_seqs, corr_dict=None, gap_char='-', unknown='X', codon_table=None, complete_protein=False, anchor_len=10, max_score=10)

Build a codon alignment from protein alignment and corresponding nucleotides.

  • pro_align - a protein MultipleSeqAlignment object

  • nucl_seqs - an object returned by SeqIO.parse or SeqIO.index, or a collection of SeqRecord objects

  • corr_dict - a dict that maps protein id to nucleotide id

  • complete_protein - whether the sequence begins with a start codon

Return a CodonAlignment object.
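
The corr_dict argument is described only as a mapping from protein id to nucleotide id. The sketch below illustrates that kind of lookup in plain Python. It is not Biopython's implementation: pair_by_id and the ids gene_a and gene_b are hypothetical names chosen for illustration.

```python
def pair_by_id(protein_ids, nucleotide_ids, corr_dict):
    """Return (protein_id, nucleotide_id) pairs, in protein order, following corr_dict."""
    available = set(nucleotide_ids)
    pairs = []
    for pro_id in protein_ids:
        nucl_id = corr_dict[pro_id]  # raises KeyError if a protein has no mapping
        if nucl_id not in available:
            raise ValueError(f"no nucleotide record with id {nucl_id!r}")
        pairs.append((pro_id, nucl_id))
    return pairs


# Hypothetical ids: the nucleotide records are named differently from the
# proteins and arrive in a different order.
corr_dict = {"pro1": "gene_a", "pro2": "gene_b"}
pairs = pair_by_id(["pro1", "pro2"], ["gene_b", "gene_a"], corr_dict)
print(pairs)
```

A mapping of this shape is useful when the protein and nucleotide records cannot be matched by sharing the same id.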

The example below answers this Biostars question:

>>> from Bio.Seq import Seq
>>> from Bio.SeqRecord import SeqRecord
>>> from Bio.Align import MultipleSeqAlignment
>>> from Bio.codonalign import build
>>> seq1 = SeqRecord(Seq('ATGTCTCGT'), id='pro1')
>>> seq2 = SeqRecord(Seq('ATGCGT'), id='pro2')
>>> pro1 = SeqRecord(Seq('MSR'), id='pro1')
>>> pro2 = SeqRecord(Seq('M-R'), id='pro2')
>>> aln = MultipleSeqAlignment([pro1, pro2])
>>> codon_aln = build(aln, [seq1, seq2])
>>> print(codon_aln)
CodonAlignment with 2 rows and 9 columns (3 codons)
ATGTCTCGT pro1
ATG---CGT pro2
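
To see where the gapped pro2 row comes from, the core idea can be sketched without Biopython: walk along the aligned protein row, emit three gap characters for each protein gap, and consume one codon from the nucleotide sequence for each residue. This is a minimal illustration under simplifying assumptions, not the library's algorithm. The helper thread_codons is a hypothetical name, and it does not check that each codon actually translates to the residue it is paired with.

```python
def thread_codons(aligned_protein, nucleotides, gap_char="-"):
    """Map an ungapped nucleotide sequence onto a gapped protein row."""
    codons = []
    pos = 0  # index of the next unused nucleotide
    for residue in aligned_protein:
        if residue == gap_char:
            # A gap in the protein row becomes a three-column gap.
            codons.append(gap_char * 3)
        else:
            # Each residue consumes the next codon.
            codons.append(nucleotides[pos:pos + 3])
            pos += 3
    if pos != len(nucleotides):
        raise ValueError("nucleotide length does not match protein length")
    return "".join(codons)


rows = {
    "pro1": thread_codons("MSR", "ATGTCTCGT"),
    "pro2": thread_codons("M-R", "ATGCGT"),
}
for name, row in rows.items():
    print(row, name)
# ATGTCTCGT pro1
# ATG---CGT pro2
```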