Package Bio :: Package codonalign
[hide private]
[frames] | no frames]

Package codonalign

source code

Code for dealing with Codon Alignments.
Submodules [hide private]

Functions [hide private]
 
build(pro_align, nucl_seqs, corr_dict=None, gap_char='-', unknown='X', codon_table=default_codon_table, alphabet=None, complete_protein=False, anchor_len=10, max_score=10)
Build a codon alignment from protein alignment and corresponding nucleotides.
source code
 
_codons2re(codons)
Generate regular expression based on a given list of codons (PRIVATE).
source code
 
_get_aa_regex(codon_table, stop='*', unknown='X')
Set up the regular expression of a given CodonTable.
source code
 
_check_corr(pro, nucl, gap_char='-', codon_table=default_codon_table, complete_protein=False, anchor_len=10)
Check if the nucleotide can be translated into the protein.
source code
 
_get_shift_anchor_re(sh_anc, sh_nuc, shift_val, aa2re, anchor_len, shift_id_pos)
Find a regular expression matching a potentially shifted anchor.
source code
 
_merge_aa2re(aa1, aa2, shift_val, aa2re, reid)
Merge two amino acids based on detected frame shift value (PRIVATE).
source code
 
_get_codon_rec(pro, nucl, span_mode, alphabet, gap_char='-', codon_table=default_codon_table, complete_protein=False, max_score=10)
Generate codon alignment based on regular re match (PRIVATE).
source code
 
_align_shift_recs(recs)
Build alignment according to the frameshift detected by _check_corr (PRIVATE).
source code
Variables [hide private]
  __package__ = 'Bio.codonalign'
  __warningregistry__ = {('Bio.codonalign is an experimental mod...
Function Details [hide private]

build(pro_align, nucl_seqs, corr_dict=None, gap_char='-', unknown='X', codon_table=default_codon_table, alphabet=None, complete_protein=False, anchor_len=10, max_score=10)

source code 

Build a codon alignment from protein alignment and corresponding nucleotides.

Arguments:
  • pro_align - a protein MultipleSeqAlignment object
  • nucl_align - an object returned by SeqIO.parse or SeqIO.index or a collection of SeqRecord.
  • alphabet - alphabet for the returned codon alignment
  • corr_dict - a dict that maps protein id to nucleotide id
  • complete_protein - whether the sequence begins with a start codon
  • frameshift - whether to apply frameshift detection

Return a CodonAlignment object

>>> from Bio.Alphabet import IUPAC
>>> from Bio.Seq import Seq
>>> from Bio.SeqRecord import SeqRecord
>>> from Bio.Align import MultipleSeqAlignment
>>> seq1 = SeqRecord(Seq('TCAGGGACTGCGAGAACCAAGCTACTGCTGCTGCTGGCTGCGCTCTGCGCCGCAGGTGGGGCGCTGGAG',
...     alphabet=IUPAC.IUPACUnambiguousDNA()), id='pro1')
>>> seq2 = SeqRecord(Seq('TCAGGGACTTCGAGAACCAAGCGCTCCTGCTGCTGGCTGCGCTCGGCGCCGCAGGTGGAGCACTGGAG',
...     alphabet=IUPAC.IUPACUnambiguousDNA()), id='pro2')
>>> pro1 = SeqRecord(Seq('SGTARTKLLLLLAALCAAGGALE', alphabet=IUPAC.protein),id='pro1')
>>> pro2 = SeqRecord(Seq('SGTSRTKRLLLLAALGAAGGALE', alphabet=IUPAC.protein),id='pro2')
>>> aln = MultipleSeqAlignment([pro1, pro2])
>>> codon_aln = build(aln, [seq1, seq2])
>>> print(codon_aln)
CodonAlphabet(Standard) CodonAlignment with 2 rows and 69 columns (23 codons)
TCAGGGACTGCGAGAACCAAGCTACTGCTGCTGCTGGCTGCGCTCTGCGCCGCAGGT...GAG pro1
TCAGGGACTTCGAGAACCAAGCG-CTCCTGCTGCTGGCTGCGCTCGGCGCCGCAGGT...GAG pro2

_check_corr(pro, nucl, gap_char='-', codon_table=default_codon_table, complete_protein=False, anchor_len=10)

source code 

Check if the nucleotide can be translated into the protein.

Expects two SeqRecord objects.

_get_shift_anchor_re(sh_anc, sh_nuc, shift_val, aa2re, anchor_len, shift_id_pos)

source code 

Find a regular expression matching a potentially shifted anchor.

Arguments:
  • sh_anc - shifted anchor sequence
  • sh_nuc - potentially corresponding nucleotide sequence of sh_anc
  • shift_val - 1 or 2 indicates forward frame shift, whereas 3*anchor_len-1 or 3*anchor_len-2 indicates backward shift
  • aa2re - aa to codon re dict
  • anchor_len - length of the anchor
  • shift_id_pos - specify current shift name we are at

_get_codon_rec(pro, nucl, span_mode, alphabet, gap_char='-', codon_table=default_codon_table, complete_protein=False, max_score=10)

source code 

Generate codon alignment based on regular re match (PRIVATE).

span_mode is a tuple returned by _check_corr. The first element is the span of a re search, and the second element is the mode for the match.

mode
  • 0: direct match
  • 1: mismatch (no indels)
  • 2: frameshift

_align_shift_recs(recs)

source code 

Build alignment according to the frameshift detected by _check_corr (PRIVATE).

Argument:
  • recs - a list of SeqRecords containing a CodonSeq dictated by a rf_table (with frameshift in some of them).

Variables Details [hide private]

__warningregistry__

Value:
{('Bio.codonalign is an experimental module which may undergo signific\
ant changes prior to its future official release.',
  <class 'Bio.BiopythonExperimentalWarning'>,
  27): True}