
Class DistanceCalculator

object --+
         |
        DistanceCalculator

Calculates the distance matrix for a DNA or protein multiple
sequence alignment (MSA) using the substitution model given by
name.

Currently only scoring matrices are used.

:Parameters:
    model : str
        Name of the model matrix used to calculate distances.
        Available names are the keys of `dna_matrices` for DNA
        sequences and of `protein_matrices` for protein sequences.
        The `models` attribute lists all names, including the
        default 'identity'.

Example
-------

>>> from Bio.Phylo.TreeConstruction import DistanceCalculator
>>> from Bio import AlignIO
>>> aln = AlignIO.read(open('Tests/TreeConstruction/msa.phy'), 'phylip')
>>> print aln
SingleLetterAlphabet() alignment with 5 rows and 13 columns
AACGTGGCCACAT Alpha
AAGGTCGCCACAC Beta
GAGATTTCCGCCT Delta
GAGATCTCCGCCC Epsilon
CAGTTCGCCACAA Gamma

DNA calculator with 'identity' model:

>>> calculator = DistanceCalculator('identity')
>>> dm = calculator.get_distance(aln)
>>> print dm
Alpha   0
Beta    0.230769230769  0
Gamma   0.384615384615  0.230769230769  0
Delta   0.538461538462  0.538461538462  0.538461538462  0
Epsilon 0.615384615385  0.384615384615  0.461538461538  0.153846153846  0
        Alpha           Beta            Gamma           Delta           Epsilon

Protein calculator with 'blosum62' model:

>>> calculator = DistanceCalculator('blosum62')
>>> dm = calculator.get_distance(aln)
>>> print dm
Alpha   0
Beta    0.369047619048  0
Gamma   0.493975903614  0.25            0
Delta   0.585365853659  0.547619047619  0.566265060241  0
Epsilon 0.7             0.355555555556  0.488888888889  0.222222222222  0
        Alpha           Beta            Gamma           Delta           Epsilon
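
The 'identity' figures above are consistent with counting the fraction of alignment columns at which two sequences differ. The following is a minimal sketch of that idea in plain Python, not the library's own code: `identity_distance` is a hypothetical helper, and it assumes gap-free sequences of equal length.

```python
def identity_distance(seq1, seq2):
    """Return the fraction of aligned columns at which the sequences differ.

    Hypothetical helper for illustration; assumes equal-length,
    gap-free sequences.
    """
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have equal length")
    if not seq1:
        raise ValueError("sequences must not be empty")
    # Count the columns where the two letters disagree.
    mismatches = sum(1 for a, b in zip(seq1, seq2) if a != b)
    return mismatches / float(len(seq1))


alpha = "AACGTGGCCACAT"
beta = "AAGGTCGCCACAC"
# Alpha and Beta differ in 3 of 13 columns, i.e. 3/13.
print(identity_distance(alpha, beta))
```

Three mismatches over 13 columns gives 3/13, which agrees with the Alpha/Beta entry in the 'identity' matrix shown above.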

Instance Methods
 
__init__(self, model='identity')
Initialize with a distance model
 
_pairwise(self, seq1, seq2)
Calculate pairwise distance from two sequences
 
get_distance(self, msa)
Return a _DistanceMatrix for MSA object
 
_build_protein_matrix(self, subsmat)
Convert matrix from SubsMat format to _Matrix object

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __sizeof__, __str__, __subclasshook__

Class Variables
  dna_alphabet = ['A', 'T', 'C', 'G']
  blastn = [[5], [-4, 5], [-4, -4, 5], [-4, -4, -4, 5]]
  trans = [[6], [-5, 6], [-5, -1, 6], [-1, -5, -5, 6]]
  protein_alphabet = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I...
  dna_matrices = {'blastn': [[5], [-4, 5], [-4, -4, 5], [-4, -4,...
  protein_models = ['benner6', 'benner22', 'benner74', 'blosum10...
  protein_matrices = {'benner22': {('A', 'A'): 2.5, ('A', 'C'): ...
  dna_models = ['blastn', 'trans']
  models = ['identity', 'blastn', 'trans', 'benner6', 'benner22'...

Properties

Inherited from object: __class__

Method Details

__init__(self, model='identity')
(Constructor)

Initialize with a distance model
Overrides: object.__init__

get_distance(self, msa)

Return a _DistanceMatrix for MSA object
Parameters:
  • msa (MultipleSeqAlignment) - DNA or Protein multiple sequence alignment.
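
The printed matrices in the example are lower triangular, with one row per sequence and a zero on the diagonal. The sketch below shows one way such a structure could be assembled from any pairwise distance function. It is an illustration only: `lower_triangular_distances` and `mismatch_fraction` are hypothetical names, and the real `_DistanceMatrix` class may store its data differently.

```python
def mismatch_fraction(seq1, seq2):
    """Hypothetical pairwise distance: share of columns that differ."""
    return sum(1 for a, b in zip(seq1, seq2) if a != b) / float(len(seq1))


def lower_triangular_distances(seqs, pairwise):
    """Build rows [d(i, 0), ..., d(i, i-1), 0] for each sequence i.

    `seqs` is a list of equal-length strings and `pairwise` is any
    function taking two sequences and returning a number.
    """
    rows = []
    for i in range(len(seqs)):
        # Distances to every earlier sequence, then 0 for the diagonal.
        row = [pairwise(seqs[i], seqs[j]) for j in range(i)]
        row.append(0)
        rows.append(row)
    return rows


names = ["Alpha", "Beta", "Gamma"]
seqs = ["AACGTGGCCACAT", "AAGGTCGCCACAC", "CAGTTCGCCACAA"]
for name, row in zip(names, lower_triangular_distances(seqs, mismatch_fraction)):
    print(name, row)
```

Passing the distance function as an argument keeps the matrix assembly independent of the model, which mirrors the split between `get_distance` and `_pairwise` in the method list.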

Class Variable Details

protein_alphabet

Value:
['A',
 'B',
 'C',
 'D',
 'E',
 'F',
 'G',
 'H',
...

dna_matrices

Value:
{'blastn': [[5], [-4, 5], [-4, -4, 5], [-4, -4, -4, 5]],
 'trans': [[6], [-5, 6], [-5, -1, 6], [-1, -5, -5, 6]]}
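
Both DNA matrices are stored in lower-triangular form. A minimal sketch of a symmetric score lookup follows, assuming that rows and columns follow the order of `dna_alphabet` (`['A', 'T', 'C', 'G']`); that ordering is an assumption, although it is consistent with 'trans' giving -1 to the transitions A/G and T/C and -5 to transversions. The `lookup_score` helper is hypothetical.

```python
DNA_ALPHABET = ['A', 'T', 'C', 'G']
TRANS = [[6], [-5, 6], [-5, -1, 6], [-1, -5, -5, 6]]


def lookup_score(matrix, alphabet, a, b):
    """Return the score for letters a and b from a lower-triangular matrix.

    Assumes row/column order matches `alphabet` (an assumption, not
    confirmed by the documentation).
    """
    i, j = alphabet.index(a), alphabet.index(b)
    # Only the lower triangle is stored, so use the larger index as the row.
    if i < j:
        i, j = j, i
    return matrix[i][j]


print(lookup_score(TRANS, DNA_ALPHABET, 'A', 'G'))  # transition
print(lookup_score(TRANS, DNA_ALPHABET, 'A', 'T'))  # transversion
```

Swapping the indices when needed makes the lookup symmetric, so `('A', 'G')` and `('G', 'A')` return the same score.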

protein_models

Value:
['benner6',
 'benner22',
 'benner74',
 'blosum100',
 'blosum30',
 'blosum35',
 'blosum40',
 'blosum45',
...

protein_matrices

Value:
{'benner22': {('A', 'A'): 2.5,
              ('A', 'C'): -1.2,
              ('A', 'P'): 0.8,
              ('A', 'S'): 1.3,
              ('A', 'T'): 1.4,
              ('C', 'C'): 12.6,
              ('D', 'A'): -0.2,
              ('D', 'C'): -3.7,
...

models

Value:
['identity',
 'blastn',
 'trans',
 'benner6',
 'benner22',
 'benner74',
 'blosum100',
 'blosum30',
...