Class DistanceCalculator

object --+
         |
        DistanceCalculator

Class to calculate a distance matrix from a DNA or protein multiple sequence alignment (MSA) and the given name of the substitution model.

Currently only scoring matrices are used.

Examples

Loading a small PHYLIP alignment from which to compute distances:

from Bio.Phylo.TreeConstruction import DistanceCalculator
from Bio import AlignIO
aln = AlignIO.read('TreeConstruction/msa.phy', 'phylip')
print(aln)

Output:

SingleLetterAlphabet() alignment with 5 rows and 13 columns
AACGTGGCCACAT Alpha
AAGGTCGCCACAC Beta
CAGTTCGCCACAA Gamma
GAGATTTCCGCCT Delta
GAGATCTCCGCCC Epsilon

DNA calculator with 'identity' model:

calculator = DistanceCalculator('identity')
dm = calculator.get_distance(aln)
print(dm)

Output:

Alpha   0
Beta    0.23076923076923073     0
Gamma   0.3846153846153846      0.23076923076923073     0
Delta   0.5384615384615384      0.5384615384615384      0.5384615384615384      0
Epsilon 0.6153846153846154      0.3846153846153846      0.46153846153846156     0.15384615384615385     0
    Alpha       Beta    Gamma   Delta   Epsilon
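
The 'identity' values above are consistent with the fraction of alignment columns at which two rows differ (Alpha and Beta differ in 3 of 13 columns). A minimal pure-Python sketch of that reading follows; identity_distance is a hypothetical helper written for illustration, not part of Biopython:

```python
def identity_distance(seq1, seq2):
    """Fraction of aligned positions at which two equal-length sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    if not seq1:
        # Mirrors the documented behaviour of returning 1 for an empty sequence.
        return 1.0
    matches = sum(a == b for a, b in zip(seq1, seq2))
    return 1 - matches / len(seq1)


alpha = "AACGTGGCCACAT"
beta = "AAGGTCGCCACAC"
# 3 of 13 columns differ, so this prints roughly 0.2308.
print(identity_distance(alpha, beta))
```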

Protein calculator with 'blosum62' model:

calculator = DistanceCalculator('blosum62')
dm = calculator.get_distance(aln)
print(dm)

Output:

Alpha   0
Beta    0.36904761904761907     0
Gamma   0.49397590361445787     0.25    0
Delta   0.5853658536585367      0.5476190476190477      0.5662650602409638      0
Epsilon 0.7     0.3555555555555555      0.48888888888888893     0.2222222222222222      0
    Alpha       Beta    Gamma   Delta   Epsilon
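
Because the example alignment contains only the letters A, C, G and T, the 'blosum62' model scores them as the amino acids Ala, Cys, Gly and Thr. The printed values are consistent with a distance of 1 - score / max_score, where max_score is the larger of the two self-alignment scores. The sketch below illustrates that reading; the normalisation is inferred from the output above rather than quoted from the implementation, and matrix_distance and BLOSUM62_SUBSET are hypothetical names:

```python
# BLOSUM62 entries for the four letters that occur in the example alignment.
# Only this subset is reproduced; the full matrix covers all amino acids.
BLOSUM62_SUBSET = {
    ("A", "A"): 4, ("C", "C"): 9, ("G", "G"): 6, ("T", "T"): 5,
    ("A", "C"): 0, ("A", "G"): 0, ("A", "T"): 0,
    ("C", "G"): -3, ("C", "T"): -1, ("G", "T"): -2,
}


def score(a, b):
    """Look up a symmetric substitution score for two letters."""
    if (a, b) in BLOSUM62_SUBSET:
        return BLOSUM62_SUBSET[(a, b)]
    return BLOSUM62_SUBSET[(b, a)]


def matrix_distance(seq1, seq2):
    """Assumed normalisation: 1 - pairwise score / best self-alignment score."""
    pair_score = sum(score(a, b) for a, b in zip(seq1, seq2))
    self1 = sum(score(a, a) for a in seq1)
    self2 = sum(score(b, b) for b in seq2)
    max_score = max(self1, self2)
    if max_score == 0:
        return 1.0
    return 1 - pair_score / max_score


alpha = "AACGTGGCCACAT"
beta = "AAGGTCGCCACAC"
gamma = "CAGTTCGCCACAA"
print(matrix_distance(alpha, beta))   # compare with the Beta/Alpha entry above
print(matrix_distance(beta, gamma))   # compare with the Gamma/Beta entry above
```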

Instance Methods

__init__(self, model='identity', skip_letters=None)
Initialize with a distance model.

_pairwise(self, seq1, seq2)
Calculate pairwise distance from two sequences (PRIVATE).

get_distance(self, msa)
Return a DistanceMatrix for MSA object.

_build_protein_matrix(self, subsmat)
Convert matrix from SubsMat format to _Matrix object (PRIVATE).

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __sizeof__, __str__, __subclasshook__

Class Variables
  dna_alphabet = ['A', 'T', 'C', 'G']
  blastn = [[5], [-4, 5], [-4, -4, 5], [-4, -4, -4, 5]]
  trans = [[6], [-5, 6], [-5, -1, 6], [-1, -5, -5, 6]]
  protein_alphabet = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I...
  dna_matrices = {'blastn': [[5], [-4, 5], [-4, -4, 5], [-4, -4,...
  protein_models = ['benner6', 'benner22', 'benner74', 'blosum10...
  protein_matrices = {'benner22': {('A', 'A'): 2.5, ('A', 'C'): ...
  dna_models = ['blastn', 'trans']
  models = ['identity', 'blastn', 'trans', 'benner6', 'benner22'...

Properties

Inherited from object: __class__

Method Details

__init__(self, model='identity', skip_letters=None)
(Constructor)

Initialize with a distance model.
Parameters:
  • model (str) - Name of the model matrix used to calculate distances. The keys of the dna_matrices attribute are the model names available for DNA sequences; the keys of protein_matrices are those available for protein sequences.
Overrides: object.__init__

_pairwise(self, seq1, seq2)

Calculate pairwise distance from two sequences (PRIVATE).

Returns a value between 0 (identical sequences) and 1 (completely different, or seq1 is an empty string).

get_distance(self, msa)

Return a DistanceMatrix for MSA object.
Parameters:
  • msa (MultipleSeqAlignment) - DNA or Protein multiple sequence alignment.
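
The printed matrices in the examples are lower triangular: each row holds the distances to all earlier rows plus a zero on the diagonal. A minimal sketch of assembling such a structure from a pairwise function is given below. It uses plain lists rather than Biopython's DistanceMatrix, and pairwise and lower_triangle are hypothetical helpers:

```python
def pairwise(seq1, seq2):
    """Identity-style distance: share of aligned positions that differ."""
    if not seq1:
        return 1.0
    matches = sum(a == b for a, b in zip(seq1, seq2))
    return 1 - matches / len(seq1)


def lower_triangle(seqs, distance=pairwise):
    """Return rows where row i holds distances from sequence i to sequences 0..i."""
    return [
        [distance(seqs[i], seqs[j]) for j in range(i + 1)]
        for i in range(len(seqs))
    ]


names = ["Alpha", "Beta", "Gamma", "Delta", "Epsilon"]
seqs = [
    "AACGTGGCCACAT",
    "AAGGTCGCCACAC",
    "CAGTTCGCCACAA",
    "GAGATTTCCGCCT",
    "GAGATCTCCGCCC",
]
rows = lower_triangle(seqs)
# Print one tab-separated row per sequence, name first.
for name, row in zip(names, rows):
    print(name, *row, sep="\t")
```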

Class Variable Details

protein_alphabet

Value:
['A',
 'B',
 'C',
 'D',
 'E',
 'F',
 'G',
 'H',
...

dna_matrices

Value:
{'blastn': [[5], [-4, 5], [-4, -4, 5], [-4, -4, -4, 5]],
 'trans': [[6], [-5, 6], [-5, -1, 6], [-1, -5, -5, 6]]}
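
Each DNA matrix is stored as a lower triangle. Assuming its rows and columns follow the order of dna_alphabet (an assumption, though one under which 'trans' scores the transitions A/G and C/T at -1 and all transversions at -5), a score lookup could be sketched as follows; dna_score is a hypothetical helper:

```python
dna_alphabet = ["A", "T", "C", "G"]
dna_matrices = {
    "blastn": [[5], [-4, 5], [-4, -4, 5], [-4, -4, -4, 5]],
    "trans": [[6], [-5, 6], [-5, -1, 6], [-1, -5, -5, 6]],
}


def dna_score(model, a, b):
    """Look up the score for letters a and b in a lower-triangular matrix."""
    i, j = dna_alphabet.index(a), dna_alphabet.index(b)
    if i < j:
        # Only the lower triangle is stored, so order the indices row >= column.
        i, j = j, i
    return dna_matrices[model][i][j]


print(dna_score("trans", "A", "G"))   # transition
print(dna_score("trans", "A", "C"))   # transversion
print(dna_score("blastn", "T", "T"))  # match
```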

protein_models

Value:
['benner6',
 'benner22',
 'benner74',
 'blosum100',
 'blosum30',
 'blosum35',
 'blosum40',
 'blosum45',
...

protein_matrices

Value:
{'benner22': {('A', 'A'): 2.5,
              ('A', 'C'): -1.2,
              ('A', 'P'): 0.8,
              ('A', 'S'): 1.3,
              ('A', 'T'): 1.4,
              ('C', 'C'): 12.6,
              ('D', 'A'): -0.2,
              ('D', 'C'): -3.7,
...

models

Value:
['identity',
 'blastn',
 'trans',
 'benner6',
 'benner22',
 'benner74',
 'blosum100',
 'blosum30',
...