
Class BaumWelchTrainer

     object --+    
              |    
AbstractTrainer --+
                  |
                 BaumWelchTrainer

Trainer that uses the Baum-Welch algorithm to estimate parameters.

Use this trainer when the state paths behind the training sequences for
an HMM are unknown, and the model parameters must be estimated from the
observed emissions alone.

This uses the Baum-Welch algorithm, first described in:
Baum, L.E. 1972. Inequalities. 3:1-8.
The implementation is based on the description in 'Biological Sequence
Analysis' by Durbin et al., section 3.3.

This algorithm is guaranteed to converge to a local maximum, but not
necessarily to the global maximum, so use with care!

Instance Methods
 
__init__(self, markov_model)
Initialize the trainer.
 
train(self, training_seqs, stopping_criteria, dp_method=<class 'Bio.HMM.DynamicProgramming.ScaledDPAlgorithms'>)
Estimate the parameters using training sequences.
 
update_transitions(self, transition_counts, training_seq, forward_vars, backward_vars, training_seq_prob)
Add the contribution of a new training sequence to the transitions.
 
update_emissions(self, emission_counts, training_seq, forward_vars, backward_vars, training_seq_prob)
Add the contribution of a new training sequence to the emissions.

Inherited from AbstractTrainer: estimate_params, log_likelihood, ml_estimator

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __sizeof__, __str__, __subclasshook__

Properties

Inherited from object: __class__

Method Details

__init__(self, markov_model)
(Constructor)

Initialize the trainer.

Arguments:

o markov_model -- The model we are going to estimate parameters for.
This should already hold initial estimates of the parameters, which
training then refines.

Overrides: object.__init__

train(self, training_seqs, stopping_criteria, dp_method=<class 'Bio.HMM.DynamicProgramming.ScaledDPAlgorithms'>)

Estimate the parameters using training sequences.

The algorithm is taken from Durbin et al., p. 64, which is a good
reference for the details of each step.

Arguments:

o training_seqs -- A list of TrainingSequence objects to be used
for estimating the parameters.

o stopping_criteria -- A function that, when passed the change
in log likelihood and threshold, indicates whether we should stop
the estimation iterations.

o dp_method -- A class specifying the dynamic programming
implementation we should use to calculate the forward and
backward variables. By default, we use the scaling method
(ScaledDPAlgorithms).
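As an illustration of the stopping_criteria argument, here is a minimal
sketch of such a function. It assumes only what is stated above: the
callable receives two positional arguments, the first being the change in
log likelihood. The factory name make_stopping_criteria and the min_change
cutoff are hypothetical, and the second argument is accepted but
deliberately not interpreted.

```python
def make_stopping_criteria(min_change=0.01):
    """Build a two-argument stopping function (hypothetical helper).

    min_change is an assumed cutoff: training stops once the log
    likelihood changes by less than this amount between iterations.
    """
    def stopping_criteria(log_likelihood_change, second_arg):
        # second_arg is received from the trainer but ignored here,
        # because this sketch carries its own cutoff in min_change.
        return log_likelihood_change < min_change

    return stopping_criteria


stop = make_stopping_criteria(min_change=0.01)
```

A function built this way would be passed as the stopping_criteria
argument of train; a smaller min_change means more iterations before
training stops.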

update_transitions(self, transition_counts, training_seq, forward_vars, backward_vars, training_seq_prob)

Add the contribution of a new training sequence to the transitions.

Arguments:

o transition_counts -- A dictionary of the current counts for the
transitions

o training_seq -- The training sequence we are working with

o forward_vars -- Probabilities calculated using the forward
algorithm.

o backward_vars -- Probabilities calculated using the backward
algorithm.

o training_seq_prob -- The probability of the current sequence.

This calculates A_{kl} (the estimated transition counts from state
k to state l) using formula 3.20 in Durbin et al.
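The accumulation of expected transition counts for one sequence can be
sketched as follows. This is not the Bio.HMM implementation: the function
names, the list-of-dicts layout for the forward and backward variables,
and the toy model values are all assumptions made for illustration, and
the variables are computed without scaling to keep the example short.

```python
def forward(seq, states, init, trans, emit):
    """Unscaled forward variables: f[i][k] = P(x_0..x_i, state_i = k)."""
    f = [{k: init[k] * emit[k][seq[0]] for k in states}]
    for x in seq[1:]:
        prev = f[-1]
        f.append({dst: emit[dst][x] * sum(prev[src] * trans[src][dst]
                                          for src in states)
                  for dst in states})
    return f


def backward(seq, states, trans, emit):
    """Unscaled backward variables: b[i][k] = P(x_{i+1}.. | state_i = k)."""
    b = [{k: 1.0 for k in states}]
    for x in reversed(seq[1:]):
        nxt = b[0]
        b.insert(0, {src: sum(trans[src][dst] * emit[dst][x] * nxt[dst]
                              for dst in states)
                     for src in states})
    return b


def add_transition_counts(counts, seq, states, trans, emit, f, b, seq_prob):
    """Add one sequence's expected k -> l transition counts to counts."""
    for i in range(len(seq) - 1):
        for k in states:
            for l in states:
                term = (f[i][k] * trans[k][l]
                        * emit[l][seq[i + 1]] * b[i + 1][l])
                counts[(k, l)] = counts.get((k, l), 0.0) + term / seq_prob
    return counts


# Toy two-state model; every number here is made up for illustration.
states = ["F", "L"]
init = {"F": 0.5, "L": 0.5}
trans = {"F": {"F": 0.9, "L": 0.1}, "L": {"F": 0.2, "L": 0.8}}
emit = {"F": {"H": 0.5, "T": 0.5}, "L": {"H": 0.9, "T": 0.1}}
seq = "HHTHH"

f = forward(seq, states, init, trans, emit)
b = backward(seq, states, trans, emit)
seq_prob = sum(f[-1].values())  # P(x) for a model with no end state
counts = add_transition_counts({}, seq, states, trans, emit, f, b, seq_prob)
```

In this sketch, the expected counts summed over all state pairs equal the
number of transitions in the sequence, len(seq) - 1, which makes a
convenient sanity check.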

update_emissions(self, emission_counts, training_seq, forward_vars, backward_vars, training_seq_prob)

Add the contribution of a new training sequence to the emissions.

Arguments:

o emission_counts -- A dictionary of the current counts for the
emissions

o training_seq -- The training sequence we are working with

o forward_vars -- Probabilities calculated using the forward
algorithm.

o backward_vars -- Probabilities calculated using the backward
algorithm.

o training_seq_prob -- The probability of the current sequence.

This calculates E_{k}(b) (the estimated emission count for
emission letter b from state k) using formula 3.21 in Durbin et al.
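For reference, the quantity accumulated here is usually written as below.
The notation is an assumption following the common textbook convention:
x^j is the j-th training sequence, P(x^j) its probability under the
current model (training_seq_prob), and f_k^j(i) and b_k^j(i) the forward
and backward variables for state k at position i. Note that b denotes
both the emitted letter and the backward variable.

```latex
E_k(b) = \sum_j \frac{1}{P(x^j)} \sum_{\{i \,\mid\, x_i^j = b\}} f_k^j(i)\, b_k^j(i)
```

Each call to update_emissions would add the inner sum for one training
sequence, divided by that sequence's probability.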