Package Bio :: Module MarkovModel

Module MarkovModel


A state-emitting MarkovModel.

Note that the terminology used here is similar to that of Manning and Schutze.

Functions:
train_bw        Train a MarkovModel using the Baum-Welch algorithm.
train_visible   Train a visible MarkovModel using MLE.
find_states     Find a state sequence that explains some observations.
load            Load a MarkovModel.
save            Save a MarkovModel.

Classes:
MarkovModel     Holds the description of a Markov model.

Classes
  MarkovModel
Create a state-emitting MarkovModel object.
Functions

itemindex(values)
Return a dictionary mapping each value to its offset in the sequence.

_readline_and_check_start(handle, start)
Read the first line and check that it begins with the correct start (PRIVATE).

load(handle)
Parse a file handle into a MarkovModel object.

save(mm, handle)
Save a MarkovModel object into a handle.

train_bw(states, alphabet, training_data, pseudo_initial=None, pseudo_transition=None, pseudo_emission=None, update_fn=None)
Train a MarkovModel using the Baum-Welch algorithm.

_baum_welch(N, M, training_outputs, p_initial=None, p_transition=None, p_emission=None, pseudo_initial=None, pseudo_transition=None, pseudo_emission=None, update_fn=None)
Implement the Baum-Welch algorithm to estimate unknown parameters of the MarkovModel object (PRIVATE).

_baum_welch_one(N, M, outputs, lp_initial, lp_transition, lp_emission, lpseudo_initial, lpseudo_transition, lpseudo_emission)
Execute one step of the Baum-Welch algorithm (PRIVATE).

_forward(N, T, lp_initial, lp_transition, lp_emission, outputs)
Implement the forward algorithm (PRIVATE).

_backward(N, T, lp_transition, lp_emission, outputs)
Implement the backward algorithm (PRIVATE).

train_visible(states, alphabet, training_data, pseudo_initial=None, pseudo_transition=None, pseudo_emission=None)
Train a visible MarkovModel using maximum likelihood estimates for each of the parameters.

_mle(N, M, training_outputs, training_states, pseudo_initial, pseudo_transition, pseudo_emission)
Implement the maximum likelihood estimation algorithm (PRIVATE).

_argmaxes(vector, allowance=None)
Return the indices of the maximum values along the vector (PRIVATE).

find_states(markov_model, output)
Find states in the given Markov model output.

_viterbi(N, lp_initial, lp_transition, lp_emission, output)
Implement the Viterbi algorithm to find the most likely states for a given input (PRIVATE).

_normalize(matrix)
Normalize a matrix object (PRIVATE).

_uniform_norm(shape)
Return a normalized uniform matrix (PRIVATE).

_random_norm(shape)
Return a normalized random matrix (PRIVATE).

_copy_and_check(matrix, desired_shape)
Copy a matrix, check its dimensions, and normalize it (PRIVATE).

_logsum(matrix)
Implement logsum for a matrix object (PRIVATE).

_logvecadd(logvec1, logvec2)
Implement a log sum for two vector objects (PRIVATE).

_exp_logsum(numbers)
Return the exponential of a logsum (PRIVATE).
Variables
  logaddexp = <ufunc 'logaddexp'>
  VERY_SMALL_NUMBER = 1e-300
  LOG0 = -690.77552789821368
  MAX_ITERATIONS = 1000
  __package__ = 'Bio'
Function Details

train_bw(states, alphabet, training_data, pseudo_initial=None, pseudo_transition=None, pseudo_emission=None, update_fn=None)


Train a MarkovModel using the Baum-Welch algorithm.

states is a list of strings that name each state. alphabet is a list of objects giving the allowed outputs. training_data is a list of observations; each observation is a list of objects from the alphabet.

pseudo_initial, pseudo_transition, and pseudo_emission are optional parameters that you can use to assign pseudo-counts to different matrices. They should be matrices of the appropriate size that contain numbers to add to each parameter matrix, before normalization.

update_fn is an optional callback that takes parameters (iteration, log_likelihood). It is called once per iteration.
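To make the role of pseudo-counts concrete, here is a minimal sketch (not code from this module; the count and pseudo-count values are invented) of adding a pseudo-count matrix to observed counts before normalization:

```python
# Illustrative sketch (not Bio.MarkovModel code): how a pseudo-count
# matrix is added to observed counts before row normalization.

def normalize_with_pseudocounts(counts, pseudo):
    """Add pseudo-counts to a count matrix, then normalize each row."""
    result = []
    for row, prow in zip(counts, pseudo):
        combined = [c + p for c, p in zip(row, prow)]
        total = sum(combined)
        result.append([x / total for x in combined])
    return result

# Observed transition counts for two hypothetical states.
counts = [[8, 0],   # state 0 was never observed going to state 1
          [2, 2]]
pseudo = [[1, 1],   # pseudo-counts keep zero counts from collapsing
          [1, 1]]   # to zero probability
probs = normalize_with_pseudocounts(counts, pseudo)
print(probs)  # [[0.9, 0.1], [0.5, 0.5]]
```

A pseudo-count of 1 in every cell (Laplace smoothing) keeps transitions that never occur in the training data from being assigned zero probability.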

_baum_welch_one(N, M, outputs, lp_initial, lp_transition, lp_emission, lpseudo_initial, lpseudo_transition, lpseudo_emission)


Execute one step for Baum-Welch algorithm (PRIVATE).

Do one iteration of Baum-Welch based on a sequence of outputs, updating lp_initial, lp_transition, and lp_emission in place.

_forward(N, T, lp_initial, lp_transition, lp_emission, outputs)


Implement forward algorithm (PRIVATE).

Calculate an N x (T+1) matrix, where the last column contains the total probability of the output.
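The matrix layout described above can be sketched in plain-probability Python (the model values below are invented for illustration; the actual _forward works on log probabilities):

```python
def forward(p_initial, p_transition, p_emission, outputs):
    """Plain-probability forward pass: f[i][t] is the probability of
    emitting outputs[:t] and ending in state i at time t."""
    N = len(p_initial)
    T = len(outputs)
    # N x (T+1) matrix; column 0 holds the initial distribution.
    f = [[0.0] * (T + 1) for _ in range(N)]
    for i in range(N):
        f[i][0] = p_initial[i]
    for t in range(T):
        for j in range(N):
            f[j][t + 1] = sum(
                f[i][t] * p_transition[i][j] for i in range(N)
            ) * p_emission[j][outputs[t]]
    return f

# Two hypothetical states emitting two symbols (0 and 1).
p_initial = [0.5, 0.5]
p_transition = [[0.9, 0.1], [0.1, 0.9]]
p_emission = [[0.8, 0.2], [0.2, 0.8]]
f = forward(p_initial, p_transition, p_emission, [0, 1])
total = sum(f[i][-1] for i in range(2))  # total probability of the output
```

Working in plain probabilities underflows for long sequences, which is why the module keeps log probabilities (see LOG0 and logaddexp in the Variables section).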

train_visible(states, alphabet, training_data, pseudo_initial=None, pseudo_transition=None, pseudo_emission=None)


Train a visible MarkovModel using maximum likelihood estimates for each of the parameters.

states is a list of strings that name each state. alphabet is a list of objects giving the allowed outputs. training_data is a list of (outputs, observed states) pairs, where outputs is a list of emissions from the alphabet and observed states is a list of states drawn from states.

pseudo_initial, pseudo_transition, and pseudo_emission are optional parameters that you can use to assign pseudo-counts to different matrices. They should be matrices of the appropriate size that contain numbers to add to each parameter matrix.
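Because the states are visible, training reduces to counting. Here is a minimal MLE sketch for the transition matrix only (the state names and sequences are invented; the module's _mle also estimates initial and emission parameters and supports pseudo-counts):

```python
from collections import defaultdict

def mle_transitions(state_sequences, states):
    """Count observed transitions and normalize each row (no pseudo-counts)."""
    counts = {s: defaultdict(float) for s in states}
    for seq in state_sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1.0
    probs = {}
    for s in states:
        total = sum(counts[s].values())
        probs[s] = {t: (counts[s][t] / total if total else 0.0) for t in states}
    return probs

# Hypothetical observed state sequences.
seqs = [["rain", "rain", "sun"], ["sun", "sun"]]
p = mle_transitions(seqs, ["rain", "sun"])
print(p["rain"])  # {'rain': 0.5, 'sun': 0.5}
```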

find_states(markov_model, output)


Find states in the given Markov model output.

Returns a list of (states, score) tuples.
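Under the hood, find_states relies on the Viterbi algorithm (via _viterbi). A self-contained, plain-probability Viterbi sketch, with invented model values (the module's implementation works in log space and can return several tied paths):

```python
def viterbi(p_initial, p_transition, p_emission, outputs):
    """Return (probability, path) for the single most likely state path."""
    N = len(p_initial)
    # best[j]: (probability, path) of the best path ending in state j.
    best = [(p_initial[j] * p_emission[j][outputs[0]], [j]) for j in range(N)]
    for out in outputs[1:]:
        best = [
            max(
                (best[i][0] * p_transition[i][j] * p_emission[j][out],
                 best[i][1] + [j])
                for i in range(N)
            )
            for j in range(N)
        ]
    return max(best)

# Two hypothetical states emitting two symbols (0 and 1).
p_initial = [0.6, 0.4]
p_transition = [[0.7, 0.3], [0.4, 0.6]]
p_emission = [[0.9, 0.1], [0.2, 0.8]]
prob, path = viterbi(p_initial, p_transition, p_emission, [0, 0, 1])
print(path)  # [0, 0, 1]
```

Note that find_states returns a list of (states, score) tuples rather than a single path; this sketch keeps only one best path for brevity.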