Bio.MarkovModel module
A state-emitting MarkovModel.
Note that the terminology used here is similar to that of Manning and Schutze.
Functions:
train_bw        Train a Markov model using the Baum-Welch algorithm.
train_visible   Train a visible Markov model using maximum likelihood estimation (MLE).
find_states     Find a state sequence that explains some observations.
load            Load a MarkovModel.
save            Save a MarkovModel.

Classes:
MarkovModel     Holds the description of a Markov model.
- Bio.MarkovModel.itemindex(values)
Return a dictionary that maps each value to its offset in the sequence.
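A lookup of this kind is typically used to turn state names or alphabet symbols into row and column indices of the parameter matrices. The following is a minimal standalone sketch, not the library's implementation: the name `item_index_sketch` is hypothetical, and it assumes that each value maps to the offset of its first occurrence.

```python
def item_index_sketch(values):
    """Map each value to the offset of its first occurrence in `values`."""
    index = {}
    for offset, value in enumerate(values):
        # setdefault keeps the earliest offset when a value repeats.
        index.setdefault(value, offset)
    return index


states = ["N", "V", "D"]
lookup = item_index_sketch(states)
print(lookup)  # {'N': 0, 'V': 1, 'D': 2}
```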
- class Bio.MarkovModel.MarkovModel(states, alphabet, p_initial=None, p_transition=None, p_emission=None)
Bases: object
Create a state-emitting MarkovModel object.
- __init__(states, alphabet, p_initial=None, p_transition=None, p_emission=None)
Initialize the class.
- __str__()
Create a string representation of the MarkovModel object.
- Bio.MarkovModel.load(handle)
Parse a file handle into a MarkovModel object.
- Bio.MarkovModel.save(mm, handle)
Save MarkovModel object into handle.
- Bio.MarkovModel.train_bw(states, alphabet, training_data, pseudo_initial=None, pseudo_transition=None, pseudo_emission=None, update_fn=None)
Train a MarkovModel using the Baum-Welch algorithm.
states is a list of strings naming each state. alphabet is a list of objects that indicate the allowed outputs. training_data is a list of observations, where each observation is a list of objects from the alphabet.
pseudo_initial, pseudo_transition, and pseudo_emission are optional parameters that you can use to assign pseudo-counts to different matrices. They should be matrices of the appropriate size that contain numbers to add to each parameter matrix, before normalization.
update_fn is an optional callback that takes parameters (iteration, log_likelihood). It is called once per iteration.
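To make the log_likelihood argument of the callback more concrete, here is a minimal pure-Python sketch of the scaled forward algorithm, which computes the log probability of one observation under fixed parameters. It is an illustration only and does not call the library: the function names, the example probabilities, and the assumption that the reported value is a quantity of this kind are all hypothetical. Observations are given as lists of alphabet indices.

```python
import math


def forward_log_likelihood(p_initial, p_transition, p_emission, observation):
    """Log probability of `observation` (a list of alphabet indices)."""
    n = len(p_initial)
    # alpha[i]: probability of the outputs seen so far, ending in state i.
    alpha = [p_initial[i] * p_emission[i][observation[0]] for i in range(n)]
    log_likelihood = 0.0
    for symbol in observation[1:]:
        # Rescale alpha to avoid underflow, accumulating the log of the scale.
        scale = sum(alpha)
        log_likelihood += math.log(scale)
        alpha = [a / scale for a in alpha]
        alpha = [
            sum(alpha[i] * p_transition[i][j] for i in range(n)) * p_emission[j][symbol]
            for j in range(n)
        ]
    return log_likelihood + math.log(sum(alpha))


# Hypothetical callback with the (iteration, log_likelihood) signature.
history = []


def record_progress(iteration, log_likelihood):
    history.append((iteration, log_likelihood))


# Made-up two-state model: rows are states, columns are alphabet symbols.
p_initial = [0.6, 0.4]
p_transition = [[0.7, 0.3], [0.4, 0.6]]
p_emission = [[0.1, 0.9], [0.6, 0.4]]

llik = forward_log_likelihood(p_initial, p_transition, p_emission, [0, 1, 1])
record_progress(0, llik)  # called by hand here, only to show the signature
```

In an actual training loop, a callback like `record_progress` could be used to monitor convergence, since the log-likelihood should not decrease from one Baum-Welch iteration to the next.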
- Bio.MarkovModel.train_visible(states, alphabet, training_data, pseudo_initial=None, pseudo_transition=None, pseudo_emission=None)
Train a visible MarkovModel using maximum likelihood estimates for each of the parameters.
states is a list of strings naming each state. alphabet is a list of objects that indicate the allowed outputs. training_data is a list of (outputs, observed states) pairs, where outputs is a list of emissions from the alphabet and observed states is a list of states drawn from states.
pseudo_initial, pseudo_transition, and pseudo_emission are optional parameters that you can use to assign pseudo-counts to different matrices. They should be matrices of the appropriate size that contain numbers to add to each parameter matrix.
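Because the states are observed, maximum likelihood estimation reduces to counting and normalizing. The sketch below illustrates that idea in plain Python on made-up data; it is not the library's implementation. The name `mle_visible_sketch` is hypothetical, and a single scalar `pseudo` stands in for the per-matrix pseudo-counts as a simplifying assumption.

```python
from collections import Counter


def mle_visible_sketch(states, alphabet, training_data, pseudo=0.0):
    """Estimate initial, transition and emission probabilities by counting."""
    initial, transition, emission = Counter(), Counter(), Counter()
    for outputs, observed_states in training_data:
        initial[observed_states[0]] += 1
        for prev, nxt in zip(observed_states, observed_states[1:]):
            transition[(prev, nxt)] += 1
        for state, symbol in zip(observed_states, outputs):
            emission[(state, symbol)] += 1

    def normalize(counts, keys):
        # Add the pseudo-count to every cell, then scale the row to sum to 1.
        total = sum(counts[k] + pseudo for k in keys)
        return [(counts[k] + pseudo) / total if total else 0.0 for k in keys]

    p_initial = normalize(initial, states)
    p_transition = [normalize(transition, [(s, t) for t in states]) for s in states]
    p_emission = [normalize(emission, [(s, a) for a in alphabet]) for s in states]
    return p_initial, p_transition, p_emission


states = ["rainy", "sunny"]
alphabet = ["walk", "shop"]
training_data = [
    (["walk", "shop", "walk"], ["sunny", "rainy", "sunny"]),
    (["shop", "shop"], ["rainy", "rainy"]),
]
p_initial, p_transition, p_emission = mle_visible_sketch(states, alphabet, training_data)
print(p_initial)  # [0.5, 0.5]
```

With `pseudo=0.0`, any transition or emission never seen in training gets probability zero; a small positive pseudo-count avoids this.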
- Bio.MarkovModel.find_states(markov_model, output)
Find states in the given Markov model output.
Returns a list of (states, score) tuples.
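Finding the most probable state sequence for a given output is commonly done with the Viterbi algorithm. The following pure-Python sketch shows that technique on a made-up two-state model; it is not the library's code. The name `viterbi_sketch` and the example probabilities are hypothetical, and the sketch returns only the single best path with its log probability rather than a list of (states, score) tuples.

```python
import math


def viterbi_sketch(states, alphabet, p_initial, p_transition, p_emission, output):
    """Return (best state path, log probability) for a list of outputs."""

    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    sym = {a: i for i, a in enumerate(alphabet)}
    n = len(states)
    # scores[i]: best log probability of any path ending in state i.
    scores = [log(p_initial[i]) + log(p_emission[i][sym[output[0]]]) for i in range(n)]
    backpointers = []
    for symbol in output[1:]:
        k = sym[symbol]
        new_scores, pointers = [], []
        for j in range(n):
            # Best predecessor state for landing in state j at this step.
            best_i = max(range(n), key=lambda i: scores[i] + log(p_transition[i][j]))
            pointers.append(best_i)
            new_scores.append(
                scores[best_i] + log(p_transition[best_i][j]) + log(p_emission[j][k])
            )
        scores = new_scores
        backpointers.append(pointers)
    # Trace back from the best final state.
    last = max(range(n), key=lambda i: scores[i])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return [states[i] for i in path], scores[last]


states = ["rainy", "sunny"]
alphabet = ["walk", "shop"]
p_initial = [0.6, 0.4]
p_transition = [[0.7, 0.3], [0.4, 0.6]]
p_emission = [[0.1, 0.9], [0.6, 0.4]]

best_path, log_score = viterbi_sketch(
    states, alphabet, p_initial, p_transition, p_emission, ["walk", "shop", "shop"]
)
print(best_path)  # ['sunny', 'rainy', 'rainy']
```

Working in log space keeps the scores numerically stable for long outputs, where a product of probabilities would underflow.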