Module MaxEntropy

Maximum Entropy code.

Uses Improved Iterative Scaling.

Classes
  MaxEntropy
Holds information for a Maximum Entropy classifier.
Functions
calculate(me, observation)
Calculate the log of the probability for each class.
Returns: list of log probs

classify(me, observation)
Classify an observation into a class.
Returns: class

_eval_feature_fn(fn, xs, classes)
Evaluate a feature function on every instance of the training set and class.
Returns: dict of values

_calc_empirical_expects(xs, ys, classes, features)
Calculate the expectation of each function from the data.
Returns: list of expectations

_calc_model_expects(xs, classes, features, alphas)
Calculate the expectation of each feature from the model.
Returns: list of expectations

_calc_p_class_given_x(xs, classes, features, alphas)
Calculate P(y|x), where y is the class and x is an instance from the training set.
Returns: matrix

_calc_f_sharp(N, nclasses, features)
Returns: matrix of f sharp values.

_iis_solve_delta(N, feature, f_sharp, empirical, prob_yx, max_newton_iterations, newton_converge)

_train_iis(xs, classes, features, f_sharp, alphas, e_empirical, max_newton_iterations, newton_converge)
Do one iteration of hill climbing to find better alphas (PRIVATE).

train(training_set, results, feature_fns, update_fn=None, max_iis_iterations=10000, iis_converge=1e-05, max_newton_iterations=100, newton_converge=1e-10)
Train a maximum entropy classifier, returns MaxEntropy object.
Variables
  __package__ = 'Bio'
Function Details

calculate(me, observation)

Calculate the log of the probability for each class.  me is a
MaxEntropy object that has been trained.  observation is a vector
representing the observed data.  The return value is a list of
unnormalized log probabilities for each class.

Returns:
list of log probs
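
To make "unnormalized log probabilities" concrete, the sketch below scores each class by summing the feature values weighted by the model's alphas, then picks the highest-scoring class. The ToyModel container and its attribute names (classes, alphas, feature_fns) are assumptions made for this example and are not taken from the module's source; the feature functions and weights are invented.

```python
from dataclasses import dataclass, field


@dataclass
class ToyModel:
    """Hypothetical stand-in for a trained model (attribute names are assumed)."""
    classes: list = field(default_factory=list)
    alphas: list = field(default_factory=list)
    feature_fns: list = field(default_factory=list)


def calculate_sketch(model, observation):
    """Return one unnormalized log probability per class."""
    scores = []
    for klass in model.classes:
        score = 0.0
        # Each feature contributes its weight when it fires for (observation, class).
        for fn, alpha in zip(model.feature_fns, model.alphas):
            score += alpha * fn(observation, klass)
        scores.append(score)
    return scores


def classify_sketch(model, observation):
    """Pick the class whose unnormalized log probability is largest."""
    scores = calculate_sketch(model, observation)
    best = max(range(len(scores)), key=scores.__getitem__)
    return model.classes[best]


# Two invented binary features over (observation, class) pairs.
def f_red_is_yes(obs, klass):
    return 1 if obs[0] == "red" and klass == "yes" else 0


def f_blue_is_no(obs, klass):
    return 1 if obs[0] == "blue" and klass == "no" else 0


model = ToyModel(classes=["no", "yes"], alphas=[1.5, 0.5],
                 feature_fns=[f_red_is_yes, f_blue_is_no])
scores = calculate_sketch(model, ("red",))
label = classify_sketch(model, ("red",))
```

Because the scores are unnormalized, they are only meaningful relative to one another; exponentiating and dividing by their sum would turn them into probabilities.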

_eval_feature_fn(fn, xs, classes)

Evaluate a feature function on every (training instance, class)
pair.  fn is a callback function that takes two parameters: a
training instance and a class.  Return a dictionary mapping
(training set index, class index) -> non-zero value.  Values of 0
are not stored in the dictionary.

Returns:
dict of values
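
The sparse dictionary described above can be sketched as follows. This is an illustrative reimplementation based only on the description, not the module's code; the function name, the example feature, and the data are invented.

```python
def eval_feature_fn_sketch(fn, xs, classes):
    """Evaluate fn on every (instance, class) pair, keeping only non-zero values."""
    values = {}
    for i, x in enumerate(xs):
        for j, klass in enumerate(classes):
            value = fn(x, klass)
            if value != 0:
                # Key is (training set index, class index).
                values[(i, j)] = value
    return values


def is_long_and_a(x, klass):
    """Invented feature: fires for strings longer than 3 characters in class 'A'."""
    return 1 if len(x) > 3 and klass == "A" else 0


xs = ["abcd", "ab", "abcde"]
classes = ["A", "B"]
sparse = eval_feature_fn_sketch(is_long_and_a, xs, classes)
```

Storing only non-zero entries keeps memory use proportional to the number of times a feature fires rather than to the full instances-by-classes grid.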

_calc_empirical_expects(xs, ys, classes, features)

Calculate the expectation of each feature function from the data.
These expectations are the constraints for the maximum entropy
distribution.  Return a list of expectations, parallel to the list
of features.

Returns:
list of expectations
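
A minimal sketch of the empirical expectation, (1/N) times the sum of f(x_i, y_i) over the training set. It assumes each entry of features is a sparse dictionary keyed by (training set index, class index), which this page does not state explicitly; the names and data are invented.

```python
def calc_empirical_expects_sketch(xs, ys, classes, features):
    """Average each feature over the observed (instance, class) pairs."""
    class_index = {klass: j for j, klass in enumerate(classes)}
    n = len(xs)
    expects = []
    for feature in features:
        total = 0.0
        for i in range(n):
            # Only the class actually observed for instance i contributes.
            total += feature.get((i, class_index[ys[i]]), 0)
        expects.append(total / n)
    return expects


xs = ["w", "x", "y", "z"]
ys = ["A", "B", "A", "A"]
classes = ["A", "B"]
features = [
    {(0, 0): 1, (1, 0): 1, (2, 0): 1},  # assumed sparse form of feature 0
    {(1, 1): 1},                        # assumed sparse form of feature 1
]
empirical = calc_empirical_expects_sketch(xs, ys, classes, features)
```

Note that the entry (1, 0) in the first feature is ignored, because instance 1 was observed with class "B", not class "A".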

_calc_model_expects(xs, classes, features, alphas)

Calculate the expectation of each feature from the model.  This is
not used in maximum entropy training, but it is useful for
debugging.

Returns:
list of expectations
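
For comparison with the empirical expectation, the model expectation weights each feature value by the model's P(y|x) and averages over the training instances. The sketch below takes the probability matrix as an input to stay self-contained, which differs from the documented signature; the sparse-dictionary feature format and all names are assumptions.

```python
def calc_model_expects_sketch(n, features, p_yx):
    """Return (1/n) * sum over (i, j) of P(class j | instance i) * f(i, j)."""
    expects = []
    for feature in features:
        total = 0.0
        # Zero-valued entries are absent from the dictionary, so they add nothing.
        for (i, j), value in feature.items():
            total += p_yx[i][j] * value
        expects.append(total / n)
    return expects


# Invented probabilities for 2 instances and 2 classes; each row sums to 1.
p_yx = [[0.75, 0.25],
        [0.5, 0.5]]
features = [{(0, 0): 1, (1, 0): 1}, {(1, 1): 1}]
model_expects = calc_model_expects_sketch(2, features, p_yx)
```

When training has converged, these values should be close to the empirical expectations, which is why such a function is handy as a debugging check.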

_calc_p_class_given_x(xs, classes, features, alphas)

Calculate P(y|x), where y is the class and x is an instance from
the training set.  Return an XS x CLASSES matrix of probabilities,
with one row per training instance and one column per class.

Returns:
matrix
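
One way to picture this matrix: each row holds the exponentiated, alpha-weighted feature sums for one training instance, normalized so the row sums to 1. The sketch below uses plain lists and takes the matrix dimensions directly rather than xs and classes; that simplification, the sparse feature format, and the names are assumptions for illustration.

```python
import math


def calc_p_class_given_x_sketch(n, nclasses, features, alphas):
    """Return an n x nclasses list of lists holding P(class j | instance i)."""
    # Accumulate sum_k alpha_k * f_k(i, j) for every (instance, class) pair.
    scores = [[0.0] * nclasses for _ in range(n)]
    for feature, alpha in zip(features, alphas):
        for (i, j), value in feature.items():
            scores[i][j] += alpha * value

    probs = []
    for row in scores:
        # Subtracting the row maximum avoids overflow in exp().
        top = max(row)
        exps = [math.exp(s - top) for s in row]
        z = sum(exps)
        probs.append([e / z for e in exps])
    return probs


features = [{(0, 0): 1}]   # one feature that fires only for instance 0, class 0
alphas = [math.log(3)]     # invented weight chosen to give round numbers
p = calc_p_class_given_x_sketch(2, 2, features, alphas)
```

Instance 1 triggers no features, so its row is uniform; instance 0 favours class 0 by a factor of 3.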

train(training_set, results, feature_fns, update_fn=None, max_iis_iterations=10000, iis_converge=1e-05, max_newton_iterations=100, newton_converge=1e-10)

Train a maximum entropy classifier on a training set and return a
MaxEntropy object.

training_set is a list of observations.  results is a list of the
class assignments, one per observation.  feature_fns is a list of
feature functions: callbacks that take an observation and a class
and return 1 or 0.  update_fn is an optional callback (default
None) that is called at each training iteration and is passed a
MaxEntropy object encapsulating the current state of the training.
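
The shapes of the arguments can be illustrated with a small, made-up data set. The observations, class labels, feature functions, and the report_progress callback below are all invented for this example; only the argument names and the train signature come from this page, and the call itself is left as a comment so the block runs without the module installed.

```python
# Each observation is a vector; here a (colour, size) tuple.
training_set = [("red", "big"), ("red", "small"), ("blue", "big")]
# One class assignment per observation.
results = ["yes", "yes", "no"]


def red_and_yes(observation, klass):
    """Feature function: 1 if the colour is red and the class is 'yes', else 0."""
    return 1 if observation[0] == "red" and klass == "yes" else 0


def blue_and_no(observation, klass):
    """Feature function: 1 if the colour is blue and the class is 'no', else 0."""
    return 1 if observation[0] == "blue" and klass == "no" else 0


feature_fns = [red_and_yes, blue_and_no]

snapshots = []


def report_progress(model):
    """Hypothetical update_fn: remember the object passed at each iteration."""
    snapshots.append(model)


# With the module importable, the documented signature suggests a call such as:
#     me = train(training_set, results, feature_fns, update_fn=report_progress)
```

Keeping feature functions binary and tied to a specific class, as here, matches the "return 1 or 0" contract described above.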

The maximum number of iterations and the convergence criterion for IIS
are given by max_iis_iterations and iis_converge, respectively, while
max_newton_iterations and newton_converge are the maximum number
of iterations and the convergence criterion for Newton's method.
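
To show how the two Newton parameters might come into play, here is a sketch of solving the textbook IIS update equation for one feature: find delta such that (1/N) * sum of P(y|x) * f(x, y) * exp(delta * f#(x, y)) equals the empirical expectation of f. This page does not spell out that equation, so its use here is an assumption, as are the function name, the sparse feature format, and the example numbers.

```python
import math


def iis_solve_delta_sketch(n, feature, f_sharp, empirical, p_yx,
                           max_newton_iterations=100, newton_converge=1e-10):
    """Solve for one feature's weight update with Newton's method.

    Assumes the feature has at least one non-zero entry, so the
    derivative is never zero.
    """
    delta = 0.0
    for _ in range(max_newton_iterations):
        g = 0.0
        dg = 0.0
        for (i, j), value in feature.items():
            term = p_yx[i][j] * value * math.exp(delta * f_sharp[i][j])
            g += term
            dg += term * f_sharp[i][j]
        # g(delta) = empirical - model-side sum; dg is its derivative.
        g = empirical - g / n
        dg = -dg / n
        step = g / dg
        delta -= step
        if abs(step) < newton_converge:
            return delta
    raise RuntimeError("Newton's method did not converge")


# One instance, two classes, one feature firing for class 0 only.
delta = iis_solve_delta_sketch(
    n=1,
    feature={(0, 0): 1},
    f_sharp=[[1, 1]],      # total feature count per (instance, class) pair
    empirical=1.0,
    p_yx=[[0.5, 0.5]],
)
```

In this toy case the equation reduces to 0.5 * exp(delta) = 1, so the iteration settles on delta = ln 2.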