Code for doing k-nearest-neighbors classification.
k-nearest neighbors is a supervised learning algorithm that classifies a new observation based on the classes of the points in its surrounding neighborhood.
- Glossary:
distance The distance between two points in the feature space.
weight The importance given to each point for classification.
- Classes:
kNN Holds information for a nearest neighbors classifier.
- Functions:
train Train a new kNN classifier.
calculate Calculate the probabilities of each class, given an observation.
classify Classify an observation into a class.
- Weighting Functions:
equal_weight Every example is given a weight of 1.
- class Bio.kNN.kNN
Holds information necessary to do nearest neighbors classification.
classes Set of the possible classes.
xs List of the neighbors.
ys List of the classes that the neighbors belong to.
k Number of neighbors to look at.
Initialize the class.
- Bio.kNN.equal_weight(x, y)
Return the integer 1 (dummy function for equal weighting).
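The weight-function contract described on this page (a callable that takes the observation and a training example and returns a weight) can be illustrated with a minimal pure-Python sketch. Both functions below are stand-ins written for this example: `equal_weight_sketch` mirrors the documented behavior of returning one, and `inverse_distance_weight` is a hypothetical alternative that is not part of Bio.kNN.

```python
import math


def equal_weight_sketch(x, y):
    """Stand-in mirroring the documented behavior: every example weighs 1."""
    return 1


def inverse_distance_weight(x, y, eps=1e-9):
    """Hypothetical weight function (not in Bio.kNN): closer examples weigh more."""
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    # eps is an assumed guard against division by zero for identical points.
    return 1.0 / (dist + eps)


w_equal = equal_weight_sketch([0.0, 0.0], [3.0, 4.0])
w_near = inverse_distance_weight([0.0, 0.0], [1.0, 0.0])
w_far = inverse_distance_weight([0.0, 0.0], [3.0, 4.0])
```

With a distance-based weight, a neighbor at distance 1 contributes more to its class than a neighbor at distance 5, whereas equal weighting counts both the same.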
- Bio.kNN.train(xs, ys, k, typecode=None)
Train a k nearest neighbors classifier on a training set.
xs is a list of observations and ys is a list of the class assignments. Thus, xs and ys should contain the same number of elements. k is the number of neighbors that should be examined when doing the classification.
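As a rough illustration of what training amounts to for this kind of classifier, the sketch below simply stores the training set and k in a container with the attributes listed for the kNN class. `KNNSketch` and `train_sketch` are hypothetical names, not the Bio.kNN implementation, and the `typecode` argument is left out because this page does not describe it.

```python
class KNNSketch:
    """Hypothetical container mirroring the documented kNN attributes."""

    def __init__(self):
        self.classes = set()  # set of the possible classes
        self.xs = []          # list of the neighbors
        self.ys = []          # classes that the neighbors belong to
        self.k = None         # number of neighbors to look at


def train_sketch(xs, ys, k):
    """Memorize the training set; there is no further fitting step here."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must contain the same number of elements")
    knn = KNNSketch()
    knn.classes = set(ys)
    knn.xs = [list(x) for x in xs]
    knn.ys = list(ys)
    knn.k = k
    return knn


model = train_sketch([[0, 0], [1, 1], [5, 5]], ["a", "a", "b"], k=3)
```

The length check is an assumption added for the sketch; it reflects the requirement above that xs and ys contain the same number of elements.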
- Bio.kNN.calculate(knn, x, weight_fn=None, distance_fn=None)
Calculate the probability for each class.
x is the observed data.
weight_fn is an optional function that takes x and a training example, and returns a weight.
distance_fn is an optional function that takes two points and returns the distance between them. If distance_fn is None (the default), the Euclidean distance is used.
Returns a dictionary mapping each class to the weight given to that class.
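The calculation described above can be sketched in pure Python as follows. This is an illustration under stated assumptions, not the library's code: `calculate_sketch` and `euclidean` are hypothetical names, the function takes the training data directly instead of a kNN object, and the returned values are summed weights that are not normalized.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two points of equal length."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def calculate_sketch(xs, ys, k, x, weight_fn=None, distance_fn=None):
    """Sum the weights of the k nearest training examples per class."""
    if weight_fn is None:
        weight_fn = lambda obs, example: 1  # equal weighting by default
    if distance_fn is None:
        distance_fn = euclidean
    # Rank training examples by distance to x and keep the k closest.
    nearest = sorted(range(len(xs)), key=lambda i: distance_fn(x, xs[i]))[:k]
    weights = {c: 0.0 for c in set(ys)}
    for i in nearest:
        weights[ys[i]] += weight_fn(x, xs[i])
    return weights


xs = [[0, 0], [0, 1], [1, 0], [5, 5], [6, 5]]
ys = ["a", "a", "a", "b", "b"]
weights = calculate_sketch(xs, ys, 3, [0.2, 0.2])
```

Here the three nearest examples to the observation all belong to class "a", so that class receives the entire weight and "b" receives none.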
- Bio.kNN.classify(knn, x, weight_fn=None, distance_fn=None)
Classify an observation into a class.
If not specified, weight_fn will give all neighbors equal weight. distance_fn is an optional function that takes two points and returns the distance between them. If distance_fn is None (the default), the Euclidean distance is used.
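To show how a custom distance function fits in, here is a small self-contained sketch of the classification step: pick the class with the largest vote among the k nearest examples. `classify_sketch` and `manhattan` are hypothetical names used only for this example, and the sketch assumes equal weighting of neighbors; it is not the Bio.kNN implementation.

```python
def manhattan(a, b):
    """Hypothetical custom distance: sum of absolute coordinate differences."""
    return sum(abs(p - q) for p, q in zip(a, b))


def classify_sketch(xs, ys, k, x, distance_fn):
    """Return the class with the largest vote among the k nearest examples."""
    nearest = sorted(range(len(xs)), key=lambda i: distance_fn(x, xs[i]))[:k]
    votes = {}
    for i in nearest:
        votes[ys[i]] = votes.get(ys[i], 0) + 1  # equal weight per neighbor
    return max(votes, key=votes.get)


xs = [[0, 0], [1, 0], [4, 4], [5, 5], [5, 4]]
ys = ["a", "a", "b", "b", "b"]
label_far = classify_sketch(xs, ys, 3, [4.5, 4.5], manhattan)
label_near = classify_sketch(xs, ys, 3, [0.5, 0.0], manhattan)
```

Any callable that takes two points and returns a number can play the role of the distance function, which is the same contract the `distance_fn` argument describes.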