treecluster(data,
mask=None,
weight=None,
transpose=False,
method='m',
dist='e',
distancematrix=None)

Perform hierarchical clustering, and return a Tree object.
This function implements the pairwise single, complete, centroid, and
average linkage hierarchical clustering methods.
 Keyword arguments:
data: nrows x ncolumns array containing the data values.
mask: nrows x ncolumns array of integers, showing which data are
missing. If mask[i][j]==0, then data[i][j] is missing.
weight: the weights to be used when calculating distances.
transpose:
 if False, rows are clustered;
 if True, columns are clustered.
dist: specifies the distance function to be used:
 dist == 'e': Euclidean distance
 dist == 'b': City Block distance
 dist == 'c': Pearson correlation
 dist == 'a': absolute value of the correlation
 dist == 'u': uncentered correlation
 dist == 'x': absolute uncentered correlation
 dist == 's': Spearman's rank correlation
 dist == 'k': Kendall's tau
method: specifies which linkage method is used:
 method == 's': Single pairwise linkage
 method == 'm': Complete (maximum) pairwise linkage (default)
 method == 'c': Centroid linkage
 method == 'a': Average pairwise linkage
distancematrix: The distance matrix between the items. There are
three ways in which you can pass a distance matrix:
1. a 2D Numerical Python array (in which only the left-lower
part of the array will be accessed);
2. a 1D Numerical Python array containing the distances
consecutively;
3. a list of rows containing the lower-triangular part of
the distance matrix.
Examples are:
>>> from numpy import array
>>> distance = array([[0.0, 1.1, 2.3],
...                   [1.1, 0.0, 4.5],
...                   [2.3, 4.5, 0.0]])
>>> distance = array([1.1, 2.3, 4.5])
>>> distance = [array([]),
...             array([1.1]),
...             array([2.3, 4.5])]
These three correspond to the same distance matrix.
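The relationship between the three forms can be sketched in plain Python. This is only an illustration: the helper names to_ragged and to_flat are hypothetical and not part of this module, plain lists stand in for Numerical Python arrays, and the row-by-row ordering of the 1D form is inferred from the examples above.

```python
def to_ragged(full):
    """Return the rows of the strictly lower-triangular part of a square matrix."""
    # Row i keeps only its first i entries, i.e. the distances to items 0..i-1.
    return [list(row[:i]) for i, row in enumerate(full)]


def to_flat(full):
    """Return the lower-triangular distances consecutively, row by row."""
    return [value for row in to_ragged(full) for value in row]


full = [[0.0, 1.1, 2.3],
        [1.1, 0.0, 4.5],
        [2.3, 4.5, 0.0]]

ragged = to_ragged(full)  # list-of-rows form (way 3)
flat = to_flat(full)      # consecutive 1D form (way 2)
```

Because only the lower-left part of the 2D form is read, the diagonal and the upper-right entries of full play no role in either conversion.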
PLEASE NOTE:
As the treecluster routine may shuffle the values in the
distance matrix as part of the clustering algorithm, be sure
to save this array in a different variable before calling
treecluster if you need it later.
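A minimal sketch of saving the distance matrix first, assuming the list-of-rows form. The function shuffle_in_place is a hypothetical stand-in for any routine that modifies its argument; it is not treecluster and does not reproduce what treecluster does to the values.

```python
import copy


def shuffle_in_place(rows):
    """Hypothetical stand-in: reorder the values inside each row in place."""
    for row in rows:
        row.reverse()


distance = [[], [1.1], [2.3, 4.5]]

# A deep copy is needed for the list-of-rows form: a shallow copy would
# still share the inner rows with the original.
saved = copy.deepcopy(distance)

shuffle_in_place(distance)
```

After the call, saved still holds the original values while distance has been altered. For a single Numerical Python array, the array's own copy() method serves the same purpose.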
Either data or distancematrix should be None. If distancematrix is None,
the hierarchical clustering solution is calculated from the values stored
in the argument data. If data is None, the hierarchical clustering solution
is instead calculated from the distance matrix. Pairwise centroid-linkage
clustering can be performed only from the data values and not from the
distance matrix. Pairwise single-, maximum-, and average-linkage clustering
can be calculated from the data values or from the distance matrix.
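The rules above can be expressed as a small pre-check that runs before clustering. This is a sketch only: check_cluster_inputs is a hypothetical helper, not part of this module, and it reads "either data or distancematrix should be None" as "exactly one of the two is supplied", which is an assumption.

```python
def check_cluster_inputs(data, distancematrix, method="m"):
    """Return which input will be used, or raise ValueError if the rules are broken."""
    # Assumption: exactly one of data and distancematrix is supplied.
    if (data is None) == (distancematrix is None):
        raise ValueError("supply exactly one of data and distancematrix")
    if method not in ("s", "m", "c", "a"):
        raise ValueError("method must be one of 's', 'm', 'c', 'a'")
    # Centroid linkage needs the data values themselves.
    if method == "c" and data is None:
        raise ValueError("centroid linkage cannot use a distance matrix")
    return "data" if distancematrix is None else "distancematrix"


source_for_centroid = check_cluster_inputs([[1.0, 2.0], [3.0, 4.0]], None, "c")
source_for_average = check_cluster_inputs(None, [[], [1.1], [2.3, 4.5]], "a")
```

Requesting method 'c' together with a distance matrix raises ValueError in this sketch, mirroring the restriction on centroid linkage described above.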
Return value:
treecluster returns a Tree object describing the hierarchical clustering
result. See the description of the Tree class for more information.
