Chromatin clustering

prody.chromatin.cluster.BayesianGaussianMixture(V, **kwargs)[source]

Performs clustering on V using Gaussian mixture models with variational inference. The function uses sklearn.mixture.BayesianGaussianMixture(). See the scikit-learn documentation for details.

Parameters:

V (ndarray) – row-normalized eigenvectors for the purpose of clustering.
n_clusters (int) – specifies the number of clusters.

prody.chromatin.cluster.Discretize(V, **kwargs)[source]: Adapted from discretize(). Copyright please see LICENSE.rst.

prody.chromatin.cluster.GaussianMixture(V, **kwargs)[source]

Performs clustering on V using Gaussian mixture models. The function uses sklearn.mixture.GaussianMixture(). See the scikit-learn documentation for details.

Parameters:

V (ndarray) – row-normalized eigenvectors for the purpose of clustering.
n_clusters (int) – specifies the number of clusters.
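
As a rough illustration of the underlying scikit-learn call, the sketch below fits a Gaussian mixture to a synthetic stand-in for V. The data, the variable names, and the mapping of n_clusters onto scikit-learn's n_components argument are assumptions made for this example; it is not ProDy's own implementation.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic stand-in for V (an assumption for this sketch): two groups of
# rows scattered around two different directions.
rng = np.random.default_rng(0)
group_a = np.array([1.0, 0.0]) + 0.05 * rng.standard_normal((20, 2))
group_b = np.array([0.0, 1.0]) + 0.05 * rng.standard_normal((20, 2))
V = np.vstack([group_a, group_b])

# Row-normalize so that every row has unit Euclidean length.
V = V / np.linalg.norm(V, axis=1, keepdims=True)

# In scikit-learn the number of clusters is called n_components.
gmm = GaussianMixture(n_components=2, random_state=0)
labels = gmm.fit_predict(V)  # one integer label per row of V
```

The label values themselves are arbitrary; only the grouping of rows is meaningful.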

prody.chromatin.cluster.Hierarchy(V, **kwargs)[source]

Performs hierarchical clustering on V. The function essentially uses two scipy functions: linkage and fcluster. See scipy.cluster.hierarchy.linkage() and scipy.cluster.hierarchy.fcluster() for an explanation of the arguments. Only the arguments that differ from those of scipy are listed here.

Parameters:

V (ndarray) – row-normalized eigenvectors for the purpose of clustering.
inconsistent_percentile (double) – if the clustering criterion for scipy.cluster.hierarchy.fcluster() is inconsistent and the threshold t is not given (default), the function uses the percentile specified by this argument as the threshold.
n_clusters (int) – specifies the maximal number of clusters. If this argument is given, the function automatically sets criterion to maxclust and t to n_clusters.
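
The two threshold choices described above can be sketched directly with scipy. The synthetic data, the single-linkage method, and the reading of inconsistent_percentile as a percentile of scipy's inconsistency coefficients are assumptions made for this example and may differ from what the function does internally.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster, inconsistent

# Synthetic stand-in for V (an assumption): three groups of unit-length rows.
rng = np.random.default_rng(2)
V = np.vstack([c + 0.05 * rng.standard_normal((10, 3)) for c in np.eye(3)])
V /= np.linalg.norm(V, axis=1, keepdims=True)

# Build the hierarchical clustering tree.
Z = linkage(V, method='single', metric='euclidean')

# Case 1: n_clusters is given, so criterion is 'maxclust' and t = n_clusters.
n_clusters = 3
labels_max = fcluster(Z, t=n_clusters, criterion='maxclust')

# Case 2: 'inconsistent' criterion with no explicit t. Assumed here: the
# threshold is the requested percentile of the inconsistency coefficients
# (last column of the matrix returned by scipy's inconsistent()).
inconsistent_percentile = 99.9
R = inconsistent(Z)
t = np.percentile(R[:, 3], inconsistent_percentile)
labels_inc = fcluster(Z, t=t, criterion='inconsistent')
```

Note that fcluster() numbers clusters from 1, and that maxclust sets an upper bound on the number of clusters rather than an exact count.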

prody.chromatin.cluster.KMeans(V, **kwargs)[source]

Performs k-means clustering on V. The function uses sklearn.cluster.KMeans(). See the scikit-learn documentation for details.

Parameters:

V (ndarray) – row-normalized eigenvectors for the purpose of clustering.
n_clusters (int) – specifies the number of clusters.
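
A minimal sketch of the underlying scikit-learn call, run on a synthetic stand-in for V. The data and the n_init and random_state settings are assumptions chosen to make the example reproducible, not defaults of the ProDy function.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in for V (an assumption): three groups of rows scattered
# around the three coordinate axes, then row-normalized to unit length.
rng = np.random.default_rng(1)
V = np.vstack([c + 0.05 * rng.standard_normal((15, 3)) for c in np.eye(3)])
V /= np.linalg.norm(V, axis=1, keepdims=True)

# Several restarts (n_init) reduce the sensitivity of k-means to its
# random initialization.
km = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = km.fit_predict(V)  # one integer label per row of V
```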

prody.chromatin.cluster.calcGNMDomains(modes, method=<function Discretize>, **kwargs)[source]

Uses spectral clustering to separate structural domains in chromosomes and proteins.

Parameters:

modes (ModeSet) – GNM modes used for segmentation
method (func) – Label assignment algorithm used after Laplacian embedding of loci.
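
To make the idea of Laplacian embedding followed by label assignment concrete, here is a NumPy-only sketch on a toy contact graph. It does not call ProDy: the graph, the use of a single mode, and the sign-based label assignment are all assumptions made for illustration, standing in for the GNM modes and the method argument.

```python
import numpy as np

# Toy contact map (an assumption): two fully connected blocks of n nodes
# joined by a single contact between node n-1 and node n.
n = 6
A = np.zeros((2 * n, 2 * n))
A[:n, :n] = 1.0
A[n:, n:] = 1.0
np.fill_diagonal(A, 0.0)
A[n - 1, n] = A[n, n - 1] = 1.0

# Kirchhoff (graph Laplacian) matrix: degrees on the diagonal minus contacts.
K = np.diag(A.sum(axis=1)) - A

# eigh returns eigenvalues in ascending order; index 0 is the trivial
# zero mode, so the embedding starts at index 1.
eigvals, eigvecs = np.linalg.eigh(K)
V = eigvecs[:, 1:2]  # slowest nontrivial mode only

# Row-normalize the embedding, as expected by the clustering functions.
V = V / np.linalg.norm(V, axis=1, keepdims=True)

# Simplest possible label assignment for one mode: split by sign.
labels = (V[:, 0] > 0).astype(int)
```

With more than one mode, the sign split would be replaced by one of the label assignment functions documented above.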

prody.chromatin.cluster.showLinkage(V, **kwargs)[source]

Shows the dendrogram of hierarchical clustering on V. See scipy.cluster.hierarchy.dendrogram() for details.

Parameters:

V (ndarray) – row-normalized eigenvectors for the purpose of clustering.
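
The scipy call behind the dendrogram can be sketched as follows. The random data and the single-linkage method are assumptions for this example, and no_plot=True is used so that the layout is computed without opening a figure; removing it draws the tree with matplotlib.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram

# Synthetic stand-in for V (an assumption): 12 random unit-length rows.
rng = np.random.default_rng(3)
V = rng.standard_normal((12, 3))
V /= np.linalg.norm(V, axis=1, keepdims=True)

# Build the tree, then compute the dendrogram layout.
Z = linkage(V, method='single')
tree = dendrogram(Z, no_plot=True)

# Row indices of V in the left-to-right order of the dendrogram leaves.
leaf_order = tree['leaves']
```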