Chromatin clustering

prody.chromatin.cluster.BayesianGaussianMixture(V, **kwargs)[source]

Performs clustering on V using Gaussian mixture models with variational inference. The function uses sklearn.mixture.BayesianGaussianMixture(). See the scikit-learn documentation for details.

Parameters:
  • V (ndarray) – row-normalized eigenvectors for the purpose of clustering.

  • n_clusters (int) – specifies the number of clusters.
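As a rough illustration of this kind of clustering, the sketch below calls scikit-learn's BayesianGaussianMixture directly on synthetic data rather than going through the ProDy wrapper. The toy array V, the random seeds, and n_components=2 are assumptions made for the example; how the wrapper maps n_clusters onto scikit-learn's arguments is not shown here.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

# Synthetic stand-in for eigenvector rows: two tight groups of points.
rng = np.random.default_rng(0)
V = np.vstack([
    rng.normal(0.0, 0.05, size=(30, 2)) + [1.0, 0.0],
    rng.normal(0.0, 0.05, size=(30, 2)) + [0.0, 1.0],
])
V /= np.linalg.norm(V, axis=1, keepdims=True)  # row-normalize

# n_components acts as an upper bound; variational inference can
# shrink the weight of components the data do not support.
model = BayesianGaussianMixture(n_components=2, random_state=0)
bgm_labels = model.fit_predict(V)  # one integer label per row of V
```

The fitted model.weights_ attribute can be inspected afterwards to see how much weight each component received.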

prody.chromatin.cluster.Discretize(V, **kwargs)[source]

Adapted from discretize(). For copyright information, see LICENSE.rst.

prody.chromatin.cluster.GaussianMixture(V, **kwargs)[source]

Performs clustering on V using Gaussian mixture models. The function uses sklearn.mixture.GaussianMixture(). See the scikit-learn documentation for details.

Parameters:
  • V (ndarray) – row-normalized eigenvectors for the purpose of clustering.

  • n_clusters (int) – specifies the number of clusters.
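A minimal sketch of the underlying scikit-learn call, assuming three groups of row-normalized points. The synthetic V, the seeds, and n_init=5 are illustrative choices, not values taken from ProDy.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Three synthetic groups, each scattered around one coordinate axis.
rng = np.random.default_rng(1)
centers = np.eye(3)
V = np.vstack([c + rng.normal(0.0, 0.05, size=(20, 3)) for c in centers])
V /= np.linalg.norm(V, axis=1, keepdims=True)  # row-normalize

# Several initializations (n_init) reduce the risk of a poor local optimum.
gmm = GaussianMixture(n_components=3, n_init=5, random_state=0)
gmm_labels = gmm.fit_predict(V)
```

Unlike k-means, the fitted mixture also provides soft assignments through gmm.predict_proba(V).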

prody.chromatin.cluster.Hierarchy(V, **kwargs)[source]

Performs hierarchical clustering on V. The function essentially uses two scipy functions: linkage and fcluster. See scipy.cluster.hierarchy.linkage() and scipy.cluster.hierarchy.fcluster() for an explanation of the arguments. Only the arguments that differ from those of scipy are listed here.

Parameters:
  • inconsistent_percentile (double) – if criterion is inconsistent and threshold t is not given (default), then the function will use the percentile specified by this argument as the threshold.

  • n_clusters (int) – specifies the maximal number of clusters. If this argument is given, then the function will automatically set criterion to maxclust and t equal to n_clusters.
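The two behaviours described above can be sketched with scipy alone. This is an approximation of the described logic, not the ProDy source: the synthetic V, the single-linkage method, and the percentile value 99.9 are assumptions for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster, inconsistent

# Two well-separated synthetic groups of points.
rng = np.random.default_rng(2)
V = np.vstack([
    rng.normal(0.0, 0.05, size=(15, 2)) + [1.0, 0.0],
    rng.normal(0.0, 0.05, size=(15, 2)) + [0.0, 1.0],
])

Z = linkage(V, method='single')  # hierarchical merge tree

# Case 1: n_clusters is given, so criterion='maxclust' and t=n_clusters.
n_clusters = 2
labels_maxclust = fcluster(Z, t=n_clusters, criterion='maxclust')

# Case 2: criterion='inconsistent' and no t is given, so the threshold
# is taken as a percentile of the inconsistency coefficients.
inconsistent_percentile = 99.9   # assumed value, for illustration only
R = inconsistent(Z)              # column 3 holds the inconsistency coefficient
t = np.percentile(R[:, 3], inconsistent_percentile)
labels_inconsistent = fcluster(Z, t=t, criterion='inconsistent')
```

Note that fcluster numbers its clusters starting from 1, not 0.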

prody.chromatin.cluster.KMeans(V, **kwargs)[source]

Performs k-means clustering on V. The function uses sklearn.cluster.KMeans(). See the scikit-learn documentation for details.

Parameters:
  • V (ndarray) – row-normalized eigenvectors for the purpose of clustering.

  • n_clusters (int) – specifies the number of clusters.
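For orientation, here is a small sketch of the scikit-learn call that this function is described as using, applied to synthetic row-normalized data. The array V, the seeds, and n_init=10 are assumptions for the example.

```python
import numpy as np
from sklearn.cluster import KMeans

# Two synthetic groups of row-normalized points.
rng = np.random.default_rng(3)
V = np.vstack([
    rng.normal(0.0, 0.05, size=(30, 2)) + [1.0, 0.0],
    rng.normal(0.0, 0.05, size=(30, 2)) + [0.0, 1.0],
])
V /= np.linalg.norm(V, axis=1, keepdims=True)

# n_clusters fixes the number of centroids; n_init restarts guard
# against an unlucky initialization.
km = KMeans(n_clusters=2, n_init=10, random_state=0)
km_labels = km.fit_predict(V)
```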

prody.chromatin.cluster.calcGNMDomains(modes, method=<function Discretize>, **kwargs)[source]

Uses spectral clustering to separate structural domains in chromosomes and proteins.

Parameters:
  • modes (ModeSet) – GNM modes used for segmentation

  • method (func) – Label assignment algorithm used after Laplacian embedding of loci.
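The general idea of spectral clustering with a pluggable label-assignment step can be sketched without ProDy. Everything named here (toy_kirchhoff, assign_by_linkage, spectral_domains) is hypothetical and only mimics the described workflow: take slow modes of a Laplacian-like matrix, row-normalize them, and pass the result to a labeling function.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def toy_kirchhoff(n_domains=3, n_per_domain=10, weak=0.05):
    """Graph Laplacian of dense blocks joined by weak contacts (toy model)."""
    n = n_domains * n_per_domain
    W = np.full((n, n), weak)
    for d in range(n_domains):
        s = slice(d * n_per_domain, (d + 1) * n_per_domain)
        W[s, s] = 1.0
    np.fill_diagonal(W, 0.0)
    return np.diag(W.sum(axis=1)) - W

def assign_by_linkage(V, n_clusters=2):
    """Hypothetical label-assignment step based on average linkage."""
    Z = linkage(V, method='average')
    return fcluster(Z, t=n_clusters, criterion='maxclust')

def spectral_domains(eigvecs, method=assign_by_linkage, **kwargs):
    """Row-normalize the embedding, then delegate labeling to `method`."""
    norms = np.linalg.norm(eigvecs, axis=1, keepdims=True)
    V = eigvecs / np.where(norms == 0, 1.0, norms)
    return method(V, **kwargs)

K = toy_kirchhoff()
eigvals, eigvecs = np.linalg.eigh(K)   # eigenvalues in ascending order
slow_modes = eigvecs[:, 1:3]           # skip the trivial zero mode
domain_labels = spectral_domains(slow_modes, n_clusters=3)
```

Passing a different callable as method swaps the labeling algorithm without touching the embedding step, which is the design the method argument above suggests.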

prody.chromatin.cluster.showLinkage(V, **kwargs)[source]

Shows the dendrogram of hierarchical clustering on V. See scipy.cluster.hierarchy.dendrogram() for details.

Parameters:
  • V (ndarray) – row-normalized eigenvectors for the purpose of clustering.
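A sketch of the scipy calls behind such a dendrogram, assuming synthetic data and single linkage. Here no_plot=True is used so the example runs without matplotlib; it computes the tree layout without drawing it.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram

# Two synthetic groups of points standing in for eigenvector rows.
rng = np.random.default_rng(4)
V = np.vstack([
    rng.normal(0.0, 0.05, size=(8, 2)) + [1.0, 0.0],
    rng.normal(0.0, 0.05, size=(8, 2)) + [0.0, 1.0],
])

Z = linkage(V, method='single')

# With no_plot=False (and matplotlib installed) the tree is drawn
# on the current axes instead of only being computed.
tree = dendrogram(Z, no_plot=True)
leaf_order = tree['leaves']  # row indices of V in left-to-right leaf order
```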