snapatac2.tl.kmeans

snapatac2.tl.kmeans(adata, n_clusters, n_iterations=-1, random_state=0, use_rep='X_spectral', key_added='kmeans', inplace=True)

Cluster cells into subgroups using the K-means algorithm, a classical algorithm in data mining.
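For readers unfamiliar with K-means, the following is a minimal NumPy sketch of the textbook procedure (Lloyd's algorithm). It is an illustration only and is not SnapATAC2's implementation; the function name kmeans_sketch and the toy data are hypothetical, and the parameter names merely mirror the signature above.

```python
import numpy as np

def kmeans_sketch(X, n_clusters, n_iterations=-1, random_state=0):
    """Textbook Lloyd's algorithm; NOT SnapATAC2's implementation."""
    rng = np.random.default_rng(random_state)
    # Initialize centroids from distinct, randomly chosen data points.
    start = rng.choice(len(X), size=n_clusters, replace=False)
    centroids = X[start].astype(float)
    labels = np.full(len(X), -1)
    step = 0
    # A negative n_iterations means "iterate until assignments stop changing".
    while n_iterations < 0 or step < n_iterations:
        # Assignment step: each point joins its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for k in range(n_clusters):
            members = X[labels == k]
            if len(members) > 0:
                centroids[k] = members.mean(axis=0)
        step += 1
    return labels

# Toy data (hypothetical): two well-separated groups of three points each.
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
labels = kmeans_sketch(X, n_clusters=2)
```

In this sketch, random_state only affects which points seed the centroids, so different seeds can yield different label numbering for the same partition.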

Parameters:

adata (AnnData | AnnDataSet | ndarray) – The annotated data matrix.
n_clusters (int) – Number of clusters to return.
n_iterations (int) – Number of iterations of the K-means algorithm to perform. Positive values above 2 set the total number of iterations; -1 runs the algorithm until it converges.
random_state (int) – Seed for the random initialization of the optimization. Change it to obtain a different initialization.
use_rep (str) – Which data in adata.obsm to use for clustering. Default is “X_spectral”.
key_added (str) – adata.obs key under which to add the cluster labels.
inplace (bool) – Whether to store the result in adata (the default) or return the cluster labels.

Return type:

ndarray | None

Returns:

Adds the following fields to adata:
adata.obs[key_added] – Array of dim (number of samples) that stores the subgroup id ('0', '1', …) for each cell.
adata.uns['kmeans']['params'] – A dict with the values for the parameters n_clusters, random_state, and n_iterations.
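
A minimal usage sketch based on the signature above. It assumes adata already holds an embedding in adata.obsm['X_spectral'] (for example, one computed beforehand with snapatac2.tl.spectral); the helper name kmeans_labels is hypothetical and not part of SnapATAC2.

```python
def kmeans_labels(adata, n_clusters, key_added="kmeans"):
    """Run K-means on adata and return the stored cluster labels.

    Hypothetical helper: assumes adata.obsm['X_spectral'] already exists.
    """
    # Imported lazily so this sketch can be defined without SnapATAC2 installed.
    import snapatac2 as snap

    # With the default inplace=True, labels are written to adata.obs[key_added].
    snap.tl.kmeans(
        adata,
        n_clusters=n_clusters,
        random_state=0,
        use_rep="X_spectral",
        key_added=key_added,
    )
    return adata.obs[key_added]
```

Calling kmeans_labels(adata, n_clusters=10) would then return the per-cell subgroup ids stored under adata.obs['kmeans'].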