snapatac2.tl.spectral

snapatac2.tl.spectral(adata, n_comps=30, features='selected', random_state=0, sample_size=None, sample_method='random', chunk_size=20000, distance_metric='cosine', weighted_by_sd=True, feature_weights=None, inplace=True)

Perform dimension reduction using Laplacian Eigenmaps.

Convert the cell-by-feature count matrix into lower dimensional representations using the spectrum of the normalized graph Laplacian defined by pairwise similarity between cells.
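The idea can be illustrated with a small dense NumPy sketch. This is not the library's implementation (which is matrix-free for cosine similarity); the function name `toy_spectral_embedding`, the choice to drop self-similarity, and the random test matrix are assumptions made only for illustration.

```python
import numpy as np

def toy_spectral_embedding(counts, n_comps=2):
    """Dense Laplacian Eigenmaps on a small cell-by-feature matrix (illustration only)."""
    X = np.asarray(counts, dtype=float)
    # Cosine similarity between cells: normalise rows, then take dot products.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # keep all-zero rows as zeros instead of dividing by 0
    Xn = X / norms
    S = Xn @ Xn.T
    np.fill_diagonal(S, 0.0)  # assumption of this sketch: ignore self-similarity
    # Symmetrically normalised similarity D^{-1/2} S D^{-1/2}; the normalised
    # graph Laplacian is the identity minus this matrix, so its smallest
    # eigenvalues correspond to the largest eigenvalues computed here.
    degree = S.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(np.where(degree > 0, degree, 1.0))
    A = inv_sqrt[:, None] * S * inv_sqrt[None, :]
    evals, evecs = np.linalg.eigh(A)
    order = np.argsort(evals)[::-1]  # largest eigenvalue first
    keep = order[1:n_comps + 1]      # skip the leading, trivial eigenvector
    return evals[keep], evecs[:, keep]

rng = np.random.default_rng(0)
counts = rng.poisson(1.0, size=(50, 200))  # 50 toy "cells", 200 toy "features"
evals, embedding = toy_spectral_embedding(counts, n_comps=5)
print(embedding.shape)  # (50, 5)
```

Each row of `embedding` is the low-dimensional representation of one cell. The dense similarity matrix here needs memory quadratic in the number of cells, so the sketch is only suitable for small inputs.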

Note

When cosine similarity is used as the similarity metric, the matrix-free spectral embedding algorithm is used, which scales linearly with the number of cells. The memory usage is roughly \(2 \times \text{input\_size}\). For other similarity metrics, the time and space complexity is \(O(N^2)\), where \(N\) is the minimum of the total number of cells and sample_size. The memory usage in bytes is given by \(N^2 \times 8 \times 2\). For example, when \(N = 10{,}000\), this formula gives \(1.6 \times 10^9\) bytes (about 1.5 GiB). When sample_size is set, the Nystrom algorithm is used to approximate the embedding.
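The memory formula in this note can be evaluated ahead of time with a short helper. The names `effective_n` and `dense_memory_bytes` are hypothetical, and converting a fractional sample_size to a cell count by rounding is an assumption of this sketch, not documented behavior.

```python
def effective_n(n_cells, sample_size=None):
    """N from the note: the smaller of the cell count and the sample size."""
    if sample_size is None:
        return n_cells
    if isinstance(sample_size, float) and 0 < sample_size <= 1:
        # Assumption: a fraction is turned into a cell count by rounding.
        sample_size = round(sample_size * n_cells)
    return min(n_cells, int(sample_size))

def dense_memory_bytes(n_cells, sample_size=None):
    """Estimated bytes for non-cosine metrics: N^2 * 8 * 2."""
    n = effective_n(n_cells, sample_size)
    return n * n * 8 * 2

# 200,000 cells with a 10% sample gives N = 20,000.
estimate = dense_memory_bytes(200_000, sample_size=0.1)
print(f"{estimate / 1e9:.1f} GB")
```

Because the estimate grows with the square of N, halving sample_size cuts the estimated memory to a quarter.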

Parameters:
  • adata (AnnData | AnnDataSet) – AnnData or AnnDataSet object.

  • n_comps (int) – Number of dimensions to keep.

  • features (UnionType[str, ndarray, None]) – Boolean index mask. True means that the feature is kept. False means the feature is removed.

  • random_state (int) – Seed for the random number generator.

  • sample_size (Union[int, float, None]) – Sample size used in the Nystrom method. It could be either an integer indicating the number of cells to sample or a real value from 0 to 1 indicating the fraction of cells to sample.

  • chunk_size (int) – Chunk size used in the Nystrom method.

  • distance_metric (Literal['jaccard', 'cosine']) – Distance metric to use: “jaccard” or “cosine”.

  • weighted_by_sd (bool) – Whether to weight the resulting eigenvectors by the square root of the eigenvalues.

  • inplace (bool) – Whether to store the result in the anndata object.
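The weighting described for weighted_by_sd can be pictured as scaling each eigenvector column by the square root of its eigenvalue. The sketch below is one reading of that description, not the library's code; the helper name `weight_by_sqrt_eigenvalues` and the clipping of negative values are assumptions.

```python
import numpy as np

def weight_by_sqrt_eigenvalues(evecs, evals):
    """Scale column j of evecs by sqrt(evals[j])."""
    # Assumption: clip at zero so small negative eigenvalues caused by
    # round-off do not turn into NaN under the square root.
    return evecs * np.sqrt(np.clip(evals, 0.0, None))

evals = np.array([0.81, 0.25])
evecs = np.array([[1.0, 2.0],
                  [3.0, 4.0]])
weighted = weight_by_sqrt_eigenvalues(evecs, evals)
print(weighted)  # first column scaled by 0.9, second by 0.5
```

Under this weighting, components with larger eigenvalues contribute more to distances computed between cells in the embedding.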

Returns:

If inplace=True, the spectral embedding is stored in adata.obsm["X_spectral"] and the eigenvalues in adata.uns["spectral_eigenvalue"]. Otherwise, the result is returned as NumPy arrays.

Return type:

tuple[np.ndarray, np.ndarray] | None