snapatac2.pp.harmony

snapatac2.pp.harmony(adata, *, batch, use_rep='X_spectral', use_dims=None, groupby=None, key_added=None, inplace=True, n_jobs=8, **kwargs)

Use harmonypy to integrate different experiments.

Harmony is an algorithm for integrating single-cell data from multiple experiments. This function uses harmonypy, the Python port of Harmony, to integrate single-cell data stored in an AnnData object. Run it after performing dimensionality reduction, because it operates on the low-dimensional representation named by use_rep.

Parameters:
  • adata (AnnData | AnnDataSet | ndarray) – The (annotated) data matrix of shape n_obs x n_vars. Rows correspond to cells and columns to regions.

  • batch (str | list[str]) – The name of the column in adata.obs that differentiates among experiments/batches.

  • use_rep (str) – The name of the field in adata.obsm where the lower dimensional representation is stored.

  • use_dims (int | list[int] | None) – Use these dimensions in use_rep.

  • groupby (str | list[str] | None) – If specified, split the data into groups and perform batch correction on each group separately.

  • key_added (str | None) – If specified, add the result to adata.obsm with this key. Otherwise, it will be stored in adata.obsm[use_rep + "_harmony"].

  • inplace (bool) – Whether to store the result in the AnnData object. If False, the result is returned as a numpy array instead.

  • kwargs – Any additional arguments will be passed to harmonypy.run_harmony().
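The groupby behaviour described above can be pictured with a small NumPy sketch. This is not SnapATAC2's implementation: `center_batches` and `correct_per_group` are hypothetical helpers, and per-batch mean-centering stands in for Harmony purely so the example runs without harmonypy. It illustrates only the control flow: rows are split by group label, each group is corrected independently, and the results are written back in the original cell order.

```python
import numpy as np

def center_batches(embedding, batches):
    """Stand-in correction (NOT Harmony): subtract each batch's mean."""
    corrected = embedding.astype(float).copy()
    for b in np.unique(batches):
        mask = batches == b
        corrected[mask] -= corrected[mask].mean(axis=0)
    return corrected

def correct_per_group(embedding, batches, groups, correct=center_batches):
    """Apply `correct` separately within each group, keeping row order."""
    out = np.empty(embedding.shape, dtype=float)
    for g in np.unique(groups):
        mask = groups == g
        # Only the cells of this group are seen by the correction step.
        out[mask] = correct(embedding[mask], batches[mask])
    return out

# Toy data: 6 cells, 2 dimensions, 2 batches, 2 groups.
embedding = np.array([[1.0, 0.0], [3.0, 0.0], [5.0, 2.0],
                      [7.0, 2.0], [9.0, 4.0], [11.0, 4.0]])
batches = np.array(["a", "b", "a", "b", "a", "b"])
groups = np.array(["g1", "g1", "g1", "g2", "g2", "g2"])

corrected = correct_per_group(embedding, batches, groups)
```

Because each group is handled in isolation, a batch's cells in one group never influence the correction applied to the same batch in another group.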

Returns:

If inplace=True, updates adata with the field adata.obsm[use_rep + "_harmony"] (or adata.obsm[key_added] if key_added is specified), containing the low-dimensional representation adjusted by Harmony so that different experiments are integrated. Otherwise, returns the result as a numpy array.

Return type:

np.ndarray | None
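A minimal usage sketch, under stated assumptions: `harmony_output_key` and `integrate` are hypothetical helpers written for illustration, the batch column name "sample" is assumed, and adata is assumed to already hold a spectral embedding in adata.obsm["X_spectral"]. The `max_iter_harmony` keyword is a harmonypy.run_harmony() option forwarded through **kwargs; check it against your installed harmonypy version.

```python
def harmony_output_key(use_rep="X_spectral", key_added=None):
    """Key under which the corrected embedding is stored in adata.obsm."""
    return key_added if key_added is not None else use_rep + "_harmony"

def integrate(adata, batch_column="sample"):
    """Run Harmony on an existing embedding and return the result key."""
    import snapatac2 as snap  # lazy import; needs snapatac2 and harmonypy

    snap.pp.harmony(
        adata,
        batch=batch_column,    # assumed column name in adata.obs
        use_rep="X_spectral",  # embedding must already exist in adata.obsm
        max_iter_harmony=20,   # forwarded to harmonypy.run_harmony()
    )
    # With inplace=True (the default) the call returns None and the
    # corrected embedding is stored under this key.
    return harmony_output_key("X_spectral")

print(harmony_output_key())  # X_spectral_harmony
```

Downstream steps that accept a use_rep argument can then be pointed at the returned key rather than at the uncorrected embedding.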