snapatac2.pp.select_features

snapatac2.pp.select_features(adata, min_cells=1, most_variable=1000000, whitelist=None, blacklist=None, inplace=True)

Perform feature selection.

Note

This function does not subset the data. It only computes a feature mask, which downstream functions use to generate submatrices on the fly.
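To illustrate what "generating submatrices on the fly" from a mask means, here is a minimal NumPy sketch. The matrix `X` and the mask `selected` are made-up toy values, not output of this function; the sketch only shows that applying a boolean mask leaves the full matrix untouched.

```python
import numpy as np

# Toy cell-by-feature count matrix: 4 cells (rows) x 5 regions (columns).
X = np.array([
    [1, 0, 0, 2, 0],
    [0, 0, 1, 1, 0],
    [3, 0, 0, 1, 0],
    [0, 0, 0, 4, 1],
])

# Hypothetical boolean mask of the kind stored in .var['selected'].
selected = np.array([True, False, True, True, False])

# The full matrix keeps its shape; a submatrix is built only when needed.
submatrix = X[:, selected]

print(X.shape)
print(submatrix.shape)
```

Because the mask is kept separately from the data, it can be recomputed with different parameters without reloading or copying the original matrix.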

Parameters
  • adata (AnnData | AnnDataSet) – The (annotated) data matrix of shape n_obs x n_vars. Rows correspond to cells and columns to regions.

  • min_cells (int) – Minimum number of cells in which a feature must be present for it to be kept.

  • most_variable (int | float | None) – If None, do not perform feature selection using the most variable features.

  • whitelist (Path | None) – A user-provided BED file containing genome-wide whitelist regions. Features that overlap these regions are retained.

  • blacklist (Path | None) – A user-provided BED file containing genome-wide blacklist regions. Features that overlap these regions are removed.

  • inplace (bool) – Whether to store the result in adata (True) or return it (False).
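The blacklist rule above (drop every feature that overlaps a listed region) can be sketched in plain Python. This is an illustration only, not the library's implementation: the helper names `parse_region`, `overlaps`, and `blacklist_mask` are hypothetical, feature names are assumed to look like `chrom:start-end`, and intervals are treated as half-open, as in the BED format.

```python
import numpy as np

def parse_region(name):
    """Parse a 'chrom:start-end' string into (chrom, start, end)."""
    chrom, span = name.split(":")
    start, end = span.split("-")
    return chrom, int(start), int(end)

def overlaps(a, b):
    """True if two half-open intervals (chrom, start, end) share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def blacklist_mask(feature_names, blacklist_regions):
    """Boolean mask that is False for features overlapping a blacklist region."""
    features = [parse_region(n) for n in feature_names]
    return np.array([
        not any(overlaps(f, b) for b in blacklist_regions)
        for f in features
    ])

# Hypothetical feature names and a single blacklist region.
feature_names = ["chr1:0-500", "chr1:500-1000", "chr2:0-500"]
blacklist_regions = [("chr1", 400, 450)]

mask = blacklist_mask(feature_names, blacklist_regions)
print(mask)
```

Only the first feature overlaps the blacklist region, so it is the only one masked out. The quadratic scan is fine for a toy example; a real genome-wide comparison would normally use an interval index.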

Returns

If inplace = False, return a boolean mask over the features, where True means the feature is kept and False means it is removed. Otherwise, store this mask in .var['selected'] and return None.

Return type

np.ndarray | None
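The two return modes can be mimicked with a small stand-in. The function `toy_select_features` below is hypothetical and is not the library's code: it assumes that `min_cells` counts the cells with a nonzero entry for each feature, and it uses a plain dict in place of `.var`.

```python
import numpy as np

def toy_select_features(X, var, min_cells=1, inplace=True):
    """Toy stand-in: keep features seen in at least `min_cells` cells."""
    # Assumed meaning of min_cells: number of cells with a nonzero entry.
    n_cells = np.count_nonzero(X, axis=0)
    mask = n_cells >= min_cells
    if inplace:
        var["selected"] = mask  # store the mask, return nothing
        return None
    return mask                 # leave `var` untouched, return the mask

# Toy matrix: 3 cells (rows) x 4 features (columns).
X = np.array([
    [1, 0, 0, 2],
    [0, 0, 1, 1],
    [3, 0, 0, 1],
])
var = {}

mask = toy_select_features(X, var, min_cells=2, inplace=False)
result = toy_select_features(X, var, min_cells=2, inplace=True)

print(mask)
print(result)
print(var["selected"])
```

With inplace=False the caller receives the mask and decides what to do with it; with inplace=True the mask is attached to the feature annotations so later steps can find it.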