snapatac2.pp.filter_cells

snapatac2.pp.filter_cells(data, min_counts=1000, min_tsse=5.0, max_counts=None, max_tsse=None, inplace=True)

Filter cell outliers based on counts and TSS enrichment scores. For instance, only keep cells with at least min_counts counts and a TSS enrichment score of at least min_tsse. This removes measurement outliers, i.e. “unreliable” observations.

Parameters:
  • data (AnnData) – The (annotated) data matrix of shape n_obs x n_vars. Rows correspond to cells and columns to regions.

  • min_counts (Optional[int]) – Minimum number of counts required for a cell to pass filtering.

  • min_tsse (Optional[float]) – Minimum TSS enrichment score required for a cell to pass filtering.

  • max_counts (Optional[int]) – Maximum number of counts allowed for a cell to pass filtering.

  • max_tsse (Optional[float]) – Maximum TSS enrichment score allowed for a cell to pass filtering.

  • inplace (bool) – Whether to subset the data in place or return the filtering result instead.

Returns:

If inplace = True, directly subsets the data matrix and returns None. Otherwise, returns a boolean index mask with one entry per cell, where True means the cell is kept and False means the cell is removed.

Return type:

np.ndarray | None
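
Example:

The following is a minimal NumPy sketch of the kind of boolean mask described above. It is not SnapATAC2's implementation. The helper name cell_filter_mask and the toy arrays are hypothetical, and the sketch assumes that the bounds are inclusive, that a bound set to None is skipped, and that all active bounds are combined with a logical AND.

```python
from typing import Optional

import numpy as np


def cell_filter_mask(
    counts: np.ndarray,
    tsse: np.ndarray,
    min_counts: Optional[int] = 1000,
    min_tsse: Optional[float] = 5.0,
    max_counts: Optional[int] = None,
    max_tsse: Optional[float] = None,
) -> np.ndarray:
    """Hypothetical helper: True marks a cell that passes every active bound."""
    keep = np.ones(counts.shape[0], dtype=bool)
    # Assumption: a bound of None is ignored; bounds are inclusive.
    if min_counts is not None:
        keep &= counts >= min_counts
    if max_counts is not None:
        keep &= counts <= max_counts
    if min_tsse is not None:
        keep &= tsse >= min_tsse
    if max_tsse is not None:
        keep &= tsse <= max_tsse
    return keep


# Toy per-cell values (made up for illustration only).
counts = np.array([500, 1200, 30000, 8000])
tsse = np.array([9.0, 3.5, 12.0, 7.0])

mask = cell_filter_mask(counts, tsse, max_counts=20000)
kept_counts = counts[mask]  # boolean indexing keeps only the True entries
```

With the documented function, the analogous mask would come from calling filter_cells with inplace=False. The mask can then be inspected, for example to count how many cells would be removed, before any subsetting is applied.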