snapatac2.pp.scrublet
- snapatac2.pp.scrublet(adata, features='selected', n_comps=15, sim_doublet_ratio=2.0, expected_doublet_rate=0.1, n_neighbors=None, use_approx_neighbors=True, random_state=0, inplace=True, n_jobs=8, verbose=True)
Compute probability of being a doublet using the scrublet algorithm.
- Parameters:
  - adata (AnnData | list[AnnData]) – The (annotated) data matrix of shape n_obs x n_vars. Rows correspond to cells and columns to regions. adata can also be a list of AnnData objects. In this case, the function will be applied to each AnnData object in parallel.
  - features (str | ndarray | None) – Boolean index mask, where True means that the feature is kept, and False means the feature is removed.
  - n_comps (int) – Number of principal components (PCs).
  - sim_doublet_ratio (float) – Number of doublets to simulate relative to the number of observed cells.
  - expected_doublet_rate (float) – Expected doublet rate.
  - n_neighbors (Optional[int]) – Number of neighbors used to construct the KNN graph of observed cells and simulated doublets. If None, this is set to round(0.5 * sqrt(n_cells)).
  - use_approx_neighbors – Whether to use approximate nearest-neighbor search.
  - random_state (int) – Random state.
  - inplace (bool) – Whether to update the AnnData object in place.
  - n_jobs (int) – Number of jobs to run in parallel.
  - verbose (bool) – Whether to print progress messages.
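The two size-dependent settings above can be worked out by hand. The following is a minimal sketch of that arithmetic, not the library's implementation: the helper names `default_n_neighbors` and `n_simulated_doublets` are hypothetical, and the documentation does not state whether `n_cells` counts only observed cells or also the simulated doublets, so the helper simply takes whatever count it is given. Truncating the simulated-doublet count to an integer is also an assumption.

```python
import math


def default_n_neighbors(n_cells: int) -> int:
    """Documented default when n_neighbors=None: round(0.5 * sqrt(n_cells))."""
    return round(0.5 * math.sqrt(n_cells))


def n_simulated_doublets(n_observed: int, sim_doublet_ratio: float = 2.0) -> int:
    """Number of simulated doublets, relative to the number of observed cells."""
    # Assumption: the product is truncated to an integer count.
    return int(sim_doublet_ratio * n_observed)


n_observed = 10_000  # hypothetical dataset size
print(default_n_neighbors(n_observed))   # neighbors used for the KNN graph
print(n_simulated_doublets(n_observed))  # doublets simulated at the default ratio
```

With the default `sim_doublet_ratio=2.0`, twice as many doublets are simulated as there are observed cells, so the KNN graph is built over roughly three times the original number of points.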
- Returns:
- if inplace = True, it updates adata with the following fields:
  - adata.obs["doublet_probability"]: probability of being a doublet
  - adata.obs["doublet_score"]: doublet score
- if
- Return type:
tuple[np.ndarray, np.ndarray] | None
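Once the function has run with `inplace=True`, the per-cell probabilities are available in `adata.obs["doublet_probability"]`. The sketch below shows one way such values might be turned into a doublet call. The helper `flag_doublets` and the 0.5 cutoff are illustrative assumptions, not part of the documented API, and the list of probabilities is stand-in data rather than real output.

```python
def flag_doublets(probabilities, threshold=0.5):
    """Return one boolean per cell: True where the probability exceeds threshold."""
    # Works on any iterable of numbers, e.g. a list, NumPy array, or pandas Series.
    return [float(p) > threshold for p in probabilities]


# Stand-in values; with real data this would be adata.obs["doublet_probability"].
probs = [0.02, 0.91, 0.40, 0.75]

is_doublet = flag_doublets(probs)  # hypothetical cutoff of 0.5
print(is_doublet)
print(sum(is_doublet), "of", len(probs), "cells flagged")
```

A suitable cutoff depends on the dataset and on the expected doublet rate, so the value used here should be treated as a placeholder.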