snapatac2.metrics.frip
- snapatac2.metrics.frip(adata, regions, *, normalized=True, count_as_insertion=False, inplace=True, n_jobs=8)
Add fraction of reads in peaks (FRiP) to the AnnData object.
import_data must be run first in order to use this function.
- Parameters:
adata (AnnData | list[AnnData]) – The (annotated) data matrix of shape n_obs x n_vars. Rows correspond to cells and columns to regions. adata could also be a list of AnnData objects. In this case, the function will be applied to each AnnData object in parallel.
regions (dict[str, Path | list[str]]) – A dictionary containing the peak sets to compute FRiP. The keys are peak set names and the values are either a bed file name or a list of strings representing genomic regions. For example, {"promoter_frac": "promoter.bed", "enhancer_frac": ["chr1:100-200", "chr2:300-400"]}.
normalized (bool) – Whether to normalize the counts by the total number of fragments. If False, the raw number of fragments in peaks will be returned.
count_as_insertion (bool) – Whether to count transposition events instead of fragments. Transposition events are located at both ends of fragments.
inplace (bool) – Whether to add the results to adata.obs or return them as a dictionary.
n_jobs (int) – Number of jobs to run in parallel when adata is a list. If n_jobs=-1, all CPUs will be used.
- Returns:
If inplace=True, directly adds the results to adata.obs. Otherwise, returns a dictionary containing the results.
- Return type:
dict[str, list[float]] | list[dict[str, list[float]]] | None
Examples
>>> import snapatac2 as snap
>>> data = snap.pp.import_data(snap.datasets.pbmc500(downsample=True), chrom_sizes=snap.genome.hg38, sorted_by_barcode=False)
>>> snap.metrics.frip(data, {"peaks_frac": snap.datasets.cre_HEA()})
>>> print(data.obs['peaks_frac'].head())
AAACTGCAGACTCGGA-1    0.715930
AAAGATGCACCTATTT-1    0.697364
AAAGATGCAGATACAA-1    0.713615
AAAGGGCTCGCTCTAC-1    0.678428
AAATGAGAGTCCCGCA-1    0.724910
Name: peaks_frac, dtype: float64
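To make the normalized and count_as_insertion options more concrete, the following is a minimal pure-Python sketch of the quantity being computed. It is not SnapATAC2's implementation. The names parse_region and frip_sketch, the toy fragments, the half-open interval convention, and the "any overlap" rule are all assumptions made for illustration.

```python
# Illustrative sketch only; NOT SnapATAC2's implementation.
# parse_region and frip_sketch are hypothetical helpers, and the toy data is made up.
from collections import defaultdict


def parse_region(region):
    """Parse a 'chrom:start-end' string into (chrom, start, end)."""
    chrom, span = region.rsplit(":", 1)
    start, end = span.split("-")
    return chrom, int(start), int(end)


def frip_sketch(fragments, regions, normalized=True, count_as_insertion=False):
    """Count, per barcode, the units that overlap any region.

    fragments: iterable of (barcode, chrom, start, end), assumed half-open.
    """
    peaks = [parse_region(r) for r in regions]
    in_peaks = defaultdict(int)
    total = defaultdict(int)
    for barcode, chrom, start, end in fragments:
        if count_as_insertion:
            # Assumption: one insertion site at each end of the fragment.
            units = [(start, start + 1), (end - 1, end)]
        else:
            units = [(start, end)]
        for s, e in units:
            total[barcode] += 1
            # Assumption: a unit counts if it overlaps a region by at least 1 bp.
            if any(c == chrom and s < pe and ps < e for c, ps, pe in peaks):
                in_peaks[barcode] += 1
    if normalized:
        # Fraction of units in peaks, relative to all units of that barcode.
        return {b: in_peaks[b] / total[b] for b in total}
    # Raw number of units in peaks.
    return {b: in_peaks[b] for b in total}


regions = ["chr1:100-200", "chr2:300-400"]
fragments = [
    ("cellA", "chr1", 90, 150),
    ("cellA", "chr1", 500, 600),
    ("cellA", "chr2", 100, 350),
    ("cellB", "chr2", 310, 390),
    ("cellB", "chr2", 350, 450),
    ("cellB", "chr1", 0, 50),
]

raw_counts = frip_sketch(fragments, regions, normalized=False)
fractions = frip_sketch(fragments, regions)
insertion_fractions = frip_sketch(fragments, regions, count_as_insertion=True)
```

In this toy data, counting insertions rather than whole fragments lowers the fraction for cellA, because a fragment that spans a region boundary can have only one of its two ends inside the region.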