snapatac2.tl.macs3
- snapatac2.tl.macs3(adata, *, groupby=None, qvalue=0.05, call_broad_peaks=False, broad_cutoff=0.1, replicate=None, replicate_qvalue=None, max_frag_size=None, selections=None, nolambda=False, shift=-100, extsize=200, min_len=None, blacklist=None, key_added='macs3', tempdir=None, inplace=True, n_jobs=8)
Call peaks using MACS3.
- Parameters:
adata (AnnData | AnnDataSet) – The (annotated) data matrix of shape n_obs x n_vars. Rows correspond to cells and columns to regions.
groupby (str | list[str] | None) – Group the cells before peak calling. If a str, groups are obtained from .obs[groupby]. If None, peaks will be called for all cells.
qvalue (float) – qvalue cutoff used in MACS3.
call_broad_peaks (bool) – If True, MACS3 will call broad peaks. Broad peak calling uses two distinct cutoffs to identify broader, weaker peaks (broad_cutoff) and narrower, stronger peaks (qvalue), which are then nested to provide a detailed peak landscape. To picture “nested” peaks, think of a gene structure with regions analogous to exons (strong peaks) and to introns plus UTRs (weak peaks). If you only want to call broader peaks and are not interested in the nested peak structure, use qvalue with a weaker cutoff instead of the call_broad_peaks option.
broad_cutoff (float) – qvalue cutoff used in MACS3 for calling broad peaks.
replicate (str | list[str] | None) – Replicate information. If provided, reproducible peaks will be called for each group.
replicate_qvalue (float | None) – qvalue cutoff used in MACS3 for calling peaks in replicates. This parameter is only used when replicate is provided. Typically it is used to call peaks in replicates with a more lenient cutoff. If not provided, qvalue will be used.
max_frag_size (int | None) – Maximum fragment size. If provided, fragments larger than max_frag_size will not be used in peak calling. This is used in ATAC-seq data to remove fragments that are not from nucleosome-free regions. You can use frag_size_distr to choose a proper value for this parameter.
selections (set[str] | None) – Call peaks for the selected groups only.
nolambda (bool) – If True, MACS3 will use the background lambda as the local lambda. This means MACS3 will not consider the local bias at peak candidate regions.
shift (int) – The shift size in MACS.
extsize (int) – The extension size in MACS.
min_len (int | None) – The minimum length of a called peak. If None, it is set to extsize.
blacklist (Path | None) – Path to the blacklist file in BED format. If provided, regions in the blacklist will be removed.
key_added (str) – .uns key under which to add the peak information.
tempdir (Path | None) – If provided, a temporary directory will be created inside this directory. Otherwise, a temporary directory will be created in the system default temporary directory.
inplace (bool) – Whether to store the result inplace.
n_jobs (int) – Number of processes to use for peak calling.
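The parameters above can be combined as in the following minimal sketch. The helper names build_macs3_kwargs and call_peaks are hypothetical and not part of SnapATAC2, as are the .obs column names "cell_type" and "sample" used in the usage note. The replicate_qvalue of 0.1 is only an illustrative lenient cutoff, not a recommended value. The sketch assumes adata already contains fragment data and the relevant .obs columns.

```python
from typing import Any, Dict, Optional


def build_macs3_kwargs(
    groupby: str,
    replicate: Optional[str] = None,
    broad: bool = False,
    max_frag_size: Optional[int] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: collect keyword arguments for snapatac2.tl.macs3."""
    kwargs: Dict[str, Any] = {
        "groupby": groupby,    # groups are read from .obs[groupby]
        "qvalue": 0.05,        # documented default
        "shift": -100,         # documented default
        "extsize": 200,        # documented default
        "key_added": "macs3",  # results are stored in .uns under this key
        "inplace": True,
    }
    if broad:
        # Nested broad/narrow peaks: broad_cutoff for the weak regions,
        # qvalue for the strong regions.
        kwargs["call_broad_peaks"] = True
        kwargs["broad_cutoff"] = 0.1
    if replicate is not None:
        kwargs["replicate"] = replicate
        # Assumption: an illustrative, more lenient cutoff for replicates.
        kwargs["replicate_qvalue"] = 0.1
    if max_frag_size is not None:
        kwargs["max_frag_size"] = max_frag_size
    return kwargs


def call_peaks(adata, **kwargs):
    """Hypothetical wrapper: run MACS3 in place and return the stored result."""
    # Imported lazily so build_macs3_kwargs can be used without snapatac2.
    import snapatac2 as snap

    snap.tl.macs3(adata, **kwargs)
    return adata.uns[kwargs.get("key_added", "macs3")]
```

A call such as call_peaks(adata, **build_macs3_kwargs("cell_type", replicate="sample")) would then call reproducible peaks per group and read them back from adata.uns['macs3'].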
- Returns:
If inplace=True, it stores the result in adata.uns[key_added]. Otherwise, it returns the result as dataframes.
- Return type:
See also