snapatac2.metrics.frag_size_distr

snapatac2.metrics.frag_size_distr(adata, *, max_recorded_size=1000, add_key='frag_size_distr', inplace=True, n_jobs=8)

Compute the fragment size distribution of the dataset.

The distribution is computed over the whole dataset, not at the single-cell level. The result is a vector in which the index is the fragment length and the value at that index is the number of fragments of that length. The first position of the vector (index 0) is reserved for fragments longer than max_recorded_size. import_data must be run before calling this function.
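The vector layout described above can be illustrated with plain NumPy. This is only a sketch of the documented layout, not SnapATAC2's implementation: the helper name fragment_size_vector is hypothetical, the output length of max_recorded_size + 1 is an assumption, and the input is assumed to be a list of positive integer fragment lengths.

```python
import numpy as np

def fragment_size_vector(lengths, max_recorded_size=1000):
    """Hypothetical helper mimicking the documented layout (not library code)."""
    lengths = np.asarray(lengths, dtype=np.int64)
    # Fragments longer than the cutoff are redirected to index 0,
    # the position reserved for oversized fragments.
    idx = np.where(lengths > max_recorded_size, 0, lengths)
    # For i in 1..max_recorded_size, element i counts fragments of length i.
    # Assumption: the vector has max_recorded_size + 1 elements.
    return np.bincount(idx, minlength=max_recorded_size + 1)

# Toy input: two fragments of 50 bp, one of 147 bp, one of exactly 1000 bp,
# and two that exceed the cutoff.
distr = fragment_size_vector([50, 50, 147, 1000, 1200, 5000])
print(distr[0], distr[50], distr[147], distr[1000])
```

A fragment whose length equals max_recorded_size is still counted at its own index; only strictly longer fragments fall into index 0.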

Parameters:
  • adata (AnnData | list[AnnData]) – The (annotated) data matrix of shape n_obs x n_vars. Rows correspond to cells and columns to regions. adata can also be a list of AnnData objects, in which case the function is applied to each AnnData object in parallel.

  • max_recorded_size (int) – The maximum fragment size to record in the result. Fragments with length larger than max_recorded_size will be recorded in the first position of the result vector.

  • add_key (str) – Key used to store the result in adata.uns.

  • inplace (bool) – Whether to store the result in adata.uns (True) or return it (False).

  • n_jobs (int) – Number of jobs to run in parallel when adata is a list. If n_jobs=-1, all CPUs will be used.

Returns:

If inplace=True, the result is stored in adata.uns[add_key] and nothing is returned. Otherwise, the result is returned: a single array, or a list of arrays when adata is a list.

Return type:

np.ndarray | list[np.ndarray] | None
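Because index 0 holds the oversized fragments rather than fragments of length 0, it usually needs to be separated before summarizing or plotting the result. The sketch below shows one way to do that. The helper split_overflow is hypothetical, and the array built by hand is a stand-in for what the function would return with inplace=False (or store in adata.uns[add_key]); its length of max_recorded_size + 1 is an assumption.

```python
import numpy as np

def split_overflow(distr):
    """Hypothetical helper: separate the reserved index 0 from per-length counts."""
    distr = np.asarray(distr)
    overflow = int(distr[0])             # fragments longer than max_recorded_size
    lengths = np.arange(1, distr.size)   # fragment lengths 1..max_recorded_size
    counts = distr[1:]                   # counts[i] belongs to lengths[i]
    return lengths, counts, overflow

# Stand-in for a returned distribution (values are made up for illustration).
distr = np.zeros(1001, dtype=np.int64)
distr[0] = 2
distr[[50, 147, 1000]] = [2, 1, 1]

lengths, counts, overflow = split_overflow(distr)
total = counts.sum()
# Proportion of recorded fragments at each length; guard against an empty dataset.
fraction = counts / total if total > 0 else np.zeros(counts.shape)
most_common = int(lengths[np.argmax(counts)])
print(overflow, most_common)
```

When adata is a list, the same helper can be applied to each array in the returned list.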