Preprocessing: pp

BAM/Fragment file processing

pp.make_fragment_file(bam_file, output_file)

Convert a BAM file to a fragment file.

pp.import_data(fragment_file, chrom_sizes, *)

Import fragment files and compute basic QC metrics.
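To make "basic QC metrics" concrete, here is a minimal sketch of one such metric, the number of fragments per cell barcode. It is an illustration only and not the library's implementation; the helper name `count_fragments_per_barcode` and the tab-separated column layout (chrom, start, end, barcode, optional count) are assumptions.

```python
from collections import Counter


def count_fragments_per_barcode(lines):
    """Count fragment records per cell barcode.

    Assumes tab-separated columns: chrom, start, end, barcode[, count].
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        barcode = line.split("\t")[3]
        counts[barcode] += 1
    return counts


# Toy input standing in for the contents of a fragment file.
example_lines = [
    "chr1\t100\t250\tAAAC\t1",
    "chr1\t300\t420\tAAAC\t2",
    "chr2\t50\t180\tTTTG\t1",
]
fragment_counts = count_fragments_per_barcode(example_lines)
```

In a real workflow the lines would come from a (usually gzip-compressed) fragment file rather than an in-memory list.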

Matrix operations

pp.add_tile_matrix(adata, *[, bin_size, ...])

Generate a cell-by-bin count matrix.
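The idea behind a bin (tile) matrix can be sketched in a few lines: split the genome into fixed-width bins and count, per cell, how many insertion sites fall into each bin. This is a conceptual sketch, not the library's code. The helper `tile_counts`, the rule that each fragment contributes both of its ends, and the bin size used here are all assumptions chosen for illustration.

```python
from collections import defaultdict


def tile_counts(fragments, bin_size=500):
    """Return {barcode: {(chrom, bin_index): count}} for fixed-width bins.

    Each fragment contributes its two ends; the end coordinate is
    treated as exclusive. Both choices are assumptions of this sketch.
    """
    matrix = defaultdict(lambda: defaultdict(int))
    for chrom, start, end, barcode in fragments:
        for pos in (start, end - 1):
            matrix[barcode][(chrom, pos // bin_size)] += 1
    return matrix


fragments = [
    ("chr1", 100, 250, "AAAC"),
    ("chr1", 480, 620, "AAAC"),
    ("chr1", 900, 1100, "TTTG"),
]
counts = tile_counts(fragments, bin_size=500)
```

A nested dictionary keeps the sketch short; real cell-by-bin matrices are very sparse and are normally stored in a sparse matrix format.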

pp.make_peak_matrix(adata, *[, use_rep, ...])

Generate a cell-by-peak count matrix.

pp.make_gene_matrix(adata, gene_anno, *[, ...])

Generate a cell-by-gene activity matrix.

pp.filter_cells(data[, min_counts, ...])

Filter cell outliers based on counts and numbers of genes expressed.

pp.select_features(adata[, n_features, ...])

Perform feature selection by selecting the most accessible features across all cells, unless max_iter > 1.
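Selecting the most accessible features amounts to ranking features by their total counts across cells and keeping the top ones. The sketch below shows only that basic ranking on a toy dense matrix. The helper `top_accessible_features` is hypothetical, and any filtering or iterative refinement the library may apply is not modeled.

```python
def top_accessible_features(matrix, n_features):
    """Return the column indices of the n_features most accessible features.

    matrix is a list of rows (cells), each a list of per-feature counts.
    """
    totals = [sum(column) for column in zip(*matrix)]
    ranked = sorted(range(len(totals)), key=lambda j: totals[j], reverse=True)
    return sorted(ranked[:n_features])


toy_matrix = [
    [1, 0, 3, 0],
    [0, 0, 2, 1],
    [1, 0, 1, 0],
]
selected = top_accessible_features(toy_matrix, n_features=2)
```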

pp.knn(adata[, n_neighbors, use_dims, ...])

Compute a neighborhood graph of observations.
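A neighborhood graph links each observation to its nearest neighbors in some low-dimensional representation. The following brute-force sketch shows the idea with Euclidean distance. The function `knn_graph` and the toy embedding are illustrative assumptions, and practical implementations typically use exact or approximate nearest-neighbor indexes instead of an all-pairs search.

```python
import math


def knn_graph(points, n_neighbors=2):
    """Return, for each point, the indices of its nearest neighbors.

    Brute-force Euclidean search: fine for a toy example, too slow
    for real single-cell data.
    """
    graph = []
    for i, p in enumerate(points):
        dists = [(math.dist(p, q), j) for j, q in enumerate(points) if j != i]
        dists.sort()
        graph.append([j for _, j in dists[:n_neighbors]])
    return graph


# Toy 2-D embedding: three nearby cells and one distant cell.
embedding = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (5.0, 5.0)]
neighbors = knn_graph(embedding, n_neighbors=2)
```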

Doublet removal

pp.scrublet(adata[, features, n_comps, ...])

Compute the probability of each cell being a doublet using the Scrublet algorithm.

pp.filter_doublets(adata[, ...])

Remove doublets according to the doublet probability or doublet score.
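Once each cell has a doublet probability, removal reduces to a threshold filter. The sketch below assumes a hypothetical helper `keep_singlets` and an arbitrary cutoff of 0.5. It illustrates the filtering step only and does not reproduce the library's function or its defaults.

```python
def keep_singlets(cells, doublet_probability, threshold=0.5):
    """Keep cells whose doublet probability does not exceed the threshold."""
    if len(cells) != len(doublet_probability):
        raise ValueError("cells and doublet_probability must have equal length")
    return [c for c, p in zip(cells, doublet_probability) if p <= threshold]


cells = ["AAAC", "TTTG", "GGCA"]
probabilities = [0.05, 0.92, 0.40]
singlets = keep_singlets(cells, probabilities, threshold=0.5)
```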

Data integration

pp.mnc_correct(adata, *, batch[, ...])

A modified MNN-Correct algorithm based on cluster centroids.

pp.harmony(adata, *, batch[, use_rep, ...])

Use harmonypy to integrate different experiments.

pp.scanorama_integrate(adata, *, batch[, ...])

Use Scanorama [Hie19] to integrate different experiments.