Preprocessing: pp

BAM/Fragment file processing

pp.make_fragment_file(bam_file, output_file)

Convert a BAM file to a fragment file.

pp.import_fragments(fragment_file, ...[, ...])

Import fragment files and compute basic QC metrics.
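As an illustration of the kind of per-barcode QC an import step can produce, the sketch below parses a few records in the common 5-column fragment layout (chromosome, start, end, barcode, read count) and tallies fragments per barcode. It is plain Python and does not call `pp.import_fragments`. The helper name `count_fragments_per_barcode` and the sample records are hypothetical, and the metrics the library actually computes may differ.

```python
from collections import Counter

def count_fragments_per_barcode(lines):
    """Tally fragments per cell barcode from tab-separated fragment records.

    Assumes the common 5-column layout: chrom, start, end, barcode, count.
    Hypothetical helper for illustration; not part of the library.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        _chrom, _start, _end, barcode, _count = line.split("\t")[:5]
        counts[barcode] += 1
    return counts

# Made-up sample records
sample = [
    "chr1\t100\t250\tAAAC\t1",
    "chr1\t300\t480\tAAAC\t2",
    "chr2\t50\t190\tTTTG\t1",
]
fragment_counts = count_fragments_per_barcode(sample)
print(dict(fragment_counts))
```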

pp.import_values(input_dir, chrom_sizes, *)

Import values associated with base pairs, typically from experiments like whole-genome bisulfite sequencing (WGBS).

pp.import_contacts(contact_file, chrom_sizes, *)

Import chromatin contacts.

pp.call_cells(data, use_rep[, inplace, n_jobs])

Call cells based on the number of feature counts.

Matrix operations

pp.add_tile_matrix(adata, *[, bin_size, ...])

Generate cell by bin count matrix.
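To make the idea of a cell-by-bin count matrix concrete, here is a minimal sketch in plain Python. It assumes each fragment is assigned to the fixed-width bin containing its start coordinate, which is a simplification chosen for this example and not necessarily how `pp.add_tile_matrix` counts. The function name `tile_counts`, the `bin_size=500` default, and the sample fragments are illustrative assumptions.

```python
from collections import defaultdict

def tile_counts(fragments, bin_size=500):
    """Count fragments per (cell, genomic bin).

    `fragments` is an iterable of (chrom, start, end, barcode) tuples.
    Each fragment is assigned to the bin containing its start coordinate,
    a simplification chosen for this sketch.
    """
    matrix = defaultdict(lambda: defaultdict(int))
    for chrom, start, _end, barcode in fragments:
        bin_start = (start // bin_size) * bin_size
        matrix[barcode][(chrom, bin_start)] += 1
    return matrix

# Made-up fragments for two cells
frags = [
    ("chr1", 120, 260, "AAAC"),
    ("chr1", 430, 600, "AAAC"),
    ("chr1", 510, 700, "AAAC"),
    ("chr1", 90, 200, "TTTG"),
]
tiles = tile_counts(frags, bin_size=500)
print({cell: dict(bins) for cell, bins in tiles.items()})
```

The nested dictionary acts as a sparse matrix: only non-zero (cell, bin) entries are stored, which matters because most bins are empty for any given cell.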

pp.make_peak_matrix(adata, *[, use_rep, ...])

Generate cell by peak count matrix.

pp.make_gene_matrix(adata, gene_anno, *[, ...])

Generate cell by gene activity matrix.

pp.filter_cells(data[, min_counts, ...])

Filter cell outliers based on counts and numbers of genes expressed.

pp.select_features(adata[, n_features, ...])

Perform feature selection by selecting the most accessible features across all cells unless max_iter > 1.
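The "most accessible features" criterion can be sketched as ranking columns of a count matrix by their totals across cells and keeping the top `n_features`. The code below is a plain-Python illustration under that assumption. It does not reproduce the library's implementation or its iterative behavior when max_iter > 1, and the helper name `select_top_features` is hypothetical.

```python
def select_top_features(matrix, n_features):
    """Return sorted column indices of the columns with the largest totals.

    `matrix` is a list of rows (one per cell) of non-negative counts.
    Hypothetical helper for illustration only.
    """
    n_cols = len(matrix[0])
    totals = [sum(row[j] for row in matrix) for j in range(n_cols)]
    ranked = sorted(range(n_cols), key=lambda j: totals[j], reverse=True)
    return sorted(ranked[:n_features])

# Made-up 3 cells x 4 features count matrix
counts = [
    [1, 0, 3, 0],
    [2, 0, 1, 1],
    [0, 0, 4, 0],
]
selected = select_top_features(counts, n_features=2)
print(selected)
```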

pp.knn(adata[, n_neighbors, use_dims, ...])

Compute a neighborhood graph of observations.
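A neighborhood graph links each observation to its nearest neighbors in some embedding. The sketch below builds one by brute force with Euclidean distance in plain Python. It is meant only to show the structure of the result: the function name `knn_graph` and the sample points are assumptions, and the library may use a different distance or search method.

```python
import math

def knn_graph(points, n_neighbors):
    """Brute-force k-nearest-neighbor lists using Euclidean distance.

    Returns a dict mapping each point index to the indices of its
    n_neighbors closest other points, nearest first.
    """
    graph = {}
    for i, p in enumerate(points):
        dists = [(math.dist(p, q), j) for j, q in enumerate(points) if j != i]
        dists.sort()
        graph[i] = [j for _, j in dists[:n_neighbors]]
    return graph

# Two well-separated pairs of made-up 2D points
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
neighbors = knn_graph(points, n_neighbors=1)
print(neighbors)
```

Brute force costs a distance computation for every pair of points, so it is only practical for small inputs. It is used here because it is easy to verify by hand.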

Doublet removal

pp.scrublet(adata[, features, n_comps, ...])

Compute the probability of each cell being a doublet using the Scrublet algorithm.

pp.filter_doublets(adata[, ...])

Remove doublets according to the doublet probability or doublet score.
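Conceptually, removing doublets by score amounts to keeping only the cells whose score does not exceed a cutoff. The following plain-Python sketch shows that thresholding step. The function name `filter_by_score`, the 0.5 cutoff, and the scores are made up for illustration and say nothing about the library's parameters or defaults.

```python
def filter_by_score(scores, threshold):
    """Keep cell IDs whose doublet score is at or below the threshold.

    `scores` maps cell ID to a doublet score or probability.
    Hypothetical helper for illustration only.
    """
    return [cell for cell, score in scores.items() if score <= threshold]

# Made-up scores for three cells
doublet_scores = {"AAAC": 0.05, "TTTG": 0.92, "GGCA": 0.30}
kept = filter_by_score(doublet_scores, threshold=0.5)
print(kept)
```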

Data integration

pp.mnc_correct(adata, *, batch[, ...])

A modified MNN-Correct algorithm based on cluster centroids.

pp.harmony(adata, *, batch[, use_rep, ...])

Use harmonypy to integrate different experiments.

pp.scanorama_integrate(adata, *, batch[, ...])

Use Scanorama [Hie19] to integrate different experiments.