Datasets

These functions facilitate the download of public datasets and auxiliary data used in the SnapATAC2 package.

Note

You can change the data cache directory by setting the SNAP_DATA_DIR environment variable.
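The note above can be sketched from Python as follows. This is a minimal illustration that assumes the variable is read when SnapATAC2 sets up its cache, so it is set before the package is imported. The directory name `snapatac2_cache` is a placeholder, not a default used by the package.

```python
import os
import tempfile
from pathlib import Path

# Hypothetical cache location; replace with a directory of your choice.
cache_dir = Path(tempfile.gettempdir()) / "snapatac2_cache"
cache_dir.mkdir(parents=True, exist_ok=True)

# Set the variable before importing snapatac2 (assumption: it is read
# when the package resolves its data cache location).
os.environ["SNAP_DATA_DIR"] = str(cache_dir)

# import snapatac2 as snap  # later downloads are expected to use cache_dir
```

Setting the variable in the shell before launching Python has the same effect and avoids any dependence on import order.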

Genomes

genome.Genome(*, fasta, annotation[, ...])

A class that encapsulates information about a genome, including its FASTA sequence, its annotation, and chromosome sizes.

genome.GRCh37

Genome object for the human GRCh37 assembly, including its FASTA sequence, annotation, and chromosome sizes.

genome.GRCh38

Genome object for the human GRCh38 assembly, including its FASTA sequence, annotation, and chromosome sizes.

genome.GRCm38

Genome object for the mouse GRCm38 assembly, including its FASTA sequence, annotation, and chromosome sizes.

genome.GRCm39

Genome object for the mouse GRCm39 assembly, including its FASTA sequence, annotation, and chromosome sizes.

genome.hg19

Genome object for the human hg19 assembly (the UCSC name corresponding to GRCh37), including its FASTA sequence, annotation, and chromosome sizes.

genome.hg38

Genome object for the human hg38 assembly (the UCSC name corresponding to GRCh38), including its FASTA sequence, annotation, and chromosome sizes.

genome.mm10

Genome object for the mouse mm10 assembly (the UCSC name corresponding to GRCm38), including its FASTA sequence, annotation, and chromosome sizes.

genome.mm39

Genome object for the mouse mm39 assembly (the UCSC name corresponding to GRCm39), including its FASTA sequence, annotation, and chromosome sizes.
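For an assembly that is not listed above, a genome.Genome can be built from local files. The sketch below relies only on the keyword-only `fasta` and `annotation` arguments shown in the signature, and assumes both accept a local file path. The helper name `make_custom_genome` and the file names are hypothetical.

```python
from pathlib import Path


def make_custom_genome(fasta_path, annotation_path):
    """Build a Genome from local files (hypothetical helper)."""
    fasta, annotation = Path(fasta_path), Path(annotation_path)

    # Fail early with a clear error if either input file is missing.
    for path in (fasta, annotation):
        if not path.is_file():
            raise FileNotFoundError(path)

    # Imported lazily so the helper can be defined without SnapATAC2 installed.
    import snapatac2 as snap

    # Assumption: both keyword-only arguments accept a local file path.
    return snap.genome.Genome(fasta=fasta, annotation=annotation)


# Example call (placeholder file names):
# genome = make_custom_genome("my_assembly.fa", "my_annotation.gtf")
```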

Motifs

datasets.cis_bp([unique])

A list of transcription factor motifs curated by the CIS-BP database.

datasets.Meuleman_2020()

A list of transcription factor motifs curated from [Meuleman20].

Raw data

datasets.pbmc500([type, downsample])

scATAC-seq dataset of 500 PBMCs from 10x Genomics.

datasets.pbmc5k([type])

scATAC-seq dataset of 5k PBMCs from 10x Genomics.

datasets.pbmc10k_multiome([modality, type])

Single-cell multiome dataset of 10k PBMCs from 10x Genomics.

datasets.colon()

scATAC-seq datasets of five transverse colon samples from [Zhang21].

datasets.cre_HEA()

Curated cis-regulatory elements from [Zhang21].
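As a rough illustration of how one of the raw datasets above might be retrieved, the sketch below calls datasets.pbmc5k with its default arguments. It assumes the call downloads the data on first use and reuses the cached copy afterwards. The wrapper name `fetch_pbmc5k` is hypothetical, and the optional `type` argument is left unset because its accepted values are not listed here.

```python
def fetch_pbmc5k():
    """Fetch the 5k PBMC scATAC-seq dataset (hypothetical wrapper)."""
    # Imported lazily so defining the wrapper does not require SnapATAC2
    # or trigger any download.
    import snapatac2 as snap

    # Default arguments only; the result is returned unchanged.
    return snap.datasets.pbmc5k()


# Example call (requires SnapATAC2 and network access on first use):
# data = fetch_pbmc5k()
```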