The topological organisation of nucleosomes along the genome is not uniform; it is highly regulated and dynamic. Some parts of the genome are densely occupied by nucleosomes and therefore closed and inaccessible, while others are depleted of histones, leaving them exposed to RNA polymerase II and transcription factors. This organisation plays a crucial role in regulating gene expression and defining cell types and states.
Many methods have therefore been developed to determine which parts of the DNA are open and accessible and which are not. ATAC-seq, developed in 2013, is one of these methods and has gained popularity owing to its speed, simplicity, clarity, and low input-material requirement. Two years later, single-cell ATAC-seq protocols were published in Science and Nature, making it possible to study heterogeneous cell populations.
Experimentally, scATAC-seq is fairly straightforward; its data processing and interpretation, however, are complex. Preprocessing and analysis of single-cell ATAC-seq data require strict quality control, correction for batch effects, and handling of sequencing errors. Aggregating the data into features, for example with peak-based or motif-based approaches, helps reduce noise and dimensionality.
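As an illustration of peak-based aggregation, the sketch below counts fragments per cell barcode in each peak and derives one widely used QC metric, the fraction of fragments in peaks (FRiP). It is a minimal example under stated assumptions, not the implementation used by any particular tool: the function name `aggregate_by_peak`, the tuple layouts, half-open coordinates, and non-overlapping peaks are all assumptions made for this example.

```python
from bisect import bisect_right
from collections import defaultdict


def aggregate_by_peak(fragments, peaks):
    """Build a sparse cell-by-peak count table and per-cell FRiP.

    fragments: iterable of (chrom, start, end, barcode), half-open coordinates.
    peaks:     list of (chrom, start, end), assumed non-overlapping.
    Returns (counts, frip): counts[barcode][peak_index] -> fragment count,
    frip[barcode] -> fraction of that cell's fragments overlapping any peak.
    """
    # Index peaks per chromosome, sorted by start, for binary search.
    by_chrom = defaultdict(list)
    for idx, (chrom, start, end) in enumerate(peaks):
        by_chrom[chrom].append((start, end, idx))
    for chrom_peaks in by_chrom.values():
        chrom_peaks.sort()
    starts = {c: [p[0] for p in v] for c, v in by_chrom.items()}

    counts = defaultdict(lambda: defaultdict(int))
    total = defaultdict(int)
    in_peaks = defaultdict(int)

    for chrom, start, end, barcode in fragments:
        total[barcode] += 1
        chrom_peaks = by_chrom.get(chrom)
        if not chrom_peaks:
            continue
        # Last peak whose start lies before the fragment end.
        i = bisect_right(starts[chrom], end - 1) - 1
        hit = False
        # Peaks are sorted and non-overlapping, so walk left until one
        # ends at or before the fragment start.
        while i >= 0 and chrom_peaks[i][1] > start:
            counts[barcode][chrom_peaks[i][2]] += 1
            hit = True
            i -= 1
        if hit:
            in_peaks[barcode] += 1

    frip = {bc: in_peaks[bc] / n for bc, n in total.items()}
    return {bc: dict(row) for bc, row in counts.items()}, frip
```

Cells with a very low FRiP are typically candidates for removal during quality control, although the appropriate cut-off depends on the dataset.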
Analysing the data at several levels (peak, gene, and motif) is essential for understanding differential accessibility and gene regulatory networks. This involves peak calling, gene activity estimation, and motif enrichment analysis.
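Gene activity estimation can be sketched in a simplified form as counting, for each cell, the fragments that overlap a gene body extended by an upstream promoter window. The example below is a hedged illustration only: the function name `gene_activity`, the 2 kb default window, and the tuple layouts are assumptions for this sketch, and published tools use more elaborate models (for example distance weighting).

```python
def gene_activity(fragments, genes, upstream=2000):
    """Count fragments per cell overlapping each gene body plus promoter.

    fragments: iterable of (chrom, start, end, barcode), half-open coordinates.
    genes:     list of (name, chrom, start, end, strand), strand "+" or "-".
    upstream:  promoter window in bp added on the 5' side (assumed value).
    Returns scores[barcode][gene_name] -> fragment count.
    """
    # Extend each gene on its 5' side, which depends on the strand.
    windows = []
    for name, chrom, start, end, strand in genes:
        if strand == "+":
            wstart, wend = max(0, start - upstream), end
        else:
            wstart, wend = start, end + upstream
        windows.append((name, chrom, wstart, wend))

    scores = {}
    for chrom, fstart, fend, barcode in fragments:
        for name, gchrom, wstart, wend in windows:
            # Half-open interval overlap test.
            if chrom == gchrom and fstart < wend and fend > wstart:
                cell = scores.setdefault(barcode, {})
                cell[name] = cell.get(name, 0) + 1
    return scores
```

The nested loop keeps the sketch short; for real data an interval index would be needed to avoid comparing every fragment with every gene.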
This presentation outlines a new computational tool, scATACpipe, which integrates multiple analysis methods so that users can perform end-to-end analysis without extensive software management. The tool is modular, cross-platform, and supports any organism with an annotated genome, enhancing accessibility and usability for researchers. Users are provided with an interactive HTML report with comprehensive information on FASTQ data, QC, cells, and doublets.