CATCH-UP: A High-Throughput Upstream-Pipeline for Bulk ATAC-Seq and ChIP-Seq Data.
Riva SG., Georgiades E., Gur ER., Baxter M., Hughes JR.
Assay for transposase-accessible chromatin (ATAC) and chromatin immunoprecipitation (ChIP), coupled with next-generation sequencing (NGS), have revolutionized the study of gene regulation. A lack of standardization in the analysis of the highly dimensional datasets generated by these techniques has made reproducibility difficult to achieve, leading to discrepancies in the published, processed data. Part of this problem is due to the diverse range of bioinformatic tools available for the analysis of these types of data. Secondly, a number of different bioinformatic tools are required sequentially to convert raw data into a fully processed and interpretable output, and these tools require varying levels of computational skills. Furthermore, there are many options for quality control that are not uniformly employed during data processing. We address these issues with a complete assay for transposase-accessible chromatin sequencing (ATAC-seq) and chromatin immunoprecipitation sequencing (ChIP-seq) upstream pipeline (CATCH-UP), an easy-to-use, Python-based pipeline for the analysis of bulk ChIP-seq and ATAC-seq datasets from raw fastq files to visualizable bigwig tracks and peaks calls. This pipeline is simple to install and run, requiring minimal computational knowledge. The pipeline is modular, scalable, and parallelizable on various computing infrastructures, allowing for easy reporting of methodology to enable reproducible analysis of novel or published datasets.