CoRE-ATAC: A Deep Learning model for the functional Classification of Regulatory Elements from single cell and bulk ATAC-seq data
2020
Cis-regulatory elements (cis-REs) include promoters, enhancers, and insulators that regulate gene expression programs via binding of transcription factors. ATAC-seq technology effectively identifies active cis-REs in a given cell type (including from single cells) by mapping accessible chromatin at base-pair resolution. However, these maps are not immediately useful for inferring the specific functions of cis-REs. For this purpose, we developed a deep learning framework (CoRE-ATAC) with novel data encoders that integrate DNA sequence (reference or personal genotypes) and ATAC-seq read pileups. CoRE-ATAC was trained on 4 cell types (n=6 samples/replicates) and accurately predicted known cis-RE functions from 7 cell types (n=40 samples) that were not used in model training (average precision=0.80). CoRE-ATAC enhancer predictions from 19 human islets coincided with genetically modulated gain/loss of enhancer activity, which was confirmed by massively parallel reporter assays (MPRAs). Finally, CoRE-ATAC effectively inferred the functionality of cis-REs from aggregate single nucleus ATAC-seq (snATAC-seq) data from human blood-derived immune cells, and these inferences overlapped well with known functional annotations in sorted immune cells. Performance on snATAC-seq data demonstrates CoRE-ATAC's ability to infer cis-RE function in rare cell populations that can be identified by unsupervised clustering of snATAC-seq cells but are difficult to capture in bulk ATAC-seq. ATAC-seq maps from primary human cells reveal individual- and cell-specific variation in cis-RE activity. CoRE-ATAC increases the functional resolution of these maps, a critical step for studying regulatory disruptions behind diseases.
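To make the idea of integrating DNA sequence with ATAC-seq read pileups concrete, the following is a minimal sketch of one way such inputs could be combined into a single feature matrix for a genomic window. It is not the CoRE-ATAC encoder itself: the function names (`one_hot`, `pileup`, `encode_window`), the 4+1 channel layout, and the window-relative half-open read coordinates are all assumptions made for illustration.

```python
# Hypothetical illustration only; the actual CoRE-ATAC encoders are not
# described in this abstract and may differ substantially.
BASES = "ACGT"


def one_hot(seq):
    """Return a 4 x len(seq) matrix; unknown bases (e.g. N) stay all-zero."""
    rows = [[0.0] * len(seq) for _ in BASES]
    for pos, base in enumerate(seq.upper()):
        idx = BASES.find(base)
        if idx >= 0:
            rows[idx][pos] = 1.0
    return rows


def pileup(length, reads):
    """Count reads covering each position.

    Reads are (start, end) pairs, half-open and relative to the window
    (an assumed convention); parts outside the window are clipped.
    """
    cov = [0.0] * length
    for start, end in reads:
        for pos in range(max(start, 0), min(end, length)):
            cov[pos] += 1.0
    return cov


def encode_window(seq, reads):
    """Stack 4 sequence channels and 1 coverage channel into a 5 x L matrix."""
    return one_hot(seq) + [pileup(len(seq), reads)]


# Toy window of 5 bases with two overlapping reads.
features = encode_window("ACGTN", [(0, 3), (2, 5)])
```

In this sketch a personal genotype could be represented simply by passing the individual's sequence for the window instead of the reference sequence; how CoRE-ATAC handles genotypes in practice is not specified here.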