Identifying topologically associating domains and subdomains by Gaussian Mixture model And Proportion test

2017 
The spatial organization of the genome plays a critical role in regulating gene expression. Recent chromatin interaction mapping studies have revealed that topologically associating domains and subdomains are fundamental building blocks of the three-dimensional genome. Identifying such hierarchical structures is a critical step toward understanding the three-dimensional structure–function relationship of the genome. Existing computational algorithms lack statistical assessment of domain predictions and are computationally inefficient for high-resolution Hi-C data. We introduce the Gaussian Mixture model And Proportion test (GMAP) algorithm to address both limitations. Using simulated and experimental Hi-C data, we show that domains identified by GMAP are more consistent with multiple lines of supporting evidence than those identified by three state-of-the-art methods. Application of GMAP to normal and cancer cells reveals several features that distinguish subdomain boundaries from domain boundaries, including that subdomain boundaries are more dynamic across cell types and are enriched for somatic mutations in cancer.
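The abstract names two statistical ingredients, a Gaussian mixture model and a proportion test, but does not say how GMAP combines them. The sketch below is not the GMAP procedure; it only illustrates the two generic tools, using Python's standard library. The function names `fit_two_gaussians` and `two_proportion_z_test` and the toy numbers are hypothetical and chosen for illustration.

```python
import math

def fit_two_gaussians(values, n_iter=50):
    """Fit a two-component 1D Gaussian mixture by EM (illustrative only).

    Returns (weights, means, variances). Initialisation from the data's
    min and max is an assumption made to keep the sketch deterministic.
    """
    lo, hi = min(values), max(values)
    means = [lo, hi]
    spread = max((hi - lo) ** 2 / 4.0, 1e-6)
    variances = [spread, spread]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each value.
        resp = []
        for x in values:
            dens = [
                w * math.exp(-((x - m) ** 2) / (2 * v)) / math.sqrt(2 * math.pi * v)
                for w, m, v in zip(weights, means, variances)
            ]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean and variance of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # component has collapsed; leave it unchanged
            weights[k] = nk / len(values)
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, values)) / nk
            variances[k] = max(var, 1e-6)
    return weights, means, variances

def two_proportion_z_test(hits_a, n_a, hits_b, n_b):
    """Two-sided pooled z-test for equality of two proportions.

    Returns (z, p_value).
    """
    p_a, p_b = hits_a / n_a, hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0
    z = (p_a - p_b) / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    return z, math.erfc(abs(z) / math.sqrt(2))

# Toy data (hypothetical): a low-signal group and a high-signal group.
signal = [0.1, 0.2, 0.15, 0.05, 0.12, 2.0, 2.1, 1.9, 2.2, 2.05]
weights, means, variances = fit_two_gaussians(signal)
z, p_value = two_proportion_z_test(80, 100, 20, 100)
```

In this toy setting the mixture separates the values into two groups, and the proportion test attaches a p-value to a difference between two observed fractions. The second step is the kind of statistical assessment the abstract says existing algorithms lack.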