Tree-Based Co-Clustering Identifies Chromatin Accessibility Patterns Associated With Hematopoietic Lineage Structure

2021 
Chromatin accessibility, as measured by ATACseq, varies between hematopoietic cell types in different lineages of the hematopoietic differentiation tree, e.g. T cells vs B cells, but methods that associate variation in chromatin accessibility to the lineage structure of the differentiation tree are lacking. Using an ATACseq dataset recently published by the ImmGen consortium, we construct associations between chromatin accessibility and hematopoietic cell types using a novel co-clustering approach that accounts for the structure of the hematopoietic, differentiation tree. Under a model in which all loci and cell types within a co-cluster have a shared accessibility state, we show that roughly $80\%$ of cell type associated accessibility variation can be captured through $12$ cell type clusters and $20$ genomic locus clusters, with the cell type clusters reflecting coherent components of the differentiation tree. Using publicly available ChIPseq datasets, we show that our clustering reflects transcription factor binding patterns with implications for regulation across cell types. \revision{We show that traditional methods such as hierarchical and kmeans clusterings lead to cell type clusters that are more dispersed on the tree than our tree-based algorithm. We provide a python package, \textbf{chromcocluster}, that implements the algorithms presented.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    45
    References
    0
    Citations
    NaN
    KQI
    []