    Genomics and the Human Genome Project are creating important changes in medical care and radiologic practice. Readers of this article will: Recall that deoxyribonucleic acid (DNA) is the information-containing molecule. Understand the relationship between DNA and protein synthesis. Learn that changes in DNA-encoded proteins produce pathogenesis. Distinguish between genomics and proteomics. Describe the Human Genome Project. Define and give examples of molecular medicine and its imaging applications. Discuss how genomics, genetic engineering and the Human Genome Project may influence radiologic practice.
    Genome Biology
    Personal genomics
    Structural genomics
    Citations (2)
    Background: Super-enhancers are clusters of active enhancers, densely occupied by Mediator, transcription factors and chromatin regulators, that control expression of cell identity and disease-associated genes. Current studies have demonstrated that multiple factors may play important roles in super-enhancer formation; however, a systematic analysis to assess the relative contribution of chromatin and sequence features of super-enhancers and their constituents is lacking. In addition, a predictive model that integrates various types of data to predict super-enhancers has not been established. Results: Here, we integrated diverse types of genomic and epigenomic datasets to identify key signatures of super-enhancers and their constituents and to investigate their relative contribution. Through computational modelling, we identified Cdk8, Cdk9 and Smad3 as new key features of super-enhancers, along with many known features. Comprehensive analysis of these features in embryonic stem cells and pro-B cells revealed their role in super-enhancer formation and cellular identity. Further, we observed significant correlation and combinatorial predictive ability among many cofactors at the constituents of super-enhancers. By utilizing these features, we developed computational models that can accurately predict super-enhancers and their constituents. We validated these models using cross-validation and independent datasets in four human cell types. Conclusions: Our analysis of these features and prediction models can serve as a resource to further characterize and understand the formation of super-enhancers. Taken together, our results also suggest possible cooperative and synergistic interactions of numerous factors at super-enhancers and their constituents. We have made our analysis pipeline available as an open-source tool with a command line interface at https://github.com/asntech/improse.
    Epigenomics
    Citations (1)
    Chromatin contacts between regulatory elements are of crucial importance for the interpretation of transcriptional regulation and the understanding of disease mechanisms. However, existing computational methods mainly focus on the prediction of interactions between enhancers and promoters, leaving enhancer-enhancer (E-E) interactions largely unexplored. In this work, we develop a novel deep learning approach, named Enhancer-enhancer contacts prediction (EnContact), to predict E-E contacts using genomic sequences as input. We statistically demonstrate the predictive ability of EnContact using training sets and testing sets derived from HiChIP data of seven cell lines. We also show that our model significantly outperforms other baseline methods. Our model also identifies finer-scale E-E interactions from region-based chromatin contacts, where each region contains several enhancers. In addition, we identify a class of hub enhancers using the predicted E-E interactions and find that hub enhancers tend to be active across cell lines. We conclude that our EnContact model is capable of predicting E-E interactions using features automatically learned from genomic sequences.
    Sequence (biology)
    Citations (7)
    Functional Genomics
    Personal genomics
    Human genetic variation
    1000 Genomes Project
    Structural genomics
    Functional Genomics
    Structural Biology
    Genome Biology
    Computational genomics
    Non-coding gene regulatory enhancers are essential to transcription in mammalian cells. As a result, a large variety of experimental and computational strategies have been developed to identify cis-regulatory enhancer sequences. In practice, most studies consider enhancers identified by only a single method, and the concordance of enhancers identified by different methods has not been comprehensively evaluated. Here, we assess the similarities of enhancer sets identified by ten representative strategies in four biological contexts and evaluate the robustness of downstream conclusions to the choice of identification strategy. All pairs of enhancer sets we evaluated overlap significantly more than expected by chance; however, we also found significant dissimilarity between enhancer sets in their genomic characteristics, evolutionary conservation, and association with functional loci within each context. We find most regions identified as enhancers are supported by only one method. The disagreement is sufficient to influence interpretation of GWAS SNPs and eQTL, and to lead to disparate conclusions about enhancer biology and disease mechanisms. We also find only limited evidence that regions identified by multiple enhancer identification methods are better candidates than those identified by a single method. Our results highlight the inherent complexity of enhancer biology and argue that current approaches have yet to adequately account for enhancer diversity. As a result, we cannot recommend the use of any single enhancer identification strategy in isolation. To facilitate assessment of enhancer diversity on studies’ conclusions, we developed creDB, a database of enhancer annotations designed to integrate into bioinformatics workflows.
While our findings highlight a major challenge to mapping the genetic architecture of complex disease and interpreting regulatory variants found in patient genomes, a systematic understanding of similarities and differences in enhancer identification methodology will ultimately enable robust inferences about gene regulatory sequences.
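The abstract above compares enhancer sets by how much they overlap, but it does not say which similarity measure was used. As an illustration only, the sketch below computes a base-pair Jaccard index between two interval sets. The function names, the half-open `(chromosome, start, end)` representation and the example coordinates are all assumptions made for this sketch, not details from the study, and it does not attempt the "more than expected by chance" significance test.

```python
from typing import List, Tuple

# (chromosome, start, end) with half-open coordinates; an assumed representation.
Interval = Tuple[str, int, int]


def merge(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals within each chromosome."""
    merged: List[Interval] = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


def total_bp(intervals: List[Interval]) -> int:
    """Total number of base pairs covered by non-overlapping intervals."""
    return sum(end - start for _, start, end in intervals)


def jaccard_bp(a: List[Interval], b: List[Interval]) -> float:
    """Base-pair Jaccard index: shared bases divided by bases covered by either set."""
    a_merged, b_merged = merge(a), merge(b)
    union = total_bp(merge(a_merged + b_merged))
    # Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|
    intersection = total_bp(a_merged) + total_bp(b_merged) - union
    return intersection / union if union else 0.0


# Hypothetical enhancer calls from two identification methods.
method_a = [("chr1", 100, 200), ("chr1", 150, 300)]
method_b = [("chr1", 250, 400), ("chr2", 0, 50)]
similarity = jaccard_bp(method_a, method_b)
```

A value near 0 would mean the two methods call almost entirely different regions, while a value near 1 would mean they largely agree. In practice a dedicated interval library would likely be preferable for genome-scale data.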
    Genome-wide Association Study
    Citations (6)
    The high degree of similarity between the mouse and human genomes is demonstrated through analysis of the sequence of mouse chromosome 16 (Mmu 16), which was obtained as part of a whole-genome shotgun assembly of the mouse genome. The mouse genome is about 10% smaller than the human genome, owing to a lower repetitive DNA content. Comparison of the structure and protein-coding potential of Mmu 16 with that of the homologous segments of the human genome identifies regions of conserved synteny with human chromosomes (Hsa) 3, 8, 12, 16, 21, and 22. Gene content and order are highly conserved between Mmu 16 and the syntenic blocks of the human genome. Of the 731 predicted genes on Mmu 16, 509 align with orthologs on the corresponding portions of the human genome, 44 are likely paralogous to these genes, and 164 genes have homologs elsewhere in the human genome; there are 14 genes for which we could find no human counterpart.
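The gene counts reported above can be checked with a short tally. The sketch below uses only the numbers stated in the abstract; the category labels are paraphrases chosen for this example.

```python
# Counts as stated in the abstract for the 731 predicted genes on Mmu 16.
predicted_genes = 731
categories = {
    "orthologs in corresponding human regions": 509,
    "likely paralogs of those genes": 44,
    "homologs elsewhere in the human genome": 164,
    "no human counterpart found": 14,
}

# The four categories should account for every predicted gene.
total = sum(categories.values())

# Share of predicted genes in each category.
fractions = {name: count / predicted_genes for name, count in categories.items()}

for name, share in fractions.items():
    print(f"{name}: {categories[name]} genes ({share:.1%})")
```

The four categories sum to 731, so they account for every predicted gene. About 70% of the genes have orthologs in the corresponding human regions, and only about 2% have no identified human counterpart.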
    Synteny
    Gene density
    Citations (375)