Evolutionary-based methods for predicting genotype-phenotype associations in the mammalian genome

2019 
Phenotypic and genotypic variation between species is the result of millions of experiments performed by nature. Understanding why and how phenotypic complexity arises is a central goal of evolutionary biology. Technological advances enabling whole-genome sequencing have laid the foundation for comparative genomics-based tools that infer the genetic elements underlying phenotypic adaptations. This thesis develops such tools, drawing on principles of convergent evolution, with the aim of generating specific functional hypotheses that can help focus experimental efforts. These tools are relevant for characterizing the context-specific functions of cis-regulatory elements as well as protein-coding genes, many of which lack functional annotation beyond domain homology. Moving beyond one-dimensional approaches that study proteins in isolation, we propose to build an integrated co-evolutionary framework that will serve as a tool for protein interaction prediction. In this dissertation, we discuss these ideas through the following three projects. In chapter 1, we perform a genome-wide scan for genes showing convergent rate changes in four subterranean mammals, and study the changes in selective pressure that cause these convergent rate shifts. Using a new variant of our rates-based method, we demonstrate that eye-specific regulatory regions show strong rate accelerations in the subterranean mammals. This study demonstrates the potential of convergent evolution-based tools for the functional annotation of eye-specific genetic elements. In chapter 2, we build a robust method to infer rate shifts associated with a wide range of evolutionary scenarios. We investigate the statistical underpinnings of our rates-based framework and identify the best-performing variant of our method across real and simulated phylogenetic datasets. We distribute these tools to the research community, enabling large-scale generation of specific functional hypotheses for regulatory regions. In chapter 3, we propose to construct a framework for protein interaction prediction that integrates proteome-wide co-evolutionary signatures. We systematically benchmark the predictions of this co-evolutionary framework against known functional interactions among proteins across various scales. We make the framework's predictions publicly available to aid the functional annotation of less well-characterized genes.