Gene Embeddings of Complex network (GECo) and hypertension disease gene classification

2020 
Diseases such as hypertension, cancer, and diabetes cause nearly 70% of deaths in the U.S. Such complex diseases involve multiple genes and their interactions with environmental factors. Identifying the genetic factors behind these diseases, in order to understand them and reduce their morbidity rates, is therefore an important and challenging task. With the generation of an unprecedented amount of multi-omics data, network-based methods have become popular for representing multilayered, complex molecular interactions. In particular, network embeddings, the low-dimensional representations of nodes in a network, are used for gene function prediction. Most network embedding methods, however, cannot integrate multiple types of datasets from genes and phenotypes. This is an important limitation, because multi-omics data integration alleviates the problems caused by missing data and a lack of context-specific data. To address this limitation, we developed a network embedding algorithm named GECo that can utilize multilayered heterogeneous networks of genes and phenotypes. We evaluated the performance of GECo using genotypic and phenotypic datasets of the model organism Rattus norvegicus to classify hypertension disease-related genes. Our method significantly outperformed state-of-the-art network embedding methods, achieving 94.97% AUC in prediction, whereas the second-best performer achieved 85.98% AUC.