Gene Presence and Absence in Genomic Big Data for Precision Medicine

2018 
Twenty-first-century precision medicine aims to use a systems-oriented approach, including molecular pathology tests, to find the root cause of disease specific to an individual. The challenges of genomic data analysis for precision medicine are multifold: they combine big data, high dimensionality, and often multimodal distributions. Advanced investigations use techniques such as Next Generation Sequencing (NGS), which rely on complex statistical methods to yield useful insights. Analysis of exome and transcriptome data allows for in-depth study of the 22 thousand genes in the human body, many of which relate to phenotype and disease state. Not all genes are expressed in all tissues, and in a disease state some genes are even deleted from the genome. Therefore, as part of knowledge discovery, exome and transcriptome big data needs to be analyzed to determine whether a gene is actually absent (deleted or not expressed) or present. In this paper, we present a statistical technique to identify the genes that are present or absent in exome or transcriptome data (big data) to improve accuracy in precision medicine.