A convex optimization framework for gene-level tissue network estimation with missing data and its application in understanding disease architecture

2020 
Genes with correlated expression across individuals in multiple tissues are potentially informative for systemic genetic activity spanning these tissues. In this context, the tissue-level gene expression data across multiple subjects from the Genotype-Tissue Expression (GTEx) Project is a valuable analytical resource. Unfortunately, the GTEx data is fraught with missing entries because subjects often contribute only a subset of tissues. In such a scenario, standard techniques for correlation matrix estimation, with or without data imputation, do not perform well. Here we propose Robocov, a novel convex optimization-based framework for robustly learning sparse covariance or inverse covariance matrices for missing data problems. Robocov produces a more interpretable and less cluttered visual representation of correlation and causal structure in both simulation settings and GTEx data analysis. Simulation experiments also show that Robocov estimators have a lower false positive rate than competing approaches for missing data problems. Genes prioritized based on the average value of Robocov correlations or partial correlations across tissues are enriched for pathways related to systemic activities such as signaling pathways, heat stress factor, immune function and circadian clock. Furthermore, SNPs linked to these prioritized genes provide unique signal for blood-related traits; in comparison, no disease signal is observed for SNPs linked to genes prioritized by the standard correlation estimator. Robocov is an important stand-alone statistical tool for sparse correlation and causal network estimation for data with missing entries, and when applied to the GTEx data it provides insights into both genetic and autoimmune disease architectures.