Prioritization of Schizophrenia Risk Genes by a Network-Regularized Logistic Regression Method

2016 
Schizophrenia (SCZ) is a severe mental disorder with a large genetic component. Although recent large-scale microarray- and sequencing-based genome-wide association studies have made significant progress in finding SCZ risk variants and genes of subtle effect, those studies did not consider the interactions among them. Using a protein-protein interaction network both in our regression model and to generate an SCZ gene subnetwork, we developed an analytical framework based on Logit-Lapnet, the graphical Laplacian-regularized logistic regression, to detect SCZ gene subnetworks from whole exome sequencing (WES) data. Logit-Lapnet prioritizes genes according to their regression coefficients, and the top-ranked genes serve as seeds for generating the gene subnetwork associated with SCZ. Using simulated data from a sequencing-based association study, we compared the performance of Logit-Lapnet with that of other logistic regression (LR)-based models. The comparison demonstrated not only the applicability of Logit-Lapnet to sequencing-based association data but also its better performance in scoring disease risk genes. We applied our method to SCZ WES data and selected top-ranked risk genes, the majority of which are either known SCZ genes or genes potentially associated with SCZ. We then used these genes as seeds to construct SCZ gene subnetworks. These results demonstrate that, by ranking genes according to their disease contributions, our method scores and thus prioritizes disease risk genes for further investigation. An implementation of our approach in MATLAB is freely available for download at: http://zdzlab.einstein.yu.edu/1/publications/LapNet-MATLAB.zip.
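
The general idea of a network-regularized logistic regression can be illustrated with a minimal sketch. It is not the authors' MATLAB implementation: it assumes an unnormalized graph Laplacian, a single smoothness penalty of the form lam * beta^T L beta, and plain gradient descent, and it omits any further penalty terms the published method may use. The function names, the toy data, and the gene labels (GENE_A to GENE_D) are hypothetical.

```python
import math


def graph_laplacian(n, edges):
    """Unnormalized Laplacian L = D - A of an undirected graph on n nodes."""
    L = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        L[i][i] += 1.0
        L[j][j] += 1.0
        L[i][j] -= 1.0
        L[j][i] -= 1.0
    return L


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_laplacian_logistic(X, y, L, lam=0.1, lr=0.1, n_iter=2000):
    """Minimize mean logistic loss + lam * beta^T L beta by gradient descent.

    The penalty pulls the coefficients of connected genes toward each other.
    Returns (intercept, beta).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    intercept = 0.0
    for _ in range(n_iter):
        grad = [0.0] * p
        grad0 = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(intercept + sum(b * x for b, x in zip(beta, xi))) - yi
            grad0 += err / n
            for k in range(p):
                grad[k] += err * xi[k] / n
        # Gradient of lam * beta^T L beta is 2 * lam * L beta (L is symmetric).
        for k in range(p):
            grad[k] += 2.0 * lam * sum(L[k][m] * beta[m] for m in range(p))
        beta = [b - lr * g for b, g in zip(beta, grad)]
        intercept -= lr * grad0
    return intercept, beta


def rank_genes(names, beta):
    """Gene names sorted by decreasing absolute coefficient."""
    pairs = sorted(zip(names, beta), key=lambda t: abs(t[1]), reverse=True)
    return [name for name, _ in pairs]


# Toy example (hypothetical): rows are samples, columns are per-gene
# variant indicators; a sample is a case when GENE_A or GENE_B is hit.
names = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
X = [
    [1, 1, 0, 0], [1, 0, 0, 1], [0, 1, 1, 0], [1, 1, 1, 0],
    [0, 0, 1, 0], [0, 0, 0, 1], [0, 0, 1, 1], [0, 0, 0, 0],
]
y = [1, 1, 1, 1, 0, 0, 0, 0]
# Hypothetical interaction network: A-B and C-D are linked.
L = graph_laplacian(4, [(0, 1), (2, 3)])

intercept, beta = fit_laplacian_logistic(X, y, L)
seeds = rank_genes(names, beta)[:2]  # top-ranked genes used as seeds
print(seeds)
```

In this sketch the two top-ranked genes would play the role of seeds; expanding them into a subnetwork over the interaction graph is a separate step that is not shown here.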