Combining Sparse Group Lasso and Linear Mixed Model Improves Power to Detect Genetic Variants Underlying Quantitative Traits

2019 
Genome-wide association studies (GWAS), which test one single nucleotide polymorphism (SNP, a variation in a single nucleotide that occurs at a specific position in the genome) at a time, have revolutionized our understanding of the genetics of complex traits. GWAS must account for confounding due to population structure and, because complex quantitative traits are polygenic, should consider groups of SNPs simultaneously. In this paper, we propose a new approach, called SGL-LMM, that combines the sparse group lasso (SGL) and the linear mixed model (LMM) for multivariate association analysis of quantitative traits. The LMM is a popular approach for dealing with confounding, and SGL is a well-known method for maintaining sparsity in a multivariate regression model. We first set the fixed effects to zero and learn the parameters of the random effects using the LMM, and then we estimate the fixed effects with SGL regularization. Furthermore, we developed efficient algorithms for tuning the hyperparameters and for feature selection by stability selection. SGL-LMM not only benefits from LMM and SGL for correcting population structure and for obtaining a sparse solution, respectively, but also provides a natural way of imposing prior biological information on the model through group structure. SGL-LMM is feasible for GWAS in humans and other organisms. In experiments with both simulated and real-world data, our method outperformed previous approaches in both the power to detect associations and the accuracy of phenotype prediction.
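The two-stage procedure described in the abstract can be illustrated with a minimal sketch. The abstract does not give implementation details, so everything below is an assumption made for illustration: the kinship matrix K built from the genotypes, the grid-search maximum likelihood estimate of the variance ratio `delta`, the whitening rotation, the proximal gradient solver, the sqrt(group size) penalty weights, and all function names (`fit_null_lmm`, `whiten`, `sgl_prox`, `fit_sgl`) are hypothetical and are not taken from the authors' code.

```python
import numpy as np

def fit_null_lmm(y, K, log_delta_grid=np.linspace(-5, 5, 41)):
    """Stage 1 (assumed form): with fixed effects set to zero, choose
    delta = sigma_e^2 / sigma_g^2 by maximum likelihood over a grid."""
    S, U = np.linalg.eigh(K)            # K = U diag(S) U^T
    S = np.clip(S, 0.0, None)           # guard against tiny negative eigenvalues
    Uy, n = U.T @ y, len(y)
    best_nll, best_delta = np.inf, None
    for log_delta in log_delta_grid:
        d = S + np.exp(log_delta)
        sigma_g2 = np.mean(Uy**2 / d)   # genetic variance profiled out in closed form
        nll = 0.5 * (n * np.log(2 * np.pi * sigma_g2) + np.sum(np.log(d)) + n)
        if nll < best_nll:
            best_nll, best_delta = nll, np.exp(log_delta)
    return best_delta, S, U

def whiten(M, delta, S, U):
    """Rotate and rescale the rows of M so the LMM covariance becomes spherical."""
    return (U.T @ M) / np.sqrt(S + delta)[:, None]

def sgl_prox(b, groups, t, lam, alpha):
    """Proximal operator of the sparse group lasso penalty:
    elementwise soft-thresholding followed by groupwise shrinkage."""
    groups = np.asarray(groups)
    b = np.sign(b) * np.maximum(np.abs(b) - t * lam * alpha, 0.0)
    for g in np.unique(groups):
        idx = groups == g
        norm = np.linalg.norm(b[idx])
        thresh = t * lam * (1 - alpha) * np.sqrt(idx.sum())
        b[idx] = 0.0 if norm <= thresh else b[idx] * (1 - thresh / norm)
    return b

def fit_sgl(X, y, groups, lam, alpha=0.5, n_iter=500):
    """Stage 2 (assumed solver): proximal gradient descent on
    (1/2n)||y - Xb||^2 + lam * (alpha*||b||_1 + (1-alpha)*sum_g sqrt(p_g)*||b_g||_2)."""
    n, p = X.shape
    step = n / np.linalg.norm(X, 2) ** 2   # 1 / Lipschitz constant of the gradient
    b = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y) / n
        b = sgl_prox(b - step * grad, groups, step, lam, alpha)
    return b

# Toy data (synthetic, for illustration only): 40 SNPs in 8 groups of 5.
rng = np.random.default_rng(0)
n, p = 200, 40
X = rng.binomial(2, 0.3, size=(n, p)).astype(float)
X = (X - X.mean(0)) / X.std(0)
groups = np.repeat(np.arange(p // 5), 5)
K = X @ X.T / p                            # simple genotype-based kinship estimate
beta_true = np.zeros(p)
beta_true[:5] = 1.0                        # one causal group
y = X @ beta_true + rng.normal(0, 0.5, n)
y = y - y.mean()

delta, S, U = fit_null_lmm(y, K)
Xw = whiten(X, delta, S, U)
yw = whiten(y[:, None], delta, S, U).ravel()
beta_hat = fit_sgl(Xw, yw, groups, lam=0.05)
selected_groups = np.unique(groups[beta_hat != 0])
```

In this sketch, `alpha` trades off SNP-level sparsity against group-level sparsity, and `lam` sets the overall penalty strength; both would normally be tuned. Stability selection, which the abstract mentions, would repeat stage 2 on random subsamples and retain the SNPs selected in a large fraction of runs; it is not shown here.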