Power analysis of transcriptome-wide association study
2020
Background: A standard genome-wide association study (GWAS) discovers genetic variants that explain phenotypic variance by directly associating them with the phenotype. With the availability of other omics data such as gene expression, the field is entering an era of multi-scale omics integration. An emerging technique is the transcriptome-wide association study (TWAS), which conducts association mapping using gene expression data from a separate reference dataset, on which a model predicting expression from genotype is trained. Despite its success in practice, two fundamental questions remain unaddressed. First, the accuracy of predicting expression from genotype is generally low in practice, because it is bounded by the expression heritability. The question is therefore whether such low accuracy reduces the power of TWAS, and what level of accuracy is sufficient. Second, since predicting expression is a critical step in TWAS, one may ask whether having actual expression measured experimentally would improve or reduce power. Answering these questions will provide a thorough understanding of TWAS and practical guidelines for association mapping.
Results: To address these questions, we conducted power analysis for GWAS, TWAS, and expression mediated GWAS (emGWAS). Specifically, we derived non-centrality parameters (NCPs), which enable closed-form calculation of statistical power and thus a thorough power analysis that does not rely on particular implementations. We assessed the power of the three protocols in two representative scenarios: causality (genotype contributes to phenotype through expression) and pleiotropy (genotype contributes directly to both phenotype and expression). For both scenarios, we tested various properties, including expression heritability.
Conclusions: (1) In the pleiotropy scenario, TWAS using predicted expression has higher power than emGWAS using actual expression. This offers insight into TWAS models and a practical guideline: TWAS is worth applying even when expression is available in the GWAS dataset. (2) TWAS is suboptimal compared to GWAS when expression heritability is too low. The relative ordering of TWAS and GWAS reveals a turning point in each of the causality and pleiotropy scenarios. Analysis of published discoveries shows that, judged against the identified turning points, the choice of protocol might be questionable.
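
The Results mention that non-centrality parameters (NCPs) allow power to be computed in closed form. As a minimal sketch of that general idea, and not the authors' derivation, the Python below computes the power of a 1-degree-of-freedom chi-square test from a given NCP using only the standard library. The function names `power_from_ncp` and `illustrative_ncp` are hypothetical, the default significance level of 5e-8 is the conventional genome-wide threshold and is assumed here, and the formula NCP = n q / (1 - q) is a generic textbook approximation for a single-predictor test, standing in for the protocol-specific NCPs that the paper derives for GWAS, TWAS, and emGWAS.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def power_from_ncp(ncp: float, alpha: float = 5e-8) -> float:
    """Power of a 1-df chi-square test with non-centrality parameter `ncp`.

    Uses the fact that a 1-df non-central chi-square variable equals
    (Z + sqrt(ncp))**2 with Z ~ N(0, 1), so no special functions are needed.
    """
    if ncp < 0:
        raise ValueError("ncp must be non-negative")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # Two-sided critical value on the z scale; its square is the chi-square cutoff.
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    shift = ncp ** 0.5
    upper_tail = 1.0 - _STD_NORMAL.cdf(z_crit - shift)
    lower_tail = _STD_NORMAL.cdf(-z_crit - shift)
    return upper_tail + lower_tail


def illustrative_ncp(n: int, variance_explained: float) -> float:
    """ASSUMPTION: generic approximation NCP = n * q / (1 - q).

    `variance_explained` (q) is the fraction of phenotypic variance explained
    by the tested predictor. This is not the paper's NCP for any protocol.
    """
    if not 0.0 <= variance_explained < 1.0:
        raise ValueError("variance_explained must be in [0, 1)")
    return n * variance_explained / (1.0 - variance_explained)


if __name__ == "__main__":
    # Hypothetical sample size and effect sizes, chosen only for illustration.
    for q in (0.001, 0.005, 0.01):
        ncp = illustrative_ncp(5000, q)
        print(f"q={q:.3f}  ncp={ncp:.2f}  power={power_from_ncp(ncp):.4f}")
```

Because power depends on the test only through the NCP, comparing protocols in this framing amounts to comparing their NCPs under the same sample size and significance level.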