Precise diagnosis of three top cancers using dbGaP data.

2021 
The challenge of decoding information about complex diseases hidden in huge number of single nucleotide polymorphism (SNP) genotypes is undertaken based on five dbGaP studies. Current genome-wide association studies have successfully identified many high-risk SNPs associated with diseases, but precise diagnostic models for complex diseases by these or more other SNP genotypes are still unavailable in the literature. We report that lung cancer, breast cancer and prostate cancer as the first three top cancers worldwide can be predicted precisely via 240-370 SNPs with accuracy up to 99% according to leave-one-out and 10-fold cross-validation. Our findings (1) confirm an early guess of Dr. Mitchell H. Gail that about 300 SNPs are needed to improve risk forecasts for breast cancer, (2) reveal an incredible fact that SNP genotypes may contain almost all information that one wants to know, and (3) show a hopeful possibility that complex diseases can be precisely diagnosed by means of SNP genotypes without using phenotypical features. In short words, information hidden in SNP genotypes can be extracted in efficient ways to make precise diagnoses for complex diseases.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    34
    References
    1
    Citations
    NaN
    KQI
    []