An application of Random Forest and Hungarian Method to genome wide association of asthma in a genetic isolate of Ogliastra

2009 
We consider the problem of predicting asthma status by estimating epistasis, namely, interactions among genetic variants and environmental factors associated with asthma. We estimate epistasis with Random Forest (RF) on a data set of 500,000 genetic variants and environmental variables measured on about 200 cases and controls. The training sample for the RF is selected from a large database that also contains the genealogical tree, which enables us to relate all subjects. With the Hungarian method we choose the most inbred controls related to the cases. The exogenous genetic variability of such a training sample is reduced compared with that in open populations of unknown kinship. This allows us to successfully build an RF classifier with an acceptable prediction error for asthma status. Moreover, one of the most important genetic variants that we associate with asthma has also been reported to be functionally associated with asthma in other studies.
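
The control-selection step can be illustrated with a minimal sketch. It assumes that relatedness is summarised as a kinship coefficient between each case and each candidate control, and that one distinct control is chosen per case so that total kinship is maximised; the abstract does not state the actual cost function, so this is an assumption. The function name `match_controls` and the toy values are hypothetical. SciPy's `linear_sum_assignment` solves the same assignment problem that the Hungarian method addresses.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def match_controls(kinship):
    """Pick one distinct control per case, maximising total kinship.

    kinship[i, j] is a hypothetical kinship coefficient between case i
    and candidate control j (rows: cases, columns: candidate controls).
    Returns a dict mapping case index -> chosen control index.
    """
    kinship = np.asarray(kinship, dtype=float)
    # Solve the rectangular assignment problem; maximize=True asks for
    # the matching with the largest summed kinship instead of the
    # smallest summed cost.
    case_idx, control_idx = linear_sum_assignment(kinship, maximize=True)
    return dict(zip(case_idx.tolist(), control_idx.tolist()))


# Toy, made-up kinship values: 3 cases, 5 candidate controls.
toy_kinship = np.array([
    [0.10, 0.02, 0.25, 0.00, 0.05],
    [0.20, 0.01, 0.24, 0.03, 0.00],
    [0.00, 0.12, 0.03, 0.01, 0.11],
])

matching = match_controls(toy_kinship)
for case, control in matching.items():
    print(f"case {case} -> control {control} "
          f"(kinship {toy_kinship[case, control]:.2f})")
```

In this toy matrix, cases 0 and 1 both have control 2 as their closest relative; the assignment resolves the conflict globally rather than greedily. The matched cases and controls would then form the training sample for the RF classifier.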