A Multi-Objective Optimization Procedure for Solving the High-Order Epistasis Detection Problem

2019 
Abstract

Many research works establish a relationship between Single Nucleotide Polymorphisms (SNPs) and complex diseases. In many cases, these diseases are caused by the interaction of two or more SNPs (epistasis), so it is important to identify which SNPs lead to the emergence of disease. However, this problem becomes harder as the epistasis order and the number of SNPs in the dataset under study increase. To tackle it, this work presents an implementation of a multi-objective optimization algorithm with a problem-aware offspring process. Furthermore, because of the large amount of data in SNP datasets, we have also parallelized this algorithm. We experimentally validate the quality of the proposal for epistasis sizes of 2, 5, and 8 loci, although the application is not limited to those values. Moreover, a thorough comparative study of biological performance against six state-of-the-art methods has been conducted, in which the proposed approach obtains better results than all of the other methods. Our results confirm that, using the proposed approach, large datasets with high-order epistasis sizes can be processed in reasonable time, obtaining solutions of good quality. According to the literature review performed up to the acceptance of this article, no other published work can process epistasis sizes above 6 loci in large datasets.
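
The growth in difficulty noted in the abstract can be made concrete by counting candidate SNP combinations. The following is a minimal sketch, not the authors' method: it assumes a naive exhaustive search over unordered SNP combinations, and the dataset sizes and the name `search_space_size` are hypothetical, chosen only for illustration.

```python
import math


def search_space_size(n_snps: int, order: int) -> int:
    """Number of unordered SNP combinations of a given epistasis order.

    Assumption: every subset of `order` distinct SNPs is one candidate,
    so the count is the binomial coefficient C(n_snps, order).
    """
    return math.comb(n_snps, order)


if __name__ == "__main__":
    # Hypothetical dataset sizes; the orders 2, 5, and 8 are the
    # epistasis sizes evaluated in the abstract.
    for n_snps in (100, 1_000, 10_000):
        for order in (2, 5, 8):
            count = search_space_size(n_snps, order)
            print(f"{n_snps:>6} SNPs, order {order}: {count:.3e} combinations")
```

Under these assumptions, the number of candidates grows combinatorially with both the epistasis order and the number of SNPs, which is why heuristic search and parallelization become attractive for high orders.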