Beyond Perfect Phylogeny: Multisample Phylogeny Reconstruction via ILP

2017 
Most approaches to reconstructing evolutionary history rely on the infinite sites assumption, which underlies the Perfect Phylogeny model, one of the most widely used models in cancer genomics. Recent results give strong evidence that recurrent and back mutations are present in the evolutionary history of tumors [19], showing that models more general than the Perfect Phylogeny are required. To address this problem, we propose a framework based on the notion of Incomplete Perfect Phylogeny. Our framework incorporates both losses and gains of mutations, thereby including the Dollo and Camin-Sokal models, and is described with an Integer Linear Programming (ILP) formulation. Our approach generalizes the notion of persistent phylogeny [1] and the ILP approach [14,15] proposed to solve the corresponding phylogeny reconstruction problem on character data. The final goal of our paper is to integrate our approach into an ILP formulation of the problem of reconstructing trees on mixed populations, where the input data consist of the fraction of cells in each of a set of samples that carry a given mutation. This is a fundamental problem in cancer genomics, where the goal is to study the evolutionary history of a tumor. An experimental analysis shows that our ILP approach is able to explain data that do not fit the perfect phylogeny assumption by allowing (1) multiple losses and gains of mutations, and (2) a number of subpopulations that is smaller than the number of input mutations.
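
To make concrete what it means for data not to fit the perfect phylogeny assumption, the following minimal sketch applies the classical pairwise conflict test to a binary character matrix (rows are cells or samples, columns are mutations, the root is assumed to carry no mutation). It is not the ILP formulation of the paper; the function names and the toy matrices are hypothetical and serve only as an illustration.

```python
from itertools import combinations


def conflicting_pairs(matrix):
    """Return the pairs of columns that rule out a rooted perfect phylogeny.

    matrix: list of rows, each a list of 0/1 entries (1 = mutation present).
    Two columns conflict when the rows exhibit all three patterns
    (1, 1), (1, 0) and (0, 1); with an all-zero root, a binary matrix
    admits a perfect phylogeny exactly when no pair of columns conflicts.
    """
    if not matrix:
        return []
    n_cols = len(matrix[0])
    forbidden = {(1, 1), (1, 0), (0, 1)}
    conflicts = []
    for i, j in combinations(range(n_cols), 2):
        patterns = {(row[i], row[j]) for row in matrix}
        if forbidden <= patterns:
            conflicts.append((i, j))
    return conflicts


def admits_perfect_phylogeny(matrix):
    """True when no pair of columns is in conflict."""
    return not conflicting_pairs(matrix)


# Hypothetical toy data: mutation 1 occurs only in cells that carry mutation 0.
nested = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 0, 1],
]

# Hypothetical toy data: the two mutations appear together and separately.
conflicting = [
    [1, 1],
    [1, 0],
    [0, 1],
]
```

In the second matrix the two mutations cannot both arise exactly once without ever being lost, so a tree explaining these data needs a loss (as in the Dollo model) or a repeated gain (as in the Camin-Sokal model) of a mutation.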