Finding Correlated Patterns via High-Order Matching for Multiple Sourced Biological Data

2019 
Objective: The emergence of multidimensional genomic data poses new challenges in data analysis. Finding correlated patterns within multiple-sourced biological data helps in understanding potential interactions among the different genomic data types. Methods: Multidimensional genomic data contain multiple genomic data types, each with its own scale and units, so these data cannot simply be aggregated for analysis. To address this issue, a correlated pattern discovery model incorporating prior knowledge is proposed. Tensor similarity is used to measure the correlation between common patterns. Prior knowledge is incorporated into the model by expressing it as constraints. Efficient numerical solutions are designed and analyzed. Results: The proposed method is shown to perform robustly and effectively on both simulated data and real biological data. We conduct experiments on five real cancer data sets to reveal various cancer subtypes. A survival analysis of these subtypes confirms the effectiveness of the model. Conclusion: We introduce a correlated pattern discovery model incorporating prior knowledge. This model may support personalized diagnosis by clinicians in the treatment of cancer and other diseases. Significance: The problem of finding correlated patterns from multiple-sourced biological data is formulated as a high-order graph matching problem, and the prior knowledge is seamlessly incorporated into the matching model.