SIMPATI: patient classifier identifies signature pathways as patient similarity networks for the disease prediction

2021 
BACKGROUND: Pathway-based patient classification is a supervised learning task that supports the decision-making of human experts in biomedical applications by providing signature pathways associated with a patient class characterized by a specific clinical outcome. The task can potentially simulate the way a human expert reasons when predicting patients by pathways, decipher hidden multivariate relationships between the characteristics of a patient class, and provide more information than a probability value. However, these classifiers are rarely integrated into routine bioinformatics analyses of high-dimensional biological data because they require nontrivial hyper-parameter tuning, are difficult to interpret, and provide few new insights. There is a need for new classifiers that provide novel perspectives on pathways, are easy to apply to different biological omics, and produce new data that enable further analysis of the patients.

RESULTS: We propose Simpati, a pathway-based patient classifier that combines the concepts of network-based propagation, patient similarity networks, cohesive subgroup detection and pathway enrichment. It exploits a propagation algorithm to classify dense, sparse, and non-homogeneous data. It organizes patient features (e.g. genes, proteins, mutations) into pathways represented by patient similarity networks, which makes the model interpretable, lets it handle missing data, and preserves patient privacy. Each network represents patients as nodes, and a novel similarity measure determines how co-ordinately every pair of patients acts in a pathway. Simpati detects signature biological processes based on how well the topological properties of the related networks discriminate the patient classes. In this step, it applies a novel cohesive subgroup detection algorithm to handle patients who do not show the same pathway activity as the other members of their class. An unknown patient is classified based on how similar it is to the known ones.
Simpati outperforms state-of-the-art classifiers on five cancer datasets, classifies sparse data well, and provides a novel concept of enrichment that calls pathways as up- or down-involved with respect to the overall biology of the patients.

CONCLUSION: Simpati can serve as an interpretable and accurate pathway-based patient classifier to discover novel signature pathways driving a clinical class, to detect biomarkers, and to gain insight into how patients are similar based on their regulation of biological processes. Biomarker detection is made possible by the propagation score, which is the likelihood of association between a patient's feature and the outcome, and by the deconvolution of the single-feature contributions to the patient similarities. Pathway enrichment is enhanced by the integration of the DisGeNET and Human Protein Atlas databases. We provide an R implementation that runs Simpati with one function, a GUI for navigating the patients' propagated profiles, and a function that offers an ad hoc visualization of patient similarity networks. The software is available at: https://github.com/LucaGiudice/Simpati
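
The general idea of propagating sparse patient profiles over a network and then classifying an unknown patient by its similarity to known ones can be illustrated with a minimal Python sketch. This is not the Simpati R package or its API: the four-gene chain network, the random-walk-with-restart settings, the similarity function (an inverse mean absolute difference standing in for Simpati's own similarity measure) and all function names are assumptions made for illustration only.

```python
# Illustrative sketch only: NOT the Simpati R package, its API, or its similarity measure.

def propagate(adjacency, seed, restart=0.5, iterations=100):
    """Random walk with restart: spread a patient's feature values over a gene network."""
    n = len(seed)
    # Column-normalise so each gene passes its score on in proportion to edge weights.
    col_sums = [sum(adjacency[i][j] for i in range(n)) or 1.0 for j in range(n)]
    walk = [[adjacency[i][j] / col_sums[j] for j in range(n)] for i in range(n)]
    score = list(seed)
    for _ in range(iterations):
        score = [
            (1 - restart) * sum(walk[i][j] * score[j] for j in range(n))
            + restart * seed[i]
            for i in range(n)
        ]
    return score


def similarity(a, b):
    """Placeholder patient similarity (assumption): 1 / (1 + mean absolute difference)."""
    mean_diff = sum(abs(x - y) for x, y in zip(a, b)) / len(a)
    return 1.0 / (1.0 + mean_diff)


def class_scores(unknown, known, labels):
    """Mean similarity of the unknown patient to the known patients of each class."""
    per_class = {}
    for profile, label in zip(known, labels):
        per_class.setdefault(label, []).append(similarity(unknown, profile))
    return {label: sum(v) / len(v) for label, v in per_class.items()}


def classify(unknown, known, labels):
    """Assign the class whose known members are, on average, most similar."""
    scores = class_scores(unknown, known, labels)
    return max(scores, key=scores.get)


# Hypothetical pathway of four genes connected in a chain: g0 - g1 - g2 - g3.
network = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

# Sparse binary profiles (e.g. mutated = 1); the unknown patient shares no
# altered gene with any known patient.
known_raw = [[0, 1, 0, 0], [0, 1, 1, 0], [0, 0, 0, 1], [0, 0, 1, 1]]
labels = ["case", "case", "control", "control"]
unknown_raw = [1, 0, 0, 0]

# On the raw profiles both classes are equally similar to the unknown patient.
raw_scores = class_scores(unknown_raw, known_raw, labels)

# After propagation, neighbouring genes share signal and the classes separate.
known_prop = [propagate(network, p) for p in known_raw]
unknown_prop = propagate(network, unknown_raw)
prop_scores = class_scores(unknown_prop, known_prop, labels)
predicted = classify(unknown_prop, known_prop, labels)

print(raw_scores)
print(prop_scores)
print(predicted)
```

In this toy setting the raw profiles cannot separate the two classes because the altered genes never overlap, whereas the propagated profiles place the unknown patient closer to the patients whose alterations lie nearby in the network. The actual method additionally builds one patient similarity network per pathway and scores its topology, which this sketch does not attempt.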