Machine learning systems and precision medicine: A conceptual and experimental approach to single individual statistics

2020 
Abstract The translation of precision medicine into clinical practice will depend largely on the ability to make statistical inferences at the individual level, positioning a new case precisely in the taxonomy space (diagnosis) or in the time space (prognosis). As a matter of fact, clinical epidemiology and medical statistics are not suited to answering specific questions at the individual level: they focus on groups of individuals, not on single individuals. Classical statistics by definition needs samples to work, and samples by definition are always larger than one. This explains why, for traditional statistics, the single individual is a moving and vague target. The objective of this chapter is to show the feasibility of using potent machine learning systems developed at the Semeion Institute to approach the problem of single individual statistics in a consistent and sound way. Three case studies relevant to different unsupervised machine learning systems are reviewed: (a) the use of self-organizing maps (SOMs) to determine the confidence interval of a quality of life scale total score in seven new individual subjects, with a group of 1000 individuals as the reference data set; (b) the use of the evolutionary algorithm “pick and squash tracking” (PST) to cluster and discriminate patients affected by Barrett disease from those affected by simple gastroesophageal reflux disease; (c) the use of the auto-contractive map (Auto-CM) system, a fourth-generation artificial neural network, to map individual patients with and without acute myocardial infarction (AMI) on the basis of genetic and clinical traits and classical risk factors. A further case study relies on supervised machine learning systems, based on the concept of Fermi mathematics. The three unsupervised methods proved to be reliable and easily applicable to real-world examples in terms of readability, accuracy, and reproducibility.
The confidence intervals computed for the seven new cases in the first case study allowed the clinician to identify the outlier easily. The accuracy of the map projection with the PST algorithm in the second case study gave immediate visual evidence of the degree of membership of each individual subject in the two diagnostic classes. In the third case study, the overall accuracy of the clustering obtained by the Auto-CM system was found to be 93%. The conceptual advantages obtainable are explained. The fourth method shows that, by applying several independent classification models to the same individual, it is possible to establish a degree of confidence in the prediction and therefore to overcome the dogma that no statistical inference can be made when a sample is composed of just one subject.
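The idea behind the fourth method, combining several independent classifiers applied to one individual into a degree of confidence, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `prediction_confidence`, the class labels, and the use of simple vote agreement as the confidence measure are all assumptions made for illustration.

```python
from collections import Counter


def prediction_confidence(votes):
    """Summarize independent model predictions for ONE individual.

    `votes` holds one predicted class label per model (hypothetical input).
    Returns the most frequent label and the fraction of models that agree
    with it, used here as a crude stand-in for a degree of confidence.
    """
    if not votes:
        raise ValueError("at least one model prediction is required")
    counts = Counter(votes)
    label, n_agree = counts.most_common(1)[0]
    return label, n_agree / len(votes)


# Five hypothetical models classify the same patient.
votes = ["AMI", "AMI", "no AMI", "AMI", "AMI"]
label, agreement = prediction_confidence(votes)
print(label, agreement)
```

Vote agreement is only meaningful to the extent that the models are genuinely independent; strongly correlated models would overstate the confidence.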