Challenges in Developing Prediction Models for Multi-modal High-Throughput Biomedical Data.

2018 
Rapid advances in high-throughput technologies have produced various types of biological data, such as gene expression, copy number alterations, miRNA expression, and protein expression. The integration of diverse biomedical datasets has received wide attention because of its great potential to yield models that give a more thorough account of the mechanisms of cancers and other complex diseases. However, the impact of constructing a prediction model from heterogeneous high-throughput datasets has not been comprehensively defined. This paper identifies the challenges of developing prediction approaches for Multi-Modal High Dimensional and Small Sample Size biomedical datasets. These challenges depend on the characteristics of the data, the aim of the integration, and the level of the integration. The heterogeneity and dimensionality of high-throughput data bring many computational and statistical challenges, so fusing such data into a unified and informative space for prediction purposes is a difficult task. Furthermore, validating and evaluating the outcomes of prediction models built from multi-modal biomedical datasets involves several underlying issues that need to be properly handled in order to report reliable findings. Moreover, the interpretability and robustness of prediction models are becoming crucial factors for personalised medicine. Directions for addressing these challenges are briefly introduced, and some possibilities for future work are discussed.