Just Add Data: Automated Predictive Modeling and BioSignature Discovery
2020
Fully automated machine learning, statistical modeling, and artificial intelligence for predictive modeling is becoming a reality, giving rise to the field of Automated Machine Learning (AutoML). AutoML systems promise to democratize data analysis for non-experts, drastically increase productivity, improve replicability of the statistical analysis, facilitate the interpretation of results, and shield against common methodological analysis pitfalls. We present the basic ideas and principles of Just Add Data Bio (JADBIO), an AutoML technology applicable to the low-sample, high-dimensional omics data that arise in translational medicine and bioinformatics applications. In addition to predictive and diagnostic models ready for clinical use, JADBIO also returns the corresponding biosignatures, i.e., minimal-size subsets of biomarkers that are jointly predictive of the outcome of interest. A use-case on thymic epithelial tumors is presented, along with an extensive evaluation on 374 public biological datasets. Results show that long-standing challenges with overfitting and performance overestimation of complex non-linear machine learning pipelines on high-dimensional, low-sample data can be overcome.