Stacking prediction for a binary outcome

2012 
A large number of supervised classification models have been proposed in the literature. To avoid the bias induced by relying on a single statistical approach, they are combined through a specific "stacking" meta-model. To deal with a binary outcome and categorical predictors, we introduce several improvements to stacking: models are combined through PLS-DA instead of OLS because of the strong correlation between their predictions, and a specific methodology is developed for the case of a small number of observations, using repeated sub-sampling for variable selection. Five very different models (Boosting, Naïve Bayes, SVM, Sparse PLS-DA and Expert Scoring) are combined through this improved stacking and applied in the context of the development of alternative strategies for safety evaluation, where multiple in vitro, in silico and physico-chemical parameters are used to classify substances into two classes: "Sensitizer" and "No Sensitizer". Results show that the stacking meta-models perform better than each of the five models taken separately and, furthermore, that stacking provides a better balance between sensitivity and specificity.
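The idea of combining correlated base-model predictions through PLS-DA rather than OLS can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single PLS component, a 0/1 coding of the outcome with a 0.5 decision threshold, and a small made-up matrix `Z` whose columns stand for the out-of-fold predictions of three hypothetical base models. The function names `fit_pls1_da` and `predict_pls1_da` are illustrative, and the repeated sub-sampling step for variable selection is not shown.

```python
# Sketch only: one-component PLS-DA used as a stacking combiner.
# All numbers are made-up toy values, not data from the paper.

def fit_pls1_da(Z, y):
    """Fit a one-component PLS regression of 0/1 labels y on the columns of Z."""
    n, p = len(Z), len(Z[0])
    z_mean = [sum(row[j] for row in Z) / n for j in range(p)]
    y_mean = sum(y) / n
    Zc = [[row[j] - z_mean[j] for j in range(p)] for row in Z]
    yc = [v - y_mean for v in y]
    # Weight vector: direction of maximal covariance between predictions and label.
    w = [sum(Zc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = sum(v * v for v in w) ** 0.5
    w = [v / norm for v in w]
    # Latent scores, then least-squares regression of y on that single score.
    t = [sum(Zc[i][j] * w[j] for j in range(p)) for i in range(n)]
    c = sum(t[i] * yc[i] for i in range(n)) / sum(v * v for v in t)
    return z_mean, y_mean, w, c


def predict_pls1_da(model, Z, threshold=0.5):
    """Classify rows of Z: 1 if the fitted PLS response reaches the threshold."""
    z_mean, y_mean, w, c = model
    fitted = [
        y_mean + c * sum((row[j] - z_mean[j]) * w[j] for j in range(len(w)))
        for row in Z
    ]
    return [1 if f >= threshold else 0 for f in fitted]


# Hypothetical out-of-fold predictions of three strongly correlated base models.
Z = [
    [0.9, 0.8, 0.7],
    [0.8, 0.9, 0.6],
    [0.7, 0.6, 0.8],
    [0.6, 0.7, 0.4],
    [0.4, 0.3, 0.6],
    [0.3, 0.2, 0.3],
    [0.2, 0.3, 0.2],
    [0.1, 0.2, 0.3],
]
y = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = "Sensitizer", 0 = "No Sensitizer"

model = fit_pls1_da(Z, y)
train_pred = predict_pls1_da(model, Z)
new_pred = predict_pls1_da(model, [[0.85, 0.7, 0.9], [0.15, 0.25, 0.1]])
```

Because the combiner regresses the outcome on a single latent score instead of on each prediction column separately, near-collinear columns do not produce the unstable weights that an OLS fit can give. In practice the level-one matrix would be built from cross-validated predictions of the actual base models.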