Early Prediction of Sepsis Using Random Forest Classification for Imbalanced Clinical Data

2019 
Early prediction of sepsis in intensive care units from clinical data is the objective of the PhysioNet/Computing in Cardiology Challenge 2019. This paper presents a machine learning approach that uses an optimized Random Forest to predict a septic condition. After an initial data augmentation step, a customized learning process is performed so that the trees account for the imbalance in the dataset. Finally, feature reduction is applied and the forest is trimmed to 50 trees for an optimal classification in terms of run time and accuracy. In a 10-fold cross-validation on the complete training dataset, a mean utility score of 0.376 is achieved. The final submission achieves a normalized observed utility score of 0.296 on the full test set. Our team name is The Septic Think Tank (final rank: 21).
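The abstract does not say how the trees are made to account for the imbalance, so the following is only a minimal sketch of one common option, not the authors' method: inverse-frequency class weighting, combined with stratified folds so that each of the 10 cross-validation folds contains septic examples. The function names `balanced_class_weights` and `stratified_folds` and the toy labels are hypothetical and introduced purely for illustration.

```python
from collections import Counter, defaultdict
import random


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def stratified_folds(labels, n_folds=10, seed=0):
    """Assign each sample index to a fold, keeping class proportions similar."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the indices of one class round-robin across the folds.
        for pos, idx in enumerate(indices):
            folds[pos % n_folds].append(idx)
    return folds


# Hypothetical toy labels (assumption): 1 = septic, 0 = non-septic, 10% positives.
labels = [0] * 90 + [1] * 10
weights = balanced_class_weights(labels)
folds = stratified_folds(labels, n_folds=10)
```

With these toy labels the rare class receives a weight nine times larger than the majority class, and every fold holds exactly one positive sample. In scikit-learn, the same weighting formula is what `RandomForestClassifier(n_estimators=50, class_weight="balanced")` applies; whether the paper used that setting is not stated in the abstract.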