Development of a Predictive Model for a Machine Learning Derived Shoulder Arthroplasty Clinical Outcome Score

2021 
Abstract

Introduction: We use machine learning to build models that predict, from preoperative data, the Shoulder Arthroplasty Smart (SAS) score, the American Shoulder and Elbow Surgeons (ASES) score, and the Constant score at multiple postoperative timepoints, and we compare the accuracy of each algorithm for anatomic (aTSA) and reverse (rTSA) total shoulder arthroplasty.

Methods: Clinical data from 2,270 aTSA and 4,198 rTSA patients were analyzed using 3 supervised machine learning techniques to create predictive models for the SAS, ASES, and Constant scores at 6 different postoperative timepoints, using a full input feature set and 2 different minimal feature sets. Mean absolute error (MAE) quantified the difference between actual and predicted outcome scores for each model at each postoperative timepoint. The performance of each model was also quantified by its ability to predict improvement greater than the minimal clinically important difference (MCID) and the substantial clinical benefit (SCB) patient satisfaction thresholds for each outcome measure at 2-3 years after surgery.

Results: All 3 machine learning techniques were most accurate at predicting aTSA and rTSA outcomes with the SAS score (aTSA: ±7.41 MAE; rTSA: ±7.79 MAE), followed by the Constant score (aTSA: ±8.32 MAE; rTSA: ±8.30 MAE), and finally the ASES score (aTSA: ±10.86 MAE; rTSA: ±10.60 MAE). These prediction accuracy trends were maintained across the 3 different model input categories for each of the SAS, ASES, and Constant models at each postoperative timepoint. For aTSA patients, the XGBoost predictive models achieved 94-97% accuracy in predicting MCID with an AUROC of 0.90-0.97, and 89-94% accuracy in predicting SCB with an AUROC of 0.89-0.92, for the 3 clinical scores using the full feature set of inputs. For rTSA patients, the XGBoost predictive models achieved 95-99% accuracy in predicting MCID with an AUROC of 0.88-0.96, and 88-92% accuracy in predicting SCB with an AUROC of 0.81-0.89, for the 3 clinical scores using the full feature set of inputs.

Discussion: Our study demonstrated that SAS score predictions are more accurate than ASES and Constant score predictions for multiple supervised machine learning techniques, despite the SAS model requiring less input data. Additionally, we predicted which patients will and will not achieve clinical improvement that exceeds the MCID and SCB thresholds for each score; this highly accurate predictive capability effectively risk-stratifies patients for a variety of outcome measures using only preoperative data.
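
The two evaluation measures named in the Methods, MAE and accuracy against an improvement threshold, can be illustrated with a minimal Python sketch. This is not the study's code: the function names, the four made-up patients, and the threshold value `HYPOTHETICAL_MCID` are all assumptions for illustration, and the abstract does not state whether threshold predictions were derived from predicted scores (as done here) or from separately trained classifiers.

```python
from typing import List, Sequence


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average absolute gap between actual and predicted outcome scores."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def exceeds_threshold(preop: Sequence[float], postop: Sequence[float],
                      threshold: float) -> List[bool]:
    """True where improvement (postop - preop) is greater than the threshold."""
    return [(post - pre) > threshold for pre, post in zip(preop, postop)]


def classification_accuracy(actual_labels: Sequence[bool],
                            predicted_labels: Sequence[bool]) -> float:
    """Fraction of patients whose predicted label matches the actual label."""
    matches = sum(a == p for a, p in zip(actual_labels, predicted_labels))
    return matches / len(actual_labels)


# Made-up scores for four hypothetical patients (not study data).
preop = [30.0, 45.0, 50.0, 20.0]
actual_postop = [80.0, 70.0, 55.0, 75.0]
predicted_postop = [75.0, 78.0, 62.0, 70.0]
HYPOTHETICAL_MCID = 10.0  # placeholder; the abstract gives no threshold values

mae = mean_absolute_error(actual_postop, predicted_postop)
actual_hit = exceeds_threshold(preop, actual_postop, HYPOTHETICAL_MCID)
predicted_hit = exceeds_threshold(preop, predicted_postop, HYPOTHETICAL_MCID)
accuracy = classification_accuracy(actual_hit, predicted_hit)

print(f"MAE: {mae:.2f} points")
print(f"Threshold accuracy: {accuracy:.0%}")
```

In this toy data the third patient actually improves by less than the placeholder threshold but is predicted to exceed it, which is the kind of misclassification the reported accuracy and AUROC figures summarize.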