Modeling Clinical Trial Attrition Using Machine Intelligence: A driver analytics case study using 1,325 trials representing one million patients

2021 
The time and resources invested in bringing novel therapeutics to market have increased year over year, while fewer successful treatments reach patients. Within the drug development lifecycle, the clinical phase is a major contributor to this declining efficiency. One major barrier to the successful execution of a randomized controlled trial (RCT) is the attrition of patients who stop participating in a trial after either enrollment or randomization. To address this problem, we assembled a unique, trial sponsor-independent dataset by integrating multiple public databases, including ClinicalTrials.gov and Aggregate Analysis of ClinicalTrials.gov (AACT). The data span 20 years of clinical trials and over 1 million patients (3,175 cohorts consisting of 1,020,085 patients and 79 curated features) in the respiratory domain, and enabled a data-driven approach to identifying the top features influencing patient attrition in a trial. The top features included Duration of Trial, Duration of Treatment, Indication, and Number of Adverse Events. We evaluated multiple machine learning models and found that Random Forest performed best on the test set (test subset: n=637 cohorts; RMSE 6.64). We envisage that our work will enable clinical trial sponsors to optimize trial run time by better anticipating and correcting for potential patient attrition using patient-centric strategies to improve patient engagement, thus enabling new therapies to be delivered to patients more quickly.
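For readers unfamiliar with the reported metric, the following is a minimal sketch of how a root mean squared error (RMSE) is computed over held-out cohorts. It is not the authors' code. The function name `rmse`, the four example cohorts, and the assumption that attrition is expressed as a percentage of patients are all hypothetical; the abstract does not state the units of the reported RMSE.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    # Square each per-cohort error, average, then take the square root.
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical attrition rates (percent of patients lost) for four cohorts.
# These values are illustrative only and do not come from the study.
observed = [10.0, 25.0, 5.0, 40.0]
predicted = [12.0, 20.0, 5.0, 43.0]

print(round(rmse(observed, predicted), 2))
```

Because errors are squared before averaging, a few cohorts with large prediction misses raise the RMSE more than many cohorts with small misses.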