Data-driven risk stratification for preterm birth in Brazil: a population-based study to develop of a machine learning risk assessment approach

2021 
Abstract

Background: Preterm birth (PTB) is a growing health issue worldwide and is currently considered the leading cause of newborn deaths. To address this challenge, the present work aims to develop an algorithm capable of accurately predicting the week of delivery, supporting the identification of PTB in Brazil.

Methods: This is a population-based study analyzing data from 3,876,666 mothers with live births distributed across 3,929 Brazilian municipalities. Using indicators comprising delivery characteristics, primary care work processes, physical infrastructure, and sociodemographic data, we applied a machine learning-based approach to estimate the week of delivery at the point-of-care level. We tested six algorithms: eXtreme Gradient Boosting, Elastic Net, Quantile Ordinal Regression - LASSO, Linear Regression, Ridge Regression, and Decision Tree. We used the root-mean-square error (RMSE) as the measure of precision.

Findings: All models obtained similar RMSE values. The lowest RMSE was obtained with the eXtreme Gradient Boosting approach, which estimated the week of delivery within a 2.09-week window (95% CI 2.090–2.097). The five most important variables for predicting the week of delivery were: number of previous deliveries by Cesarean section, number of prenatal consultations, age of the mother, availability of an ultrasound exam in the care network, and proportion of primary care teams in the municipality registering the oral care consultation.

Interpretation: Using simple data describing the prenatal care offered, as well as minimal characteristics of the pregnant woman, our approach achieved relevant predictive performance regarding the week of delivery.
Funding: Bill and Melinda Gates Foundation and the National Council for Scientific and Technological Development, Brazil (Conselho Nacional de Desenvolvimento Científico e Tecnológico, CNPq), in support of the research project "Data-Driven Risk Stratification for Preterm Birth in Brazil: Development of a Machine Learning-Based Innovation for Health Care" (Grant OPP1202186).
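
The abstract compares candidate models by RMSE and reports a 95% interval for the best one. As a minimal sketch of that kind of comparison, the Python below computes the RMSE for two hypothetical models on made-up gestational-week data and attaches a percentile bootstrap interval to the lower-error model. The data, the model names (`model_a`, `model_b`), the helper `bootstrap_rmse_ci`, and the choice of a percentile bootstrap are all illustrative assumptions; the abstract does not state how the authors derived their interval.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted weeks of delivery."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def bootstrap_rmse_ci(y_true, y_pred, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the RMSE.

    Assumption: the source does not say how its interval was computed;
    this is one common way to obtain such an interval.
    """
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_resamples):
        # Resample (observed, predicted) pairs with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(rmse([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Made-up observed weeks of delivery and predictions from two hypothetical models.
observed = [39, 38, 40, 36, 41, 34, 39, 37]
predictions = {
    "model_a": [38.5, 38.0, 39.0, 37.5, 39.5, 36.0, 39.0, 38.0],
    "model_b": [39.0, 39.0, 39.0, 39.0, 39.0, 39.0, 39.0, 39.0],
}

# Score every model, keep the one with the lowest RMSE, and bootstrap its interval.
scores = {name: rmse(observed, pred) for name, pred in predictions.items()}
best = min(scores, key=scores.get)
ci = bootstrap_rmse_ci(observed, predictions[best])

print(f"best model: {best}, RMSE = {scores[best]:.3f} weeks")
print(f"95% bootstrap interval: ({ci[0]:.3f}, {ci[1]:.3f})")
```

Because RMSE is expressed in the same unit as the target, a value near 2 means predictions are typically off by about two weeks. That gives a sense of how close a predicted week of delivery sits to the conventional 37-week preterm threshold.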