Machine learning to identify persons at high-risk of HIV acquisition in rural Kenya and Uganda

2019 
BACKGROUND: In generalized epidemic settings, strategies are needed to prioritize individuals at higher risk of HIV acquisition for prevention services such as pre-exposure prophylaxis. We used population-level HIV testing data from rural Kenya and Uganda to construct HIV risk scores and assessed their ability to identify seroconversions.

METHODS: Between 2013 and 2017, >75% of residents in 16 communities in the SEARCH Study tested annually for HIV. In this population, we evaluated three strategies for using demographic factors to predict the one-year risk of HIV seroconversion: (1) membership in ≥1 known "Risk Group" (e.g., young woman or HIV-infected spouse); (2) a "Model-based" risk score constructed with logistic regression; and (3) a "Machine Learning" risk score constructed with the Super Learner algorithm. We hypothesized that Machine Learning would identify high-risk individuals more efficiently (fewer persons targeted for a fixed sensitivity) and with higher sensitivity (for a fixed number of persons targeted) than either of the other approaches.

RESULTS: A total of 75,558 HIV-negative persons contributed 166,723 person-years of follow-up; 519 seroconverted. Machine Learning improved efficiency: to achieve a fixed sensitivity of 50%, the Risk Group strategy targeted 42% of the population, Model-based 27%, and Machine Learning 18%. Machine Learning also improved sensitivity: with an upper limit of 45% targeted, the Risk Group strategy correctly classified 58% of seroconversions, Model-based 68%, and Machine Learning 78%.

CONCLUSIONS: Machine learning improved classification of individuals at risk of HIV acquisition compared to a model-based approach or reliance on known risk groups, and could inform targeting of prevention strategies in generalized epidemic settings.
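The two evaluation metrics named in the abstract (fraction of the population targeted at a fixed sensitivity, and sensitivity at a fixed fraction targeted) can be illustrated with a minimal sketch. This is not the study's code: the function names, the tie-handling rule, and the ten-person toy data are assumptions made purely for illustration, and the sketch works with any risk score, however it was constructed.

```python
import math
from typing import List, Sequence, Tuple


def _rank_by_risk(scores: Sequence[float], outcomes: Sequence[int]) -> List[Tuple[float, int]]:
    """Pair each score with its outcome and sort from highest to lowest risk.

    Ties keep their input order (an assumption of this sketch).
    """
    return sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)


def fraction_targeted_for_sensitivity(
    scores: Sequence[float], outcomes: Sequence[int], target_sensitivity: float
) -> float:
    """Smallest fraction of persons to target, highest risk first, so that
    at least `target_sensitivity` of all seroconversions are captured."""
    ranked = _rank_by_risk(scores, outcomes)
    total_events = sum(outcomes)
    needed = math.ceil(target_sensitivity * total_events)
    captured = 0
    for n_targeted, (_, outcome) in enumerate(ranked, start=1):
        captured += outcome
        if captured >= needed:
            return n_targeted / len(ranked)
    return 1.0


def sensitivity_at_fraction_targeted(
    scores: Sequence[float], outcomes: Sequence[int], max_fraction: float
) -> float:
    """Share of all seroconversions captured when at most `max_fraction`
    of persons, highest risk first, are targeted."""
    ranked = _rank_by_risk(scores, outcomes)
    n_targeted = math.floor(max_fraction * len(ranked))
    captured = sum(outcome for _, outcome in ranked[:n_targeted])
    return captured / sum(outcomes)


# Hypothetical toy data: 10 persons, 4 seroconversions (1 = seroconverted).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_outcomes = [1, 0, 1, 0, 0, 1, 0, 0, 0, 1]

efficiency = fraction_targeted_for_sensitivity(toy_scores, toy_outcomes, 0.5)
sensitivity = sensitivity_at_fraction_targeted(toy_scores, toy_outcomes, 0.5)
print(efficiency, sensitivity)
```

Under these definitions, a better risk score yields a smaller first number at the same target sensitivity and a larger second number at the same targeting limit, which is the direction of the comparison reported in the results.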