Inferring the main drivers of SARS-CoV-2 global transmissibility by feature selection methods

2021 
Identifying the main environmental drivers of SARS-CoV-2 transmissibility in the population is crucial for understanding current and potential future outbursts of COVID-19 and other infectious diseases. To address this problem, we concentrate on the basic reproduction number $R_0$, which is not sensitive to testing coverage and represents transmissibility in an absence of social distancing and in a completely susceptible population. While many variables may potentially influence $R_0$, a high correlation between these variables may obscure the result interpretation. Consequently, we combine Principal Component Analysis with feature selection methods from several regression-based approaches to identify the main demographic and meteorological drivers behind $R_0$. We robustly obtain that country's wealth/development (GDP per capita or Human Development Index) is the most important $R_0$ predictor at the global level, probably being a good proxy for the overall contact frequency in a population. This main effect is modulated by built-up area per capita (crowdedness in indoor space), onset of infection (likely related to increased awareness of infection risks), net migration, unhealthy living lifestyle/conditions including pollution, seasonality, and possibly BCG vaccination prevalence. Also, we argue that several variables that significantly correlate with transmissibility do not directly influence $R_0$ or affect it differently than suggested by naive analysis.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    55
    References
    0
    Citations
    NaN
    KQI
    []