Collection, Processing and Analysis of Heterogeneous Data Coming from Spanish Hospitals in the Context of COVID-19

2021 
The COVID-19 pandemic has already caused more than 150,000,000 cases worldwide. In Spain, this has led to a massive and simultaneous saturation of all health regions. Consequently, a quick and consistent understanding of the COVID-19 disease requires the combined analysis of thousands of medical records generated by dozens of different institutions. In the context of the publicly funded CIBERES-UCI-COVID project, we have gathered, cleaned and preprocessed data from heterogeneous sources (more than 30 hospitals, with different data entry systems) in order to produce a unified database of more than 6,000 patients that is used in several clinical studies being carried out by different multidisciplinary groups. In this paper, we identify the complexities we encountered and the solutions we applied, and we summarise the statistical and machine learning techniques we have applied in the studies. © 2021 The authors and IOS Press.