cchsflow: an open science approach to transform and combine population health surveys

2021 
Setting: The Canadian Community Health Survey (CCHS) is one of the world's largest ongoing cross-sectional population health surveys, with over 130,000 respondents every two years and over 1.1 million respondents since its inception in 2001. While the survey has remained relatively consistent over the years, differences between cycles make it challenging to analyze the survey over time.

Intervention: A software package called cchsflow was developed to transform and harmonize CCHS variables into consistent formats across multiple survey cycles. An open science approach was used to support transparency, reproducibility and collaboration.

Outcomes: The cchsflow R package uses CCHS survey data collected between 2001 and 2014. Worksheets were created that identify variables, their names in previous cycles, their category structure, and their final variable names. These worksheets were then used to recode the variables in each CCHS cycle into consistently named and labelled variables, after which the survey cycles can be combined. The package was then published as a GitHub repository to encourage collaboration with other researchers.

Implication: The cchsflow package has been added to the Comprehensive R Archive Network (CRAN) and supports over 160 CCHS variables, generating a combined data set of over 1 million respondents. By implementing open science practices, cchsflow aims to minimize the time needed to clean and prepare data for the many CCHS users across Canada.
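
The worksheet-driven recoding described under Outcomes can be illustrated with a minimal sketch. It is written in plain Python for illustration only and does not use or reproduce the cchsflow API; the worksheet layout, the cycle labels (`cycle_a`, `cycle_b`), the variable names (`SEX_A`, `SMK_B`, and so on) and the category codes are all hypothetical and are not actual CCHS specifications.

```python
# Hypothetical worksheet: one row per (harmonized variable, survey cycle).
# Each row names the cycle-specific source variable and maps its raw
# category codes to a shared set of labels. All names and codes here are
# illustrative assumptions, not real CCHS variables.
WORKSHEET = [
    {"variable": "sex", "cycle": "cycle_a", "source": "SEX_A",
     "recode": {1: "male", 2: "female"}},
    {"variable": "sex", "cycle": "cycle_b", "source": "SEX_B",
     "recode": {1: "male", 2: "female"}},
    # The two cycles use different category structures for smoking status.
    {"variable": "current_smoker", "cycle": "cycle_a", "source": "SMK_A",
     "recode": {1: "yes", 2: "no"}},
    {"variable": "current_smoker", "cycle": "cycle_b", "source": "SMK_B",
     "recode": {1: "yes", 2: "yes", 3: "no"}},
]


def recode_cycle(rows, cycle, worksheet):
    """Recode one cycle's rows into harmonized names and labels."""
    specs = [w for w in worksheet if w["cycle"] == cycle]
    recoded = []
    for row in rows:
        record = {"cycle": cycle}
        for spec in specs:
            raw = row.get(spec["source"])
            # Codes missing from the worksheet become None (treated as missing).
            record[spec["variable"]] = spec["recode"].get(raw)
        recoded.append(record)
    return recoded


def combine_cycles(data_by_cycle, worksheet):
    """Recode every cycle, then stack the results into one data set."""
    combined = []
    for cycle, rows in data_by_cycle.items():
        combined.extend(recode_cycle(rows, cycle, worksheet))
    return combined


cycle_a = [{"SEX_A": 1, "SMK_A": 2}, {"SEX_A": 2, "SMK_A": 1}]
cycle_b = [{"SEX_B": 2, "SMK_B": 3}, {"SEX_B": 1, "SMK_B": 9}]
combined = combine_cycles({"cycle_a": cycle_a, "cycle_b": cycle_b}, WORKSHEET)
```

Keeping the mapping in a worksheet rather than in code means that adding a variable or a cycle only requires new worksheet rows, which is one plausible reason such a design suits collaborative maintenance.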