Primary Steps in Analyzing Data: Tasks and Tools for a Systematic Data Exploration

2020 
Understanding the structure, basic properties, and relationships in a given dataset is a fundamental prerequisite for an appropriate statistical analysis. Here, we highlight the major principles of data exploration and provide a roadmap for a systematic and reproducible analysis based on key questions. Using an example dataset of throughfall measurements, we demonstrate how several techniques can be applied and evaluated to address common tasks of data analysis, such as understanding the structure of the dataset, detecting temporal and spatial dependence among observations, identifying outliers and influential observations, checking normality and homogeneity of model residuals, and exploring the relationships among variables. Finally, we discuss when data transformations can, and when they cannot, be used to overcome restrictions in the application of a statistical method. The electronic supplement provides the sample dataset as well as fully documented computer code, which is intended to serve as a guideline for conducting an exploratory data analysis in the statistical software environment R.
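
Two of the tasks named in the abstract, identifying outliers and detecting temporal dependence among observations, can be illustrated with a minimal sketch. This is not the R code from the electronic supplement. It is a Python illustration under stated assumptions: the function names `iqr_outliers` and `lag1_autocorrelation`, the Tukey-fence multiplier `k=1.5`, and the sample values are all hypothetical and chosen only for demonstration.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying outside the Tukey fences.

    The fences are Q1 - k*IQR and Q3 + k*IQR; k=1.5 is a common
    convention, not a choice taken from the paper.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


def lag1_autocorrelation(values):
    """Lag-1 autocorrelation, a rough first check for temporal dependence.

    Values near 0 suggest little dependence between consecutive
    observations; values near +1 or -1 suggest strong dependence.
    """
    mean = statistics.fmean(values)
    num = sum((a - mean) * (b - mean) for a, b in zip(values, values[1:]))
    den = sum((v - mean) ** 2 for v in values)
    return num / den


# Hypothetical measurements in time order (not the paper's dataset).
measurements = [2.1, 2.4, 2.2, 2.8, 3.0, 2.9, 3.3, 3.1, 9.7, 3.4]

print("Flagged as potential outliers:", iqr_outliers(measurements))
print("Lag-1 autocorrelation:", round(lag1_autocorrelation(measurements), 3))
```

A flagged value is only a candidate for closer inspection. Whether it is a measurement error or a genuine extreme observation has to be judged from the context of the data.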