Revisiting linear regression to test agreement in continuous predicted-observed datasets

2021 
Abstract
CONTEXT In agricultural research and related disciplines, a scatter plot with a fitted regression line is widely used to assess, both visually and quantitatively, the agreement between model predictions and observed values, particularly within the simulation modeling community. However, the fitting, use, and interpretation of the linear model remain controversial in the literature.
OBJECTIVE The overall goal of this research is to evaluate the usefulness of a symmetric regression line to test agreement in predicted-observed datasets. The specific aims are to: i) discuss the selection of a regression model to fit a line to the predicted-observed scatter, and ii) provide a geometric interpretation of the regression line, decomposing the prediction error into lack-of-accuracy and lack-of-precision components, using illustrative field crop datasets.
METHODS This study tested and contrasted three alternative linear regression models, Ordinary Least Squares (OLS), Major Axis (MA), and Standardized Major Axis (SMA), in terms of assumptions, loss functions, parameter estimates, and model interpretation for the predicted-observed case.
RESULTS AND CONCLUSIONS When the uncertainties of predictions and observations are unknown, SMA is the most appropriate approach to fit a symmetric line describing the bivariate predicted-observed scatter. The SMA line serves as a reference to estimate a weighted difference between predictions and observations. Moreover, this symmetric regression can assist in decomposing the squared error into additive components related to lack of accuracy and lack of precision. In summary, SMA regression tackles the axis orientation problem of traditional OLS (y vs. x or x vs. y) and allows users to identify error sources that are meaningful to them.
SIGNIFICANCE This work offers a novel and simple perspective on the use of linear regression to assess the performance of simulation models. To assist potential users, we also provide a tutorial to compute the proposed assessment of agreement using R software.
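The contrast between the three regression models can be illustrated with a minimal sketch. It is written in Python with NumPy rather than R, and it is not the authors' tutorial: the function name `fit_lines` and the sample values are hypothetical, and the slopes use the standard textbook estimators for OLS, MA, and SMA, assuming a non-zero covariance between predictions and observations. The sketch covers line fitting only, not the error decomposition.

```python
import numpy as np


def fit_lines(x, y):
    """Return {name: (intercept, slope)} for OLS, MA and SMA lines of y on x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    sxx = np.var(x, ddof=1)           # sample variance of x
    syy = np.var(y, ddof=1)           # sample variance of y
    sxy = np.cov(x, y, ddof=1)[0, 1]  # sample covariance (assumed non-zero)
    r = sxy / np.sqrt(sxx * syy)      # Pearson correlation

    slopes = {
        # OLS minimises vertical squared distances: depends on axis choice.
        "OLS": sxy / sxx,
        # MA minimises perpendicular squared distances.
        "MA": (syy - sxx + np.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy),
        # SMA slope is the ratio of standard deviations, signed by r.
        "SMA": np.sign(r) * np.sqrt(syy / sxx),
    }
    # Every one of these lines passes through the centroid (mean x, mean y).
    return {name: (y.mean() - b * x.mean(), b) for name, b in slopes.items()}


# Hypothetical predicted and observed values, for illustration only.
predicted = [1.0, 2.0, 3.0, 4.0, 5.0]
observed = [1.2, 1.9, 3.4, 3.9, 5.3]

forward = fit_lines(predicted, observed)  # observed vs. predicted
reverse = fit_lines(observed, predicted)  # predicted vs. observed

for name in ("OLS", "MA", "SMA"):
    product = forward[name][1] * reverse[name][1]
    print(f"{name}: slope={forward[name][1]:.4f}, forward*reverse={product:.4f}")
```

For a symmetric line, swapping the axes gives the reciprocal slope, so the product of the forward and reverse slopes is 1. This holds for MA and SMA; for OLS the product equals the squared correlation, which is the axis orientation problem described in the abstract.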