Predictive models in ecology: Comparison of performances and assessment of applicability
2006
Abstract Ecological systems are governed by complex interactions which are mainly nonlinear. In order to capture the inherent complexity and nonlinearity of ecological, and in general biological, systems, empirical models have recently gained popularity. However, although these models, particularly connectionist approaches such as multilayered backpropagation networks, are commonly applied as predictive models in ecology to a wide variety of ecosystems and questions, there are no studies to date aiming to assess the performance, both in terms of data fitting and generalizability, and the applicability of empirical models in ecology. Our aim is hence to provide an overview of the nature of the wide range of data sets and predictive variables, from both aquatic and terrestrial ecosystems with different scales of time-dependent dynamics, and of the applicability and robustness of predictive modeling methods on such data sets, by comparing different empirical modeling approaches. The applications considered in this study range from predicting the occurrence of submerged plants in shallow lakes to predicting nest occurrence of bird species from environmental variables and satellite images. The methods considered include k-nearest neighbor (k-NN), linear and quadratic discriminant analysis (LDA and QDA), generalized linear models (GLM), feedforward multilayer backpropagation networks, and the pseudo-supervised network ARTMAP. Our results show that the predictive performance of a model on training data can be misleading, and that one should consider the predictive performance of a given model on an independent test set when assessing its predictive power.
Moreover, our results suggest that for ecosystems involving time-dependent dynamics and periodicities whose frequencies are possibly less than the time scale of the data considered, GLM and connectionist neural network models appear to be most suitable and robust, provided that a predictive variable reflecting these time-dependent dynamics is included in the model either implicitly or explicitly. For spatial data, on the other hand, which do not include any time dependence comparable to the time scale covered by the data, neighborhood-based methods such as k-NN and ARTMAP proved to be more robust than the other methods considered in this study. In addition, for predictive modeling purposes, a suitable, computationally inexpensive method should first be applied to the problem at hand; good predictive performance from such a method would render the computational cost and effort associated with more complex variants unnecessary.
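The point about training performance being misleading can be illustrated with a minimal sketch. It assumes synthetic presence/absence data with two made-up predictors and 30% label noise, and a plain 1-nearest-neighbor classifier written from scratch; none of the data, function names, or parameter values come from the study itself.

```python
import random


def knn_predict(train_X, train_y, x, k=1):
    """Predict the label of x by majority vote among its k nearest training points."""
    # Squared Euclidean distance from x to every training point, paired with its label.
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(p, x)), label)
        for p, label in zip(train_X, train_y)
    )
    votes = [label for _, label in dists[:k]]
    return max(set(votes), key=votes.count)


def accuracy(train_X, train_y, X, y, k=1):
    """Fraction of points in (X, y) that the k-NN model built on the training set labels correctly."""
    hits = sum(knn_predict(train_X, train_y, x, k) == t for x, t in zip(X, y))
    return hits / len(y)


def make_data(n, rng, noise=0.3):
    """Hypothetical presence/absence data: label is 1 when x0 + x1 > 1,
    then flipped with probability `noise` (an assumed noise level)."""
    X, y = [], []
    for _ in range(n):
        p = (rng.random(), rng.random())
        label = int(p[0] + p[1] > 1.0)
        if rng.random() < noise:
            label = 1 - label
        X.append(p)
        y.append(label)
    return X, y


rng = random.Random(0)  # fixed seed so the run is repeatable
train_X, train_y = make_data(200, rng)
test_X, test_y = make_data(200, rng)

# With k=1 every training point is its own nearest neighbor, so the
# training score is perfect regardless of how noisy the labels are.
train_acc = accuracy(train_X, train_y, train_X, train_y, k=1)
test_acc = accuracy(train_X, train_y, test_X, test_y, k=1)

print("training accuracy:", train_acc)
print("test accuracy:    ", test_acc)
```

In this sketch the 1-NN model memorizes the noisy training labels, so its training accuracy says nothing about how it will behave on the independently drawn test set, which is the only score that reflects predictive power.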