Solving the Sample Size Problem for Resource Selection Analysis

Garrett M. Street, J. R. Potts, Luca Börger, James C. Beasley, S. Demarais, John M. Fryxell, Philip D. McLoughlin, K. Monteith, C. M. Prokopenko, Milton Cezar Ribeiro, Arthur R. Rodgers, B. Strickland, F. M. van Beest, D. Bernasconi, Larissa T. Beumer, Guha Dharmarajan, S. Dwinnell, D. Keiter, Alexine Keuroghlian, L. Newediuk, J. Oshima, Olin E. Rhodes, Peter E. Schlichting, Niels Martin Schmidt, E. Vander Wal

2021

Sample size sufficiency is a critical consideration for conducting Resource-Selection Analyses (RSAs) from GPS-based animal telemetry. Cited thresholds for sufficiency include a number of captured animals M ≥ 30 and as many relocations per animal N as possible. These thresholds render many RSA-based studies misleading if large sample sizes were truly insufficient, or unpublishable if small sample sizes were sufficient but failed to meet reviewer expectations.

We provide the first comprehensive solution for RSA sample size by deriving closed-form mathematical expressions for the number of animals M and the number of relocations per animal N required to estimate model outputs to a given degree of precision. The sample sizes needed depend on just two biologically meaningful quantities: habitat selection strength and a novel measure of landscape complexity, which we define rigorously. The mathematical expressions are calculable for any environmental dataset at any spatial scale and are applicable to any study involving resource selection (including sessile organisms). We validate our analytical solutions using globally relevant empirical data including 5,678,623 GPS locations from 511 animals from 10 species (omnivores, carnivores, and herbivores living in boreal, temperate, and tropical forests, montane woodlands, swamps, and arctic tundra).

Our analytic expressions show that the required M and N must decline with increasing selection strength and increasing landscape complexity, and this decline is insensitive to the definition of availability used in the analysis. Our results contradict conventional wisdom by demonstrating that the most biologically relevant effects on the utilization distribution (i.e. those landscape conditions with the greatest absolute magnitude of resource selection) can often be estimated with far fewer data than is commonly assumed.

We identify several critical steps in implementing these equations, including (i) a priori selection of expected model coefficients, and (ii) sampling intensity for background (absence/pseudo-absence) data within a given definition of availability. We show that random sampling of background data violates the underlying mathematics of RSA, leading to incorrect values for necessary M and N and potentially incorrect RSA model outputs. We argue that these equations should be a mandatory component for all future RSA studies.
