It is important to reiterate a point from the first chapter. This book is NOT about practical skills honed by repetition – such as plumbing, bridge building or surgery – nor is it about judgements related to such skill-based tasks. Rather, it deals with technical estimates and predictions of unknown quantities or future events made by people whose standing is defined by their qualifications, experience and status among their peers.
Background: The events of 9/11 and the October 2002 National Intelligence Estimate on Iraq's Continuing Programs for Weapons of Mass Destruction precipitated fundamental changes within the United States Intelligence Community. As part of the reform, analytic tradecraft standards were revised and codified into a policy document - Intelligence Community Directive (ICD) 203 - and an analytic ombudsman was appointed in the newly created Office of the Director of National Intelligence to ensure compliance across the intelligence community. In this paper we investigate the untested assumption that the ICD 203 criteria can facilitate reliable evaluations of analytic products. Methods: Fifteen independent raters used a rubric based on the ICD 203 criteria to assess the quality of reasoning of 64 analytic reports generated in response to hypothetical intelligence problems. We calculated intra-class correlation coefficients for single and group-aggregated assessments. Results: Despite general training and rater calibration, the reliability of individual assessments was poor. However, aggregate ratings showed good to excellent reliability. Conclusion: Given that real problems will be more difficult and complex than our hypothetical case studies, we advise that groups of at least three raters are required for reliable quality control of intelligence products. Our study sets limits on assessment reliability and provides a basis for further evaluation of the predictive validity of intelligence reports generated in compliance with the tradecraft standards.
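The reliability gain from aggregating raters can be sketched with the Spearman–Brown prophecy formula, which predicts the reliability of the mean of k raters from a single-rater intra-class correlation. The following is a minimal illustration; the single-rater value of 0.40 is a hypothetical stand-in, not a figure from the study.

```python
def spearman_brown(single_rater_icc: float, k: int) -> float:
    """Predicted reliability of the mean of k raters (Spearman-Brown prophecy)."""
    return k * single_rater_icc / (1 + (k - 1) * single_rater_icc)

# Hypothetical single-rater reliability, chosen only for illustration.
r1 = 0.40
for k in (1, 3, 6, 15):
    print(f"{k} rater(s): predicted reliability {spearman_brown(r1, k):.2f}")
```

Under this illustrative value, a panel of three raters already lifts a poor single-rater reliability of 0.40 to 0.67, which is consistent in spirit with the advice that small groups of raters suffice.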
Abstract We address criticism that the Transport, Establishment, Abundance, Spread, Impact (TEASI) framework neither facilitates objective mapping of risk assessment methods nor defines best practice. We explain why TEASI is appropriate for mapping, despite inherent challenges, and how TEASI offers considerations for best practices rather than suggesting a single best practice.
Abstract Minimum convex polygons (convex hulls) are an internationally accepted, standard method for estimating species' ranges, particularly in circumstances in which presence‐only data are the only kind of spatially explicit data available. One of their main strengths is their simplicity. They are used to make area statements and to assess trends in occupied habitat, and are an important part of the assessment of the conservation status of species. We show by simulation that these estimates are biased. The bias increases with sample size, and is affected by the underlying shape of the species habitat, the magnitude of errors in locations, and the spatial and temporal distribution of sampling effort. The errors affect both area statements and estimates of trends. Some of these errors may be reduced through the application of α‐hulls, which are generalizations of convex hulls, but they cannot be eliminated entirely. α‐hulls provide an explicit means for excluding discontinuities within a species range. We examined the strengths and weaknesses of alternatives, including kernel estimators. Convex hulls exhibit larger bias than α‐hulls when used to quantify habitat extent and to detect changes in range, and when subject to differences in the spatial and temporal distribution of sampling effort and spatial accuracy. α‐hulls should be preferred for estimating the extent of and trends in species' ranges.
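The sample-size dependence of the hull bias can be reproduced with a toy simulation of our own (not the paper's code). Points are drawn from an L-shaped habitat of area 0.75; because the habitat is non-convex, the convex hull overestimates occupied habitat, and the overestimate grows with sample size as the hull converges on the convex closure. The habitat shape and all parameters below are illustrative assumptions.

```python
import random

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def polygon_area(poly):
    """Shoelace formula."""
    n = len(poly)
    s = sum(poly[i][0] * poly[(i + 1) % n][1] - poly[(i + 1) % n][0] * poly[i][1]
            for i in range(n))
    return abs(s) / 2

def sample_habitat():
    """Uniform point in an L-shaped habitat: the unit square minus its
    top-right quadrant (true occupied area = 0.75)."""
    while True:
        x, y = random.random(), random.random()
        if not (x > 0.5 and y > 0.5):
            return (x, y)

def mean_hull_area(n_points, trials=200):
    total = 0.0
    for _ in range(trials):
        pts = [sample_habitat() for _ in range(n_points)]
        total += polygon_area(convex_hull(pts))
    return total / trials

random.seed(42)
# The hull converges on the convex closure of the habitat (area 0.875),
# overshooting the true occupied area of 0.75 as sample size grows.
print(mean_hull_area(20), mean_hull_area(200))
```

With 20 records the hull still underestimates the convex closure, but by 200 records its mean area already exceeds the true occupied area, illustrating why the bias worsens rather than improves with more data when the range is non-convex.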
Abstract We examined the trade-off between the cost of response redundancy and the gain in output quality on the popular crowdsourcing platform Mechanical Turk, as a partial replication of Kosinski et al. (2012) who demonstrated a significant improvement in performance by aggregating multiple responses through majority vote. We submitted single items from a validated intelligence test as Human Intelligence Tasks (HITs) and aggregated the responses from “virtual groups” consisting of 1 to 24 workers. While the original study relied on resampling from a relatively small number of responses across a range of experimental conditions, we randomly and independently sampled from a large number of HITs, focusing only on the main effect of group size. We found that – on average – a group of six MTurkers has a collective IQ one standard deviation above the mean for the general population, thus demonstrating a “wisdom of the crowd” effect. The relationship between group size and collective IQ was characterised by diminishing returns, suggesting moderately sized groups provide the best return on investment. We also analysed performance of a smaller subset of workers who had each completed all 60 test items, allowing for a direct comparison between a group’s collective IQ and the individual IQ of its members. This demonstrated that randomly selected groups collectively equalled the performance of the best-performing individual within the group. Our findings support the idea that substantial intellectual capacity can be gained through crowdsourcing, contingent on moderate redundancy built into the task request.
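The majority-vote aggregation can be sketched with a small simulation of our own; the worker model and all numbers are illustrative assumptions, not the study's data. Each simulated worker answers a multiple-choice item correctly with a fixed probability and otherwise guesses uniformly among the wrong options; the group's answer is the most common vote.

```python
import random
from collections import Counter

random.seed(0)

def simulate_group_accuracy(p_correct, n_options, group_size, n_items=2000):
    """Fraction of items a majority-vote group answers correctly under a
    toy worker model (ties go to the first-encountered option)."""
    correct = 0
    for _ in range(n_items):
        votes = []
        for _ in range(group_size):
            if random.random() < p_correct:
                votes.append(0)  # 0 stands for the correct option
            else:
                votes.append(random.randint(1, n_options - 1))
        winner, _ = Counter(votes).most_common(1)[0]
        if winner == 0:
            correct += 1
    return correct / n_items

for k in (1, 3, 6):
    print(f"group of {k}: accuracy {simulate_group_accuracy(0.5, 4, k):.2f}")
```

Varying `group_size` shows how quickly a majority-vote crowd overtakes an average individual under this toy model, mirroring the "wisdom of the crowd" effect the study reports.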
Summary Victoria's Department of Conservation and Natural Resources is a natural resource agency charged with managing potentially conflicting forest uses; one tool it employs for multiple-use planning is the linear programming model FORPLAN. This paper reviews the value of FORPLAN for integrated forest planning and wildlife conservation. As currently implemented in Victoria, FORPLAN has serious limitations when used as the sole planning tool for wildlife management: there are practical limits on the construction of sufficiently complex models, it cannot model stochastic processes, the minimum spatial resolution of available data is too coarse, and it does not directly use spatial information. Over-confidence in the projections and expectations of models used in planning must be avoided. An array of tools is used for forest planning, including FORPLAN, modelling environments of other kinds, iterative research and public participation. We implemented a model in FORPLAN that uses numbers of hollow-bearing trees, which are potential nest sites, rather than yields of different species. Although the model's assumptions are unrealistic, the exercise was useful because it highlighted a scarcity of data in the classification of old forests used for management planning, and in the resource information for mixed-species forests. Wildlife biologists and planners must be involved in, and understand, the construction of FORPLAN models. The most important uses of FORPLAN for wildlife planning are to aid the understanding of forest planning problems, to identify which pieces of critical information are not known, and to provide a forum for the statement of assumptions. The ultimate success of FORPLAN will depend on the way it is used to address forest issues.
Abstract Spatial analysis of vegetation and soil data collected from 64 sites in southern Western Australia suggests that both soil characteristics and geographic distance between sites are important predictors of the floristic resemblance of sites on a regional scale. These two factors are largely independent, a finding that may reflect recent, rapid speciation in the study area postulated in other studies. Spatial patterns of plant guilds suggest geographic replacement, and contingent exclusion may be an important mechanism maintaining species‐richness. Existing soil and vegetation maps used to delineate reserve boundaries are found to be accurate, although the soil maps include information on vegetation patterns, independent of information on soil patterns. Within broad vegetation formations, there is some correlation of floristics in mallee stands with soil characteristics. Ordination analysis indicates a soil moisture/nutrient axis. In contrast, there are few important correlations of floristics with soil characters in kwongan (sand heath) or halophytic vegetation. Geographic distance between sites is a much more important factor. The absence of edaphic correlations implies that the observed geographic replacement of species between sites is a historical legacy, the result of recent, rapid speciation in the spatially patchy environment. It is concluded that if reserves in the region are to conserve the flora, especially rare species, the reserve system should include replicates of stands within the same broad formations and soil types at intervals less than 15 km, the minimum scale of resolution of this study.
Abstract: Many different systems are used to assess levels of threat faced by species. Prominent ones are those used by the World Conservation Union, NatureServe, and the Florida Game and Freshwater Fish Commission (now the Florida Fish and Wildlife Conservation Commission). These systems assign taxa a threat ranking by assessing their demographic and ecological characteristics. These threat rankings support the legislative protection of species and guide the placement of conservation programs in order of priority. It is not known, however, whether these assessment systems rank species in a similar order. To resolve this issue, we assessed 55 mainly vertebrate taxa with widely differing life histories under each of these systems and determined the rank correlations among them. Moderate, significant positive correlations were seen among the threat rankings provided by the three systems (correlations 0.58–0.69). Further, the threat rankings for taxa obtained using these systems were significantly correlated to their rankings based on predicted probability of extinction within 100 years as determined by population viability analysis (correlations 0.28–0.37). The different categorization systems, then, yield related but not identical threat rankings, and these rankings are associated with predicted extinction risk.
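Rank correlations of the kind reported here can be computed with Spearman's coefficient, which is Pearson correlation applied to ranks. The following is a minimal pure-Python sketch of our own, using average ranks for ties.

```python
def rank(values):
    """Ranks starting at 1, with ties assigned their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = rank(x), rank(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den

# Two hypothetical threat rankings of the same five taxa.
print(spearman([1, 2, 3, 4, 5], [2, 1, 3, 5, 4]))
```

Identical orderings yield 1.0 and exactly reversed orderings yield -1.0, so moderate values such as the 0.58–0.69 reported above indicate related but not identical rankings.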