Methods for Investigating Localized Clustering of Disease

1998 
This excellent book describes and presents the results of an exercise in which five research groups, each with its own approach to the investigation of disease clusters, were presented with 50 simulated datasets with known degrees of underlying clustering. The research groups were asked to apply their approach to each of the datasets and report back their findings, which were then compared with the 'truth' (i.e. the degrees of clustering chosen when generating the simulated dataset).

The 50 datasets were generated using the population of children aged 0-14 years in the Yorkshire Health Region (UK) at the 1981 census as the population at risk. Each simulated dataset had 350 cases of disease occurring in this population. Ten of the simulated datasets were generated assuming no underlying clustering, while the remainder contained a mixture of clustered and random cases, with varying patterns and degrees of clustering (from lots of small clusters to a few large clusters).

The five investigative approaches put to the test in this exercise were: the ISD method, the Potthoff-Whittinghill method, the 'k nearest neighbour' method, GAM-K and the Newell-Besag method. These methods have varying investigative philosophies and approaches to the measurement of clustering. The ISD and Potthoff-Whittinghill methods both divide the space occupied by the population into areas (quadrats) and calculate the disease rate within each area. A global test is then applied to these area rates to look for overall evidence of clustering. The ISD approach goes no further than this and does not try to identify individual clusters. Nor does the Potthoff-Whittinghill method attempt to locate individual clusters, but it does try to distinguish between clustering which occurs over small areas and that occurring over large areas.

The remaining three methods are all based on calculating distances between cases. The 'k nearest neighbour' approach involves an initial global test for clustering. If this test is positive (i.e. there is evidence that cases are not randomly distributed), it is then possible to try to locate individual clusters. Neither GAM-K nor the Newell-Besag method applies a global test of clustering; both aim to identify local clusters.

What was the outcome of the exercise? No clear winner emerged. All five methods produced the occasional false positive result. All five methods were good at detecting clustering based on small numbers of clusters each with a relatively large number of cases. Perhaps unsurprisingly, they were less sensitive in detecting clustering when the number of clusters was relatively large and each cluster was small.

Why did I like this book? As an interested but inexpert reader, I like being provided with historical and philosophical perspectives. I also liked the fact that the presentation of the methods focused substantially on giving the reader an intuitive understanding of what each method was up to, rather than burying the reader under an avalanche of mathematics. At the end, I felt that even if I had not grasped the details of the computation, I had understood what each method was trying to do and, with the exception of the GAM-K method, how it went about doing it.

Any quibbles? Just one. The book provides very little information about the availability of computer software to perform these analyses. Are there commercial packages available that would enable me to analyse a dataset of my own? Are routines available from the research groups who took part in this exercise? This aside, I thoroughly recommend this book to anyone who is contemplating conducting studies of disease clustering or anyone who just likes to follow the literature on clustering and the surrounding philosophical debate.