Statistically detecting clustering for rare events

2011 
Cluster detection is essential for recognizing design flaws and cryptic common-mode or common-cause dependencies among events such as component failures, but when such events are rare, the uncertainty inherent in sparse data sets makes statistical analysis challenging. Traditional statistical tests for detecting clustering assume asymptotically large sample sizes and are therefore not applicable when data are sparse, as they generally are for rare events. We describe several new statistical tests that can be used to detect clustering of rare events in ordered cells. The new tests employ exact methods based on combinatorial formulations, so they yield exact p-values and cannot exceed their nominal Type I error rates, as the traditional tests can. As a result, the new tests are reliable whatever the size of the data set, and are especially useful when data sets are extremely small. We characterize the relative statistical power of the new tests under different kinds of clustering mechanisms and data set configurations.
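
To illustrate what an exact, combinatorial p-value can look like for events in ordered cells, the following is a minimal sketch and not one of the tests proposed in the paper. It assumes a null hypothesis in which each event falls independently and uniformly into one of k cells, and it uses a hypothetical statistic: the largest number of events found in any window of w adjacent cells. All function names here are illustrative. The p-value is obtained by enumerating every possible allocation of the events, so no large-sample approximation is involved.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial


def compositions(n, k):
    """Yield every way to place n indistinguishable events into k ordered cells."""
    # Stars and bars: choose k-1 bar positions among n+k-1 slots.
    for bars in combinations(range(n + k - 1), k - 1):
        counts, prev = [], -1
        for b in bars:
            counts.append(b - prev - 1)
            prev = b
        counts.append(n + k - 2 - prev)
        yield tuple(counts)


def null_probability(counts):
    """Exact multinomial probability of a count vector under uniform cells (assumed null)."""
    n, k = sum(counts), len(counts)
    coef = factorial(n)
    for c in counts:
        coef //= factorial(c)
    return Fraction(coef, k ** n)


def max_window(counts, w):
    """Largest total count over any w adjacent cells (illustrative statistic)."""
    return max(sum(counts[i:i + w]) for i in range(len(counts) - w + 1))


def exact_pvalue(observed, w=1):
    """P(statistic >= observed statistic) by full enumeration of the null distribution."""
    n, k = sum(observed), len(observed)
    t_obs = max_window(observed, w)
    return sum(
        null_probability(c)
        for c in compositions(n, k)
        if max_window(c, w) >= t_obs
    )


if __name__ == "__main__":
    # Three events, all in one of four cells.
    print(exact_pvalue((0, 3, 0, 0)))
    # Same number of events, spread over two adjacent cells, window width 2.
    print(exact_pvalue((0, 2, 1, 0), w=2))
```

Because the number of allocations grows combinatorially with the number of events and cells, full enumeration of this kind is only practical for small data sets, which is the regime the abstract is concerned with.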