Quasi-Bernoulli Stick-breaking: Infinite Mixture with Cluster Consistency.

2020 
In mixture modeling and clustering applications, the number of components is often unknown. The stick-breaking model is an appealing construction that assumes infinitely many components while shrinking most of the redundant weights to near zero. However, it has been found that this shrinkage is unsatisfactory: even when the component distribution is correctly specified, small, spurious weights appear and give an inconsistent estimate of the number of clusters. In this article, we propose a simple solution that directly controls the redundant weights: when breaking each stick, the remaining proportion is further multiplied by a quasi-Bernoulli random variable, supported at one and at a positive constant close to zero. This effectively shrinks the redundant weights much closer to zero, leading in theory to a consistent estimate of the number of clusters; at the same time, it avoids the singularity at zero weight, maintaining support in the infinite-dimensional space and allowing efficient posterior computation. Compared to existing infinite mixture models, our model demonstrates superior performance in simulations and a data application, showing a substantial reduction in the number of clusters.
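The construction described above can be illustrated with a minimal sketch of a truncated draw of the weights. It follows one reading of the abstract and is not taken from the article itself: the function name, the Beta(alpha, 1) law for the remaining fraction (the usual Dirichlet process choice), the mixing probability `p`, the small constant `eps`, and the truncation at `num_sticks` are all assumptions made for illustration.

```python
import random


def quasi_bernoulli_stick_breaking(num_sticks, alpha=1.0, p=0.9, eps=1e-3, seed=None):
    """Draw a truncated set of mixture weights (illustrative sketch only).

    Assumptions, not taken from the article: the remaining fraction at each
    break is Beta(alpha, 1), and the quasi-Bernoulli multiplier equals 1 with
    probability p and eps otherwise.
    """
    rng = random.Random(seed)
    weights = []
    stick_left = 1.0  # length of stick not yet assigned to any component
    for _ in range(num_sticks):
        # Usual stick-breaking: fraction of the current stick that remains.
        remaining_fraction = rng.betavariate(alpha, 1.0)
        # Quasi-Bernoulli variable, supported at 1 and at eps (close to zero).
        b = 1.0 if rng.random() < p else eps
        # The remaining proportion is further multiplied by b.
        remaining = b * remaining_fraction
        weights.append(stick_left * (1.0 - remaining))
        stick_left *= remaining
    # stick_left is the mass left for all components beyond the truncation.
    return weights, stick_left


if __name__ == "__main__":
    w, tail = quasi_bernoulli_stick_breaking(10, seed=1)
    print([round(x, 6) for x in w], tail)
```

Under these assumptions, once the multiplier takes the value `eps`, the stick left over shrinks by at least that factor, so every later weight is correspondingly small but still strictly positive. Setting `p=1.0` recovers ordinary stick-breaking.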