Penalized Latent Class Model for Clustering with Application to Variable Selection

2017 
The latent class model is becoming a popular clustering method for categorical variables. Modern databases, however, are characterized by a large number of available variables, of which only a subset may be relevant for the clustering; in this setting the identifiability conditions of these models make it impossible to fit a model with a large number of classes. The clustering should therefore be based on the relevant variables only. Eliminating insignificant variables improves the clustering results and, because the resulting classes are interpreted through the meaning of the selected variables, also makes the classes easier to interpret; selecting the relevant variables is therefore essential for clustering. This article presents a penalized approach to the latent class model that selects the relevant variables, for which we propose a modified EM algorithm and a modified BIC to select the hyperparameter and the number of classes.
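For orientation, the baseline that a penalized approach of this kind builds on can be sketched as a plain EM fit of a latent class model, scored with the ordinary BIC. This is a minimal illustration only: it assumes binary (0/1) variables, and it is the standard unpenalized procedure, not the modified EM algorithm or modified BIC proposed in the article, whose penalty is not specified in this abstract. The names `fit_lcm` and `bic` are hypothetical.

```python
import math
import random


def fit_lcm(data, n_classes, n_iter=200, seed=0, eps=1e-6):
    """Standard (unpenalized) EM for a latent class model on 0/1 variables.

    Returns (weights, probs, loglik): class weights, the per-class
    probabilities P(x_j = 1 | class k), and the log-likelihood computed
    in the last E-step (i.e. before the final M-step update).
    """
    rng = random.Random(seed)
    n, d = len(data), len(data[0])
    weights = [1.0 / n_classes] * n_classes
    probs = [[rng.uniform(0.25, 0.75) for _ in range(d)]
             for _ in range(n_classes)]
    loglik = float("-inf")
    for _ in range(n_iter):
        # E-step: posterior class probabilities for each observation,
        # computed in log space for numerical stability.
        resp = []
        loglik = 0.0
        for x in data:
            logp = []
            for k in range(n_classes):
                lp = math.log(weights[k])
                for j in range(d):
                    p = probs[k][j]
                    lp += math.log(p if x[j] == 1 else 1.0 - p)
                logp.append(lp)
            m = max(logp)
            s = sum(math.exp(v - m) for v in logp)
            loglik += m + math.log(s)
            resp.append([math.exp(v - m) / s for v in logp])
        # M-step: weighted frequencies, clipped away from 0 and 1.
        for k in range(n_classes):
            nk = sum(r[k] for r in resp)
            weights[k] = max(nk / n, eps)
            for j in range(d):
                num = sum(r[k] * x[j] for r, x in zip(resp, data))
                probs[k][j] = min(max(num / max(nk, eps), eps), 1.0 - eps)
        total = sum(weights)
        weights = [w / total for w in weights]
    return weights, probs, loglik


def bic(loglik, n, d, n_classes):
    """Ordinary BIC (lower is better) for a binary latent class model."""
    n_params = (n_classes - 1) + n_classes * d
    return -2.0 * loglik + n_params * math.log(n)
```

In use, one would fit the model for several candidate numbers of classes and keep the one with the lowest `bic` value. A variable whose estimated probabilities are nearly identical across classes carries little information for the clustering, which is the kind of variable a selection procedure aims to discard.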