DPP-VSE: Constructing a Variable Selection Ensemble by Determinantal Point Processes

2021 
Abstract As an effective tool for analyzing high-dimensional data, variable selection plays an increasingly important role in many fields. In recent years, variable selection ensembles (VSEs) have attracted considerable interest from researchers because of their potential to improve selection accuracy and to stabilize the results of traditional selection methods. Inspired by a common practice in Bayesian methods, we propose a novel technique named DPP-VSE, which builds a VSE by using determinantal point processes (DPPs) to infer a distribution over model size. Because model sizes are sampled from this distribution, the number of variables that each base learner selects is determined automatically. In contrast to other VSE strategies, DPP-VSE has fewer parameters for users to specify. Experiments on both synthetic and real data show that DPP-VSE performs best under most circumstances across several evaluation metrics. Hence, DPP-VSE can be regarded as an effective and easy-to-use method for solving variable selection problems.
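
To make the idea of a DPP-induced distribution over model size concrete, the following is a minimal sketch and not the paper's actual construction. It relies on a standard property of L-ensemble DPPs: the size of a sampled subset is distributed as a sum of independent Bernoulli variables with success probabilities λ_i / (1 + λ_i), where λ_i are the eigenvalues of the kernel L. The Gram-matrix kernel and the function names `dpp_size_distribution` and `sample_model_sizes` are assumptions made for illustration; the abstract does not say how DPP-VSE builds its kernel.

```python
import numpy as np


def dpp_size_distribution(L):
    """Return the PMF of the subset size |Y| for an L-ensemble DPP.

    L is a symmetric positive semidefinite kernel over p candidate variables.
    Entry k of the result is P(|Y| = k), for k = 0..p.
    """
    eigvals = np.linalg.eigvalsh(L)
    eigvals = np.clip(eigvals, 0.0, None)  # guard against tiny negative round-off
    probs = eigvals / (1.0 + eigvals)      # one Bernoulli probability per eigenvalue
    pmf = np.array([1.0])
    for p in probs:
        # Convolving with [1 - p, p] adds one independent Bernoulli(p) term.
        pmf = np.convolve(pmf, [1.0 - p, p])
    return pmf


def sample_model_sizes(L, n_samples, seed=None):
    """Draw model sizes, one per base learner, from the DPP size distribution."""
    rng = np.random.default_rng(seed)
    pmf = dpp_size_distribution(L)
    pmf = pmf / pmf.sum()                  # renormalize against floating-point drift
    return rng.choice(len(pmf), size=n_samples, p=pmf)


# Hypothetical kernel: a Gram matrix of 8 standardized candidate variables.
data_rng = np.random.default_rng(0)
X = data_rng.standard_normal((50, 8))
L_toy = X.T @ X / X.shape[0]

size_pmf = dpp_size_distribution(L_toy)
sizes = sample_model_sizes(L_toy, n_samples=20, seed=1)
```

Each entry of `sizes` could then serve as the number of variables a base learner is asked to select, so no model size needs to be fixed by the user in advance.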