Anytime Subgroup Discovery in High Dimensional Numerical Data
2021
Subgroup discovery (SD) enables one to elicit patterns that strongly discriminate a class label. When it comes to numerical data, most of the existing SD approaches perform data discretizations and thus suffer from information loss. A few algorithms avoid such a loss by considering the search space of every interval pattern built on the dataset numerical values and provide an “anytime” property: at any moment, they are able to provide a result that improves over time. Given a sufficient time/memory budget, they may eventually complete an exhaustive search. However, such approaches are often intractable when dealing with high-dimensional numerical data, for instance, when extracting features from real-life multivariate time series. To overcome such limitations, we propose MonteCloPi, an approach based on a bottom-up exploration of numerical patterns with a Monte Carlo Tree Search. It enables to have a better exploration-exploitation trade-off between exploration and exploitation when sampling huge search spaces. Our extensive set of experiments proves the efficiency of MonteCloPi on high-dimensional data with hundreds of attributes. We finally discuss the actionability of discovered subgroups when looking for skill analysis from Rocket League action logs.
Keywords:
- Correction
- Source
- Cite
- Save
- Machine Reading By IdeaReader
27
References
0
Citations
NaN
KQI