An enhanced relevance criterion for more concise supervised pattern discovery
2012
Supervised local pattern discovery aims to find subsets of a database with a high statistical unusualness in the distribution of a target attribute. Local pattern discovery is often used to generate a human-understandable representation of the most interesting dependencies in a data set. Hence, the more crisp and concise the output is, the better. Unfortunately, standard algorithm often produce very large and redundant outputs. In this paper, we introduce delta-relevance, a definition of a more strict criterion of relevance. It will allow us to significantly reduce the output space, while being able to guarantee that every local pattern has a delta-relevant representative which is almost as good in a clearly defined sense. We show empirically that delta-relevance leads to a considerable reduction of the amount of returned patterns. We also demonstrate that in a top- k setting, the removal of not delta-relevant patterns improves the quality of the result set.
Keywords:
- Correction
- Source
- Cite
- Save
- Machine Reading By IdeaReader
25
References
12
Citations
NaN
KQI