Building a concept hierarchy automatically and its measuring

2008 
Concept hierarchies are important for generalization in many data mining applications. Many algorithms have been proposed for the automatic construction of concept hierarchies. A typical application of such algorithms is constructing document directories in the information retrieval community. However, these results cannot be directly adopted for the automatic construction of concept hierarchies for objects that have identifiers only, such as items in a market basket database, where items have no attributes and only the similarities between items are available. Consequently, the metrics used for document directories are not suitable for hierarchies over identifier-only data. In this paper, we propose a measurement that considers the unevenness of similarities among objects in the child nodes, and we use this unevenness value to express the balance of a concept hierarchy. To construct a concept hierarchy, we propose hierarchical clustering with join/merge decision (HCJMD), a modification of hierarchical agglomerative clustering (HAC).
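For orientation, the baseline that HCJMD modifies can be sketched as plain average-link HAC over a precomputed similarity table, which is all that identifier-only data provides. This is a minimal illustration and not the paper's method: the join/merge decision and the unevenness measure are not specified in the abstract and are not reproduced here. The function name `hac_average_link`, the item names, and the similarity values are all hypothetical.

```python
from itertools import combinations


def hac_average_link(items, sim):
    """Plain average-link HAC (baseline only, not HCJMD).

    items: list of item identifiers (hypothetical market basket items).
    sim:   dict mapping frozenset({a, b}) to a similarity value.
    Returns the hierarchy as a nested tuple (a binary tree).
    """
    # Each cluster is a pair: (subtree, flat list of member identifiers).
    clusters = [(item, [item]) for item in items]

    def avg_sim(c1, c2):
        # Mean similarity over all cross-cluster item pairs.
        pairs = [(a, b) for a in c1[1] for b in c2[1]]
        return sum(sim[frozenset((a, b))] for a, b in pairs) / len(pairs)

    while len(clusters) > 1:
        # Pick the pair of clusters with the highest average similarity.
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda ij: avg_sim(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = (
            (clusters[i][0], clusters[j][0]),
            clusters[i][1] + clusters[j][1],
        )
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)

    return clusters[0][0]


# Hypothetical items with no attributes, only pairwise similarities.
items = ["milk", "bread", "beer", "chips"]
sim = {frozenset(p): 0.1 for p in combinations(items, 2)}
sim[frozenset(("milk", "bread"))] = 0.9
sim[frozenset(("beer", "chips"))] = 0.8

tree = hac_average_link(items, sim)
```

Plain HAC always yields a binary tree, whereas a concept hierarchy may need nodes with more than two children; the abstract indicates that HCJMD adds a join/merge decision to HAC but does not give the rule, so the sketch stops at the baseline.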