On improving ROCK-based clustering for categorical data: student research abstract

Riccardo Cappuzzo

2018

In data mining, the analysis of categorical data (i.e., data that can take only a limited number of values, such as names) is particularly challenging because such data lack an implicit geometric structure. Clustering of categorical data is becoming increasingly important: non-numerical data are ubiquitous, and clustering can be used, for example, to optimize an anonymization process [1], to perform anomaly detection, or in any application that needs to recognize the intrinsic structure of the data automatically. Various algorithms have been proposed for clustering this kind of data (see [3]), among them ROCK (RObust Clustering using linKs) [4].
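To make the idea of "links" concrete, the following is a minimal sketch of how link counts between categorical records are commonly described for ROCK: two records are neighbors when their Jaccard similarity reaches a threshold, and the link count of a pair is the number of neighbors they share. The function names, the toy records, the threshold value `theta=0.5`, and the choice not to count a record as its own neighbor are all assumptions made for illustration; this is not the abstract's improved method, and it omits ROCK's merging criterion entirely.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two sets of categorical values."""
    a, b = set(a), set(b)
    union = a | b
    # Two empty records are treated as identical (an illustrative choice).
    return len(a & b) / len(union) if union else 1.0


def neighbors(records, theta):
    """Map each record index to the indices whose similarity is >= theta."""
    nbrs = {i: set() for i in range(len(records))}
    for i, j in combinations(range(len(records)), 2):
        if jaccard(records[i], records[j]) >= theta:
            nbrs[i].add(j)
            nbrs[j].add(i)
    return nbrs


def links(records, theta):
    """Count common neighbors for every pair of records."""
    nbrs = neighbors(records, theta)
    return {
        (i, j): len(nbrs[i] & nbrs[j])
        for i, j in combinations(range(len(records)), 2)
    }


# Hypothetical toy data: each record is a set of attribute values.
records = [
    {"red", "small", "round"},
    {"red", "small", "square"},
    {"red", "large", "round"},
    {"blue", "large", "flat"},
]

link_counts = links(records, theta=0.5)
for pair, count in sorted(link_counts.items()):
    print(pair, count)
```

In this toy data, records 1 and 2 are not neighbors of each other, yet they share record 0 as a common neighbor, so their link count is nonzero. This is the kind of indirect evidence that a link-based measure adds when no geometric distance is available.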

Keywords:

Categorical variable
Anomaly detection
Semantic similarity
Cluster analysis
Artificial intelligence
Pattern recognition
Computer science
Student research
Data mining
