Utilization of Weight Allocation in Tf-Idf Environment for Noise Detection Enhancement
2016
Social media data mining has gained significance in recent years, owing to the importance of uncovering hidden patterns in social media data that can inform digital marketing strategies. Such patterns help marketers segment customers according to their demographic and behavioral characteristics, making it easier to target each segment with advertising messages suited to its cluster. It is therefore paramount to discover and eliminate data that does not influence customer buying trends, a step known as noise removal. Weight allocation is crucial here because it identifies the keywords in social media data that can support the clustering process. In this paper, weight allocation was applied in a term frequency-inverse document frequency (TF-IDF) environment to recognize noisy data and remove it before the social media data is subjected to further analysis. In this approach, a word that appears frequently in a given document but rarely in the whole document collection was given a higher weight than a word that appears in virtually all documents.
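The weighting idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes the common textbook formulation (term frequency normalized by document length, multiplied by the natural log of the inverse document frequency), and the function names `tfidf_weights` and `remove_noise`, the sample posts, and the threshold value are all hypothetical.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed formulation: tf = count / document length,
    idf = ln(number of documents / documents containing the term).
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def remove_noise(docs, threshold):
    """Drop terms whose weight does not exceed a (hypothetical) threshold."""
    weights = tfidf_weights(docs)
    return [
        [term for term in doc if w[term] > threshold]
        for doc, w in zip(docs, weights)
    ]


# Hypothetical social media posts, already lowercased and tokenized.
posts = [
    "the new phone has the best camera".split(),
    "the camera on the phone is great".split(),
    "the pizza place has the best crust".split(),
]

weights = tfidf_weights(posts)
cleaned = remove_noise(posts, threshold=0.0)
```

In this sketch a word such as "the", which occurs in every post, receives a weight of zero and is filtered out as noise, while a word confined to a single post, such as "pizza", keeps a positive weight. How the threshold should be chosen for real data is left open here.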