Effectiveness of Latent Dirichlet Allocation Model for Semantic Information Retrieval on Malay Document
2018
Current research usually adopts the standard Vector Space Model (VSM) process for searching and retrieving information from Malay documents. However, this technique is less effective for semantic information retrieval from a collection. The system retrieves only documents that contain the user's query terms and ignores the semantic relationships among those terms. As a result, some documents with a similar context are missed, while documents from unrelated contexts that merely share a single term are retrieved. To address this problem, the Latent Dirichlet Allocation (LDA) model is applied to semantic information retrieval on Malay documents. An experiment was conducted using 6 text queries and 50 Hadith documents translated into the Malay language, drawn from the Shahih Bukhari collections. The experimental results showed that the LDA model gives promising results in retrieving semantic information from Malay-translated Hadith documents compared with existing techniques. Some limitations of this study can be explored in future work to improve the effectiveness of the retrieval results. Overall, LDA is an effective method for semantic information retrieval on Malay documents and can therefore help people search for and retrieve semantic information from them more easily.
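The term-matching limitation of VSM described in the abstract can be illustrated with a minimal sketch. The three toy documents, the query, and the helper names (`tfidf_vectors`, `cosine`, `rank`) are assumptions made for illustration only; they are not the study's data, queries, or implementation. "solat" and "sembahyang" are both Malay words for prayer, so the first two documents share a topic but not the query term.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (a dict) per tokenized document."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}
    vectors = [{t: c * idf[t] for t, c in Counter(doc).items()} for doc in docs]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query, docs):
    """Score every document against the query in the shared term space."""
    vectors, idf = tfidf_vectors(docs)
    q = {t: c * idf[t] for t, c in Counter(query).items() if t in idf}
    return [cosine(q, v) for v in vectors]


# Hypothetical toy collection, not taken from the paper's Hadith documents.
docs = [
    "waktu solat subuh".split(),       # contains the query term "solat"
    "waktu sembahyang subuh".split(),  # same topic, but uses the synonym
    "puasa bulan ramadan".split(),     # unrelated topic (fasting)
]
scores = rank(["solat"], docs)
for doc, score in zip(docs, scores):
    print(" ".join(doc), round(score, 3))
```

In this sketch the second document scores exactly zero even though it is about the same subject, because it shares no term with the query. A topic model such as LDA instead compares documents and queries through their inferred topic distributions, which is the motivation the abstract gives for applying it.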