Performance Analysis of RDBMS and Hadoop Components with Their File Formats for the Development of Recommender Systems
2018
A recommender system is software that makes suggestions to users by predicting their preferences from their previous usage data, in as little time as possible. Current recommender systems are built with complex techniques such as collaborative filtering and content-based filtering, but a similar system can be built by running complex queries through different query tools. The performance of these query tools depends on factors such as data size, the file format of the dataset, and aggregate searches. In this paper, we compare four query tools (Hive, Impala, SparkSQL and MySQL) in order to design a fast and efficient recommender system. We analyze these tools by comparing the execution time of complex queries on data stored in different file formats: text, CSV, AVRO, PARQUET, RC and ORC. The results indicate that a fast recommender system can be built using a query tool such as Impala on a dataset saved in the AVRO file format.
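To illustrate what building a recommender from queries and comparing execution times might look like, the following is a minimal sketch and not the paper's actual benchmark. It assumes a hypothetical ratings(user_id, item_id, rating) table and uses Python's built-in sqlite3 module as a stand-in for the four tools named in the abstract; the table, the sample rows, the query, and the helper names are all illustrative assumptions.

```python
import sqlite3
import time


def build_demo_db():
    """Create an in-memory database with a small, made-up ratings table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ratings (user_id INTEGER, item_id INTEGER, rating REAL)"
    )
    rows = [
        (1, 10, 5.0), (1, 11, 4.0),
        (2, 10, 4.0), (2, 12, 5.0),
        (3, 10, 5.0), (3, 12, 4.0), (3, 13, 2.0),
    ]
    conn.executemany("INSERT INTO ratings VALUES (?, ?, ?)", rows)
    return conn


# Hypothetical recommendation query: find items rated by users who share at
# least one item with the target user, excluding items the target already has.
RECOMMEND_SQL = """
SELECT r.item_id,
       COUNT(DISTINCT r.user_id) AS supporters,
       AVG(r.rating)             AS avg_rating
FROM ratings AS r
WHERE r.user_id IN (
        SELECT DISTINCT peer.user_id
        FROM ratings AS mine
        JOIN ratings AS peer ON peer.item_id = mine.item_id
        WHERE mine.user_id = :uid AND peer.user_id <> :uid
      )
  AND r.item_id NOT IN (SELECT item_id FROM ratings WHERE user_id = :uid)
GROUP BY r.item_id
ORDER BY supporters DESC, avg_rating DESC
"""


def timed_query(conn, sql, params):
    """Run a query and return (rows, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    rows = conn.execute(sql, params).fetchall()
    elapsed = time.perf_counter() - start
    return rows, elapsed


if __name__ == "__main__":
    conn = build_demo_db()
    rows, elapsed = timed_query(conn, RECOMMEND_SQL, {"uid": 1})
    for item_id, supporters, avg_rating in rows:
        print(item_id, supporters, avg_rating)
    print(f"elapsed: {elapsed:.6f} s")
```

In a comparison like the one described, the same query would be run against each tool and each file format, with the elapsed times recorded per combination. Timings from this in-memory sketch say nothing about the relative performance of Hive, Impala, SparkSQL or MySQL.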