The objective of this work is to propose a text-mining-based approach that supports Human Resources Management (HRM) in detecting subjectivity in staff performance appraisals. The approach detects three domain-driven clues of subjectivity in reviews, where each clue represents a level of subjectivity. Considerable effort has been directed at detecting subjectivity in opinion reviews. However, to the best of our knowledge, no previous work detects subjectivity in staff appraisals. To validate our approach, we applied it to the teachers' appraisals of the Palestinian government. Our experiments indicate that the approach is effective under the evaluation measures we used: expert opinion, precision, recall, accuracy, and F-measure. At the first level, we reached an F-measure of 88%; at the second level, we relied on the opinion of expert staff, who determined the percentage of duplication to be 85%; and at the third level, we achieved a best average F-measure of 84%.
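For readers unfamiliar with the measures named above, the following is a minimal sketch of how precision, recall, and F-measure are computed from gold and predicted labels. The function name, the label strings, and the toy data are illustrative assumptions and are not taken from the study.

```python
def precision_recall_f1(true_labels, predicted_labels, positive="subjective"):
    """Compute precision, recall and F-measure for one positive class.

    The label name "subjective" is a hypothetical placeholder.
    """
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure is the harmonic mean of precision and recall.
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example with made-up labels (not data from the appraisals).
gold = ["subjective", "subjective", "objective", "objective", "subjective"]
pred = ["subjective", "objective", "objective", "subjective", "subjective"]
p, r, f = precision_recall_f1(gold, pred)
```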
Usability is critical for any system, and in software it is one of the most important features. In fact, one of the main reasons for software failure is that the system fails to achieve the users' specified goals and satisfaction. For this reason, usability evaluation is becoming an important part of software development. Software usability evaluation can be costly in terms of time and human resources. Therefore, automation is a promising way to augment existing approaches, especially when the evaluation is subjective, that is, when usability centers on users' opinions. This paper proposes using opinion mining as an automatic technique to evaluate subjective usability. Opinion mining is a research subtopic of data mining that aims to automatically extract useful opinion knowledge from subjective texts. We propose a novel model that extracts knowledge from opinions to improve subjective software usability. This is the first time opinion mining has been used in software usability. To evaluate the proposed model, we designed and conducted a set of experiments and obtained an average accuracy of 85.41%. We also propose using graphics to visualize users' opinions of software and to compare the usability of two software products.
Little quality control exists for the rapidly increasing amount of information available on the Internet, especially for user-generated content. Manually scanning through large amounts of user-generated content is time-consuming and sometimes impossible. In this case, opinion mining is a better alternative. Although it is recognized that opinion reviews contain valuable information for a variety of applications, the lack of quality control attracts spammers, who have found many ways to benefit from spamming. Moreover, the spam detection problem is complex because spammers constantly invent new methods that cannot be easily recognized. Therefore, there is a need for a new approach to identifying spam in opinion reviews. Such approaches exist for English; one is needed for Arabic in order to identify Arabic spam reviews. To the best of our knowledge, there is still no published study on detecting spam in Arabic reviews. In this research, we propose a new approach to spam detection in Arabic opinion reviews that merges methods from data mining and text mining into a single classification approach. Our work builds on state-of-the-art achievements in Latin-based spam detection techniques while keeping in mind the specific nature of the Arabic language. In addition, we overcome the drawbacks of the class imbalance problem by using sampling techniques. The experimental results show that the proposed approach is effective in identifying Arabic spam opinion reviews. Our machine learning design achieves significant improvements: in the best case, the F-measure improves to 99.59%.
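The abstract above does not state which sampling techniques were used against class imbalance. As one common option, the sketch below shows random oversampling, which duplicates minority-class examples until every class is as large as the largest one. The function name, the class labels, and the toy data are assumptions made for illustration only.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Balance classes by resampling (with replacement) from smaller classes.

    This is a generic technique, not necessarily the one used in the study.
    """
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)

    # Every class is grown to the size of the largest class.
    target = max(len(items) for items in by_class.values())

    out_samples, out_labels = [], []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for sample in items + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Toy data: five non-spam reviews and two spam reviews (hypothetical).
reviews = ["r1", "r2", "r3", "r4", "r5", "s1", "s2"]
classes = ["ham"] * 5 + ["spam"] * 2
balanced_reviews, balanced_classes = random_oversample(reviews, classes)
class_counts = Counter(balanced_classes)
```

Oversampling should be applied to the training portion only, so that duplicated examples never leak into the data used for evaluation.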
Healthcare systems generate huge amounts of data collected from medical tests. Data mining is the computing process of discovering patterns in large data sets such as medical examinations. Blood diseases are no exception; a large amount of test data can be collected from patients. In this paper, we apply data mining techniques to discover the relations between blood test characteristics and blood tumors in order to predict the disease at an early stage, which can be used to enhance the ability to cure it. We conducted experiments on our blood test dataset using three data mining techniques: association rules, rule induction, and deep learning. The goal of our experiments is to generate models that can distinguish patients with normal blood disease from patients who have a blood tumor. We evaluated our results using different metrics applied to real data collected from the Gaza European Hospital in Palestine. The final results showed that association rules could reveal the relationship between blood test characteristics and blood tumors. They also demonstrated that deep learning classifiers have the best ability to predict tumor types of blood diseases, with an accuracy of 79.45%. In addition, rule induction provided explanatory rules that describe both blood tumors and normal hematology.
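To make the notion of an association rule concrete, the sketch below computes the two standard quality measures of a rule, support and confidence, over a list of records. The item names (such as "low_hemoglobin") and the records are hypothetical placeholders and do not come from the hospital dataset.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        return 0.0
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent) for the rule antecedent -> consequent."""
    antecedent_support = support(transactions, antecedent)
    if antecedent_support == 0:
        return 0.0
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / antecedent_support


# Hypothetical records: each set lists the findings present for one patient.
records = [
    {"low_hemoglobin", "high_wbc", "tumor"},
    {"low_hemoglobin", "tumor"},
    {"low_hemoglobin"},
    {"high_wbc"},
]
rule_support = support(records, {"low_hemoglobin", "tumor"})
rule_confidence = confidence(records, {"low_hemoglobin"}, {"tumor"})
```

Rule mining algorithms typically keep only the rules whose support and confidence exceed user-chosen thresholds.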
In this paper, we apply different data mining approaches to examine and predict student dropout from university programs. As the subject of the study, we selected a total of 1290 records of computer science students who graduated from ALAQSA University between 2005 and 2011. The collected data included each student's study history and transcript for courses taught in the first two years of the computer science major, in addition to the student's GPA, high school average, and a class label (Yes, No) indicating whether the student graduated from the chosen major. In order to classify and predict dropout students, different classifiers were trained on our data sets, including Decision Tree (DT) and Naive Bayes (NB). These methods were tested using 10-fold cross-validation. The accuracies of the DT and NB classifiers were 98.14% and 96.86%, respectively. The study also includes discovering hidden relationships between student dropout status and enrolment persistence by mining frequent cases using the FP-growth algorithm.
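The 10-fold cross-validation protocol mentioned above can be sketched as follows. This is a generic illustration in plain Python, not the tooling used in the study; the function names and the `fit` callback interface are assumptions.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    if n_samples < k:
        raise ValueError("need at least k samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def cross_validated_accuracy(features, labels, fit, k=10, seed=0):
    """Average accuracy over k folds.

    `fit(train_features, train_labels)` is a hypothetical callback that
    returns a function mapping one feature vector to a predicted label.
    """
    accuracies = []
    for train, test in k_fold_indices(len(labels), k, seed):
        predict = fit([features[i] for i in train], [labels[i] for i in train])
        correct = sum(1 for i in test if predict(features[i]) == labels[i])
        accuracies.append(correct / len(test))
    return sum(accuracies) / len(accuracies)


# With 1290 records and k=10, each test fold holds 129 records.
splits = list(k_fold_indices(1290, k=10))
```

Each record is used for testing exactly once and for training in the other nine folds, so the reported accuracy is an average over ten held-out evaluations.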
In this paper, we present a combined approach that automatically extracts opinions from Arabic documents. Most research efforts in the area of opinion mining deal with English texts, and little work addresses Arabic text. Our experiments showed that, unlike with English, using a single method on Arabic opinion documents produces poor performance. We therefore used a combined approach that consists of three methods. First, a lexicon-based method is used to classify as many documents as possible. The resulting classified documents are then used as a training set for the maximum entropy method, which subsequently classifies some of the remaining documents. Finally, the k-nearest neighbor method uses the documents classified by the lexicon-based method and maximum entropy as its training set and classifies the rest of the documents. Our experiments showed that, on average, accuracy rose from approximately 50% when using only the lexicon-based method, to 60% when using the lexicon-based method and maximum entropy together, and to 80% when using the three methods combined.
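The cascading idea described above, in which each stage labels what it can and passes its output on as training data for the next stage, can be sketched as follows. This is a simplified illustration under several assumptions: it implements only a lexicon stage and a k-nearest neighbor stage (a maximum entropy stage could be plugged in between as any callable with the same signature), it uses Jaccard similarity over tokens, and the word lists and English tokens are hypothetical placeholders rather than the Arabic resources used in the paper.

```python
from collections import Counter


def lexicon_classify(tokens, positive_words, negative_words):
    """Return a label when the lexicon gives a clear signal, else None."""
    score = sum(t in positive_words for t in tokens)
    score -= sum(t in negative_words for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return None  # undecided: leave the document for a later stage


def knn_stage(k=3):
    """Build a stage that votes among the k most similar labeled documents."""
    def jaccard(a, b):
        a, b = set(a), set(b)
        return len(a & b) / len(a | b) if (a | b) else 0.0

    def stage(doc, train):
        if not train:
            return None
        nearest = sorted(train, key=lambda pair: jaccard(doc, pair[0]),
                         reverse=True)[:k]
        votes = Counter(label for _, label in nearest)
        return votes.most_common(1)[0][0]
    return stage


def cascade(documents, stages):
    """Run stages in order; each trains on labels produced by earlier stages."""
    labeled = {}  # document index -> label
    for stage in stages:
        # Snapshot of everything labeled before this stage started.
        train = [(documents[i], label) for i, label in labeled.items()]
        for i, doc in enumerate(documents):
            if i not in labeled:
                label = stage(doc, train)
                if label is not None:
                    labeled[i] = label
    return labeled


# Hypothetical tokenized documents and word lists.
docs = [["good", "service"], ["bad", "service"], ["good", "food"],
        ["slow", "bad", "food"], ["service", "food", "nice"]]
pos, neg = {"good"}, {"bad"}
result = cascade(docs, [lambda d, train: lexicon_classify(d, pos, neg),
                        knn_stage(k=3)])
```

In this toy run the lexicon stage labels the first four documents, and the last one, which contains no lexicon words, is labeled by the k-nearest neighbor stage using those four as training data.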