Comparative Analysis of Prediction Techniques to Determine Student Dropout: Logistic Regression vs Decision Trees

2018 
Currently, the detection of students who may drop out from an academic program is a relevant issue for universities, so there are efforts to examine the variables that determine students' drop out. Drop out is defined in different ways, however, all the studies converge in that for a student to drop out a course of study, some variables must be combined. This study presents a comparison of performance indicators of the current drop out model of the Universidad del Bio-Bio (UBB), which is based on logistic regression technique and it is compared with a new model based on decision trees. The new model is obtained through data mining methodologies and it was implemented through the SAP Predictive Analytics tool. To train, validate, and apply the model, real data from the UBB databases were used. The comparison shows that the prediction of student´ drop out of the proposed model obtains an accuracy of 86%, a precision of 97% with an error rate of 14%, better indicators than the current values delivered by the model based on logistic regression. Subsequently, the prediction model obtained was optimized considering other variables, improving even more the prediction indicators. Higher education institutions should take into account the variables that explain the most the phenomenon of student´s drop out to improve the retention of their students.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    17
    References
    2
    Citations
    NaN
    KQI
    []