Cross Validation Component Based Reduction for Divorce Rate Prediction

2021 
According to data from the Centers for Disease Control and Prevention, education and religion are both strong predictors of whether a marriage lasts or dissolves. The likelihood of a marriage ending in divorce was lower for people with more education: over half of the marriages of those who did not complete high school ended in divorce, compared with roughly 30 percent of the marriages of college graduates. Against this background, the divorce rate dataset from the UCI dataset repository is used to predict the divorce class target, with the following contributions. First, the dataset undergoes data cleaning and exploratory data analysis. Second, the dataset is fitted with different classifiers to compare classification performance before and after feature scaling. Third, the dataset is evaluated with several training-to-testing splits (80:20, 30:70, 40:60 and 50:50) to improve the accuracy of all the classifiers. Fourth, the dataset is reduced to 15, 20 and 30 components using principal component analysis, and all the classifier algorithms are then applied to analyze the accuracy of divorce rate prediction. Fifth, performance is analyzed in terms of precision, recall, accuracy, F-score and running time to compare classification before and after feature scaling. Experimental results show that the Random Forest classifier achieves an accuracy of 98% on all PCA-reduced datasets with 15, 20 and 30 components. It also achieves an accuracy of 98% for the 40:60 and 50:50 training-to-testing splits.
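The evaluation measures named in the abstract (accuracy, precision, recall and F-score) can be illustrated with a minimal sketch for a binary divorce / no-divorce target. The function name `binary_metrics`, the choice of 1 as the positive class and the toy label lists are assumptions made for illustration only; they are not the authors' code or results, and the F-score is assumed here to be the usual F1 (harmonic mean of precision and recall).

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for a binary task.

    Hypothetical helper for illustration; `positive` marks the class
    treated as "divorced" (an assumption, not taken from the paper).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    # Guard every ratio against an empty denominator.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels for demonstration only (not data from the paper).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
scores = binary_metrics(y_true, y_pred)
print(scores)
```

In practice the same measures would be computed on the held-out portion of each training-to-testing split, once before and once after feature scaling, so that the two settings can be compared on equal terms.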