Tree with multivariate nodes: the rop model (régression optimisée, optimized regression)

2018 
Introduction. In a classic CART decision tree, each node splits on a single attribute. Moreover, decision trees are hierarchical: the path from the root to the leaves is one-way, and the assignment of an observation to a leaf is definitive. We propose a new kind of decision tree with multivariate nodes and non-hierarchical paths.

Methods. A risk score that includes all factors is calculated at each node using a combinatorial approach. The terminal leaves consist exclusively of negative (“not at risk”) observations, plus the last positive (“at risk”) leaf of the tree. An algorithm based on five criteria selects one combination among those that maximize the sum of sensitivity and specificity: identical observations, minimization of the sum of the absolute values of the coefficients, maximization of the AUC, minimization of the sum of the absolute values of the coefficients, and random selection. Data from the sinking of the Titanic (n = 1316) were used to compare the performance and internal validation (bootstrap, n = 1000) of the new tree with those of logistic regression.

Results. The estimated sensitivity was 84.75% for the logistic model versus 84.76% for the new tree. The estimated specificity was 67.72% for the logistic model versus 67.69% for the new tree. The new tree was slightly better than logistic regression on the estimated proportion of misclassifications (21.62% versus 21.69%).

Conclusion. We have described a new regression-classification method that introduces several methodological innovations to classification trees. Applications of this new tree have already been published. Research is continuing on random forests built from these trees, and the first results are very promising. The first version of the program will be available as an R package and can be obtained on request.
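The selection rule mentioned in Methods, keeping the candidates that maximize the sum of sensitivity and specificity, can be illustrated with a minimal Python sketch. This is not the authors' program: it searches cut-offs on a single precomputed risk score instead of combinations of factors, the function names `sensitivity_specificity` and `maximizing_thresholds` are hypothetical, and the toy scores and labels are invented for illustration (they are not the Titanic data).

```python
import math
from typing import List, Tuple


def sensitivity_specificity(
    scores: List[float], labels: List[int], threshold: float
) -> Tuple[float, float]:
    """Classify score >= threshold as positive ("at risk") and
    return (sensitivity, specificity). Labels are 1 (positive) or 0."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= threshold)
    fn = sum(1 for s, y in pairs if y == 1 and s < threshold)
    tn = sum(1 for s, y in pairs if y == 0 and s < threshold)
    fp = sum(1 for s, y in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def maximizing_thresholds(
    scores: List[float], labels: List[int]
) -> List[Tuple[float, float, float]]:
    """Return every (threshold, sensitivity, specificity) triple whose
    sensitivity + specificity equals the maximum over all observed scores.
    Several candidates can tie, which is why a further rule is needed
    to pick exactly one of them."""
    results = []
    for t in sorted(set(scores)):
        se, sp = sensitivity_specificity(scores, labels, t)
        results.append((t, se, sp))
    best = max(se + sp for _, se, sp in results)
    return [r for r in results if math.isclose(r[1] + r[2], best)]


# Toy data, invented for illustration only.
scores = [0.1, 0.2, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 1, 0, 1, 1, 0, 1]

for t, se, sp in maximizing_thresholds(scores, labels):
    print(f"threshold={t}: sensitivity={se:.2f}, specificity={sp:.2f}")
```

On this toy input two cut-offs tie on the maximal sum, which shows why a selection step among the maximizers is needed. The sketch stops at the tie and does not implement the five criteria listed in Methods.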