Data-generating models under which the random forest algorithm performs badly.

José A Ferreira

2020

Examples are given of data-generating models under which some versions of the random forest algorithm may fail to be consistent or at least may be extremely slow to converge to the optimal predictor. Evidence provided for these properties is based on partly intuitive and partly rigorous arguments and on numerical experiments. Although one can always choose a model under which random forests perform very badly, in each case simple methods based on statistics of "variable use" and "variable importance" can be used to construct a better predictor based on a sort of mixture of random forests.

Keywords:

Algorithm
Machine learning
Artificial intelligence
Random forest
sort
Mathematics
