Influence of variable selection and sample size on classification results with CLASSY

1989 
Abstract: To investigate the influence of variable selection and sample size on the performance of the multivariate classification method CLASSY, these parameters were varied systematically. In addition to the usual classificatory performance, the reliability of the assigned probabilities was considered. A small training set with only about five variables was shown to yield satisfactory results. With the variables ranked in order of decreasing utility for the classification, including many of them made the assigned probabilities less reliable. This over-confidence was not easily remedied by adding training objects. The classificatory ability was not affected by using more variables than necessary.