Forecasting business failure: the use of nearest-neighbour support vectors and correcting imbalanced samples - evidence from the Chinese hotel industry.

2012 
Previous studies on firm failure prediction (FFP) have chiefly addressed predictions based on balanced datasets, without considering that the real-world target population consists of imbalanced data. The current study investigates tourism FFP based on the imbalanced data of Chinese listed companies in the hotel industry. The imbalanced dataset was collected and represented in terms of significant financial ratios, and a new up-sampling approach and forecasting method were proposed to correct imbalanced samples. To balance the dataset, the up-sampling method generates new minority samples according to random percentage distances from each minority sample to its nearest neighbour (NN). The NNs of unlabelled samples are retrieved from the balanced dataset to produce a knowledge base of nearest-neighbour support vectors, from which base support vector machines (SVMs) are generated and assembled. Empirical results indicate that the proposed sampling approach helped models achieve higher accuracy on minority samples, with accuracy rates in excess of 90 per cent. This method of using nearest-neighbour support vectors and correcting imbalanced samples is useful in controlling risk in tourism management.
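The up-sampling step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the exact formula, so the sketch assumes that a "random percentage distance" means a uniformly drawn fraction of the straight line from a minority sample to its nearest minority neighbour, and that nearness is measured by Euclidean distance. The function names `nearest_neighbour` and `upsample_minority` are hypothetical.

```python
import math
import random


def nearest_neighbour(index, samples):
    """Return the sample closest (Euclidean) to samples[index], excluding itself."""
    target = samples[index]
    best, best_dist = None, math.inf
    for j, other in enumerate(samples):
        if j == index:
            continue
        d = math.dist(target, other)
        if d < best_dist:
            best, best_dist = other, d
    return best


def upsample_minority(minority, n_new, seed=0):
    """Generate n_new synthetic minority samples.

    Assumption: each synthetic sample is placed at a random fraction
    (drawn uniformly from [0, 1)) of the way from a minority sample
    to its nearest minority neighbour.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for k in range(n_new):
        i = k % len(minority)            # cycle through the minority samples
        base = minority[i]
        nn = nearest_neighbour(i, minority)
        pct = rng.random()               # the random percentage of the distance
        synthetic.append(tuple(b + pct * (n - b) for b, n in zip(base, nn)))
    return synthetic


# Toy minority class with two (hypothetical) financial ratios per firm.
minority = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
new_points = upsample_minority(minority, n_new=4)
```

Each synthetic point lies on the segment between an existing minority sample and its nearest neighbour, so the new samples stay inside the region already occupied by the minority class. The balanced set would then be the original samples plus `new_points`.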