Data balancing for thermal comfort datasets using conditional Wasserstein GAN with a weighted loss function

2021 
The development of various machine learning methods has improved the performance of thermal comfort estimation. However, thermal comfort datasets are usually imbalanced because hot and cold conditions rarely occur in air-conditioned environments. Imbalanced datasets lead to biased estimation, which serves users poorly in those rarely occurring conditions. Therefore, many researchers have applied data augmentation to rare samples to balance thermal comfort datasets. Nevertheless, the imbalance in the original dataset still leads to biased data generation. In this paper, we propose a data balancing method for thermal comfort datasets using a conditional Wasserstein GAN with a weighted loss function. We adopt the architecture of comfortGAN, a learning-based data balancing method for thermal comfort datasets. Our main contribution is the introduction of a loss function that accounts for the differences in the number of samples across classes, so that training is not biased towards the majority classes. We evaluate the proposed method on six metrics, including metrics for the variability of the generated dataset and for the effectiveness of data balancing in thermal comfort estimation. The results show that the proposed method improves the estimator's performance because it can generate rare samples while preserving the characteristics of the original dataset's distribution.
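The abstract does not give the exact form of the weighted loss, so the following is only a minimal sketch of one common way to weight a Wasserstein critic loss by class frequency. The inverse-frequency weighting, the function names `inverse_frequency_weights` and `weighted_critic_loss`, and the toy labels and scores are all assumptions made for illustration, not the formulation used in the paper. The gradient penalty or weight clipping that a Wasserstein GAN normally needs is omitted.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Assumed weighting: weight of class c is N / (K * n_c), so rare
    classes get larger weights and the mean per-sample weight is 1."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


def weighted_critic_loss(real_scores, fake_scores, real_labels, fake_labels, weights):
    """Class-weighted Wasserstein critic loss (sketch): weighted mean of
    critic scores on generated samples minus that on real samples."""
    def weighted_mean(scores, labels):
        w = [weights[label] for label in labels]
        return sum(wi * si for wi, si in zip(w, scores)) / sum(w)

    return weighted_mean(fake_scores, fake_labels) - weighted_mean(real_scores, real_labels)


# Toy, made-up comfort labels: "neutral" dominates, "hot" and "cold" are rare.
labels = ["neutral"] * 8 + ["hot"] + ["cold"]
weights = inverse_frequency_weights(labels)

# Toy critic outputs: 1.0 for every real sample, 0.0 for every generated one.
loss = weighted_critic_loss([1.0] * 10, [0.0] * 10, labels, labels, weights)
```

With this weighting, each rare sample contributes more to the loss than a majority-class sample, which is the general idea behind avoiding training that is biased towards the majority classes.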