Comparing machine learning algorithms in predicting thermal sensation using ASHRAE Comfort Database II

2020 
Abstract Predicting building occupants' thermal comfort with machine learning (ML) is an active research topic. Many algorithms and data processing methods have been applied to predict thermal comfort indices in different contexts, but few studies have systematically investigated how the choice of algorithm and data processing method influences prediction accuracy. In this study, we first summarized the recent literature from the perspectives of predicted comfort indices, algorithms applied, input features, data sources, sample size, training proportion, prediction accuracy, etc. We then applied nine ML algorithms and three data sampling methods to predict the 3-point and 7-point thermal sensation vote (TSV) in ASHRAE Comfort Database II. The results show that Random Forest (RF) performs best among the tested algorithms, with an accuracy of 66.3% for the 3-point TSV and 61.1% for the 7-point TSV. ML TSV models generally predict TSV more accurately than the Predicted Mean Vote (PMV) model. Feature importance analysis identifies air temperature, humidity, clothing, air velocity, age, and metabolic rate as the six most important features for TSV prediction. With only the top three features, the RF algorithm achieves 63.6% overall accuracy in TSV prediction, which is only 2.6% lower than with all 12 input features. This paper also addresses other common considerations in building ML comfort models, such as hyperparameter tuning, the split between training and testing data, and encoding methods. Python and R code and packages are provided as appendices as a reference for future studies.
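The workflow described in the abstract (train a Random Forest on TSV labels, rank features by importance, then retrain on the top three) can be illustrated with a minimal sketch. This is not the authors' code: it assumes scikit-learn, uses synthetic data in place of ASHRAE Comfort Database II, and the feature names, the formula generating the synthetic votes, the 80/20 split, and the hyperparameters are all hypothetical choices made for illustration. The accuracies it prints say nothing about the results reported above.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the database (ASSUMPTION: not real comfort data).
rng = np.random.default_rng(0)
n = 2000
feature_names = ["air_temp", "humidity", "clo", "air_velocity", "age", "met"]
X = np.column_stack([
    rng.uniform(16, 34, n),    # air temperature, deg C
    rng.uniform(20, 90, n),    # relative humidity, %
    rng.uniform(0.3, 1.5, n),  # clothing insulation, clo
    rng.uniform(0.0, 1.0, n),  # air velocity, m/s
    rng.uniform(18, 70, n),    # age, years
    rng.uniform(1.0, 2.0, n),  # metabolic rate, met
])

# Hypothetical rule that turns features into a 7-point vote (-3 .. 3).
latent = (0.3 * (X[:, 0] - 25) + 1.0 * (X[:, 2] - 0.8)
          - 0.5 * X[:, 3] + rng.normal(0, 0.8, n))
tsv7 = np.clip(np.rint(latent), -3, 3).astype(int)

# Hold out 20% of the samples for testing (assumed proportion).
X_train, X_test, y_train, y_test = train_test_split(
    X, tsv7, test_size=0.2, random_state=0)

# Fit a Random Forest on all features and measure test accuracy.
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)
acc_all = accuracy_score(y_test, model.predict(X_test))

# Rank features by impurity-based importance, highest first.
ranking = sorted(zip(feature_names, model.feature_importances_),
                 key=lambda pair: pair[1], reverse=True)
top3 = [name for name, _ in ranking[:3]]
top3_idx = [feature_names.index(name) for name in top3]

# Retrain using only the three most important features.
model_top3 = RandomForestClassifier(n_estimators=200, random_state=0)
model_top3.fit(X_train[:, top3_idx], y_train)
acc_top3 = accuracy_score(y_test, model_top3.predict(X_test[:, top3_idx]))

print("Feature ranking:", [name for name, _ in ranking])
print(f"Accuracy, all features: {acc_all:.3f}")
print(f"Accuracy, top three:    {acc_top3:.3f}")
```

Comparing `acc_all` with `acc_top3` mirrors the kind of full-versus-reduced feature comparison the abstract reports; with real data, the ranking and the size of the gap would depend on the dataset and on the tuning choices.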