Evolutionary optimization based pseudo labeling for semi-supervised soft sensor development of industrial processes

2021 
Abstract Data-based soft sensors have been widely applied in industrial processes to enable online prediction of difficult-to-measure variables. However, many practical processes are “unlabeled data rich, but labeled data poor”, which has become the main bottleneck in developing high-performance data-based soft sensors. To address this issue, two novel semi-supervised soft sensor methods are proposed: the evolutionary optimization based pseudo labeling method (EOPL) and the ensemble EOPL method (EnEOPL). The proposed methods first formulate pseudo labeling of the unlabeled data as an optimization problem in which the labels of the unlabeled data (i.e., the pseudo-labels) serve as the decision variables. An evolutionary optimization approach, with Gaussian process regression (GPR) as the base learner, is then used to solve this problem. Next, a new GPR model is built on an enlarged labeled training set that combines the labeled data with the high-confidence pseudo-labeled data. Furthermore, EOPL is extended to EnEOPL by exploiting an ensemble learning framework in order to enhance prediction performance. Two case studies demonstrate that the proposed methods are superior to traditional pseudo-labeling style semi-supervised methods.
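The idea of treating pseudo-labels as decision variables can be illustrated with a minimal sketch. The abstract does not state the objective function, the evolutionary algorithm, or the confidence criterion, so everything below is an assumption made for illustration: the fitness is taken to be the RMSE on the labeled data of a GPR fitted to the pseudo-labeled data alone, the optimizer is a plain differential evolution loop, the GPR uses a fixed RBF kernel without hyperparameter tuning, and all pseudo-labels are kept when building the enlarged set. The function names (`gpr_predict`, `fitness`, `evolve_pseudo_labels`) are hypothetical and not taken from the paper.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    # Squared-exponential kernel between the rows of a and b.
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-0.5 * d2 / length_scale**2)

def gpr_predict(x_train, y_train, x_test, noise=1e-2):
    # GPR posterior mean with fixed hyperparameters (assumption: no tuning).
    k = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    alpha = np.linalg.solve(k, y_train)
    return rbf_kernel(x_test, x_train) @ alpha

def fitness(pseudo_y, x_unl, x_lab, y_lab):
    # Assumed objective: fit GPR on the pseudo-labeled data only,
    # then score it by RMSE on the truly labeled data.
    pred = gpr_predict(x_unl, pseudo_y, x_lab)
    return float(np.sqrt(np.mean((pred - y_lab) ** 2)))

def evolve_pseudo_labels(x_lab, y_lab, x_unl, pop_size=20,
                         generations=50, f=0.6, cr=0.9, seed=0):
    # Differential evolution over the vector of pseudo-labels.
    rng = np.random.default_rng(seed)
    n = len(x_unl)
    # Start from GPR predictions trained on the labeled data, plus noise.
    init = gpr_predict(x_lab, y_lab, x_unl)
    spread = np.std(y_lab) + 1e-8
    pop = init + 0.5 * spread * rng.standard_normal((pop_size, n))
    pop[0] = init
    scores = np.array([fitness(p, x_unl, x_lab, y_lab) for p in pop])
    init_score = scores[0]
    for _ in range(generations):
        for i in range(pop_size):
            others = [j for j in range(pop_size) if j != i]
            a, b, c = pop[rng.choice(others, size=3, replace=False)]
            mutant = a + f * (b - c)
            mask = rng.random(n) < cr
            mask[rng.integers(n)] = True  # keep at least one mutant gene
            trial = np.where(mask, mutant, pop[i])
            s = fitness(trial, x_unl, x_lab, y_lab)
            if s <= scores[i]:  # greedy one-to-one selection
                pop[i], scores[i] = trial, s
    best = int(np.argmin(scores))
    return pop[best].copy(), float(scores[best]), float(init_score)

# Toy data: a few labeled points and many unlabeled ones.
rng = np.random.default_rng(1)
x_lab = rng.uniform(-3, 3, (8, 1))
y_lab = np.sin(x_lab[:, 0])
x_unl = rng.uniform(-3, 3, (20, 1))

pseudo_y, best_rmse, init_rmse = evolve_pseudo_labels(x_lab, y_lab, x_unl)

# Enlarged training set: labeled plus pseudo-labeled data
# (no confidence filtering here, which is a simplification).
x_all = np.vstack([x_lab, x_unl])
y_all = np.concatenate([y_lab, pseudo_y])
x_query = np.linspace(-3, 3, 5).reshape(-1, 1)
y_query = gpr_predict(x_all, y_all, x_query)
```

Because the initial GPR predictions are kept in the population and selection is greedy, the best fitness found can never be worse than that of the starting pseudo-labels. A faithful implementation would substitute the paper's own objective, confidence-based selection, and ensemble step for the simplifications used here.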