Ridge regression combined with model complexity analysis for near infrared (NIR) spectroscopic model updating

2019 
Abstract
Near infrared (NIR) calibration models can be used to predict samples that fall within the calibration domain. However, unmodeled sources of variance in new samples, such as instrumental drift and sample variations, can result in unreliable predictions of product properties. In such cases, model updating becomes important. It involves recalculating the model coefficients after adding a few new samples to the original calibration samples. Because collecting new samples and their reference measurements is costly, usually only a few samples are used for model updating. It is therefore necessary to balance the relative importance of old and new samples by weighting the new samples. Compared with the weight of the new samples, the model parameter of the regression method has much more influence on the performance of an updated model. The bias/variance tradeoff (L-curve) has been applied to select the model parameter. However, this approach involves a degree of subjectivity and does not always yield satisfactory models. To solve this model selection problem, a new method named model complexity analysis (MCA) is proposed in this work. According to MCA, the 2-norm of the regression coefficient vector of an updated model, $\|\beta^*\|_2$, should be smaller than that of the original model, $\|\beta\|_2$. The ratio of $\|\beta^*\|_2$ to $\|\beta\|_2$ is defined as $e$, which should lie in the range 0–1. For a given value of $e$, the model parameter can be uniquely determined by the following criterion: $\min \left| \, \|\beta^*\|_2 - e \, \|\beta\|_2 \, \right|$. The influence of the number of new samples and their representativity on the selection of $e$ was studied. Ridge regression (RR) was used for model updating in this work because it is a regression method based on a 2-norm constraint. The results show that the proposed MCA-based method can select a reliable RR parameter, and RR-MCA shows excellent performance on the three NIR datasets used in this work.
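
The selection rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the logarithmic grid of candidate ridge parameters, the way new samples are weighted (scaling their rows by the square root of the weight), and the omission of mean-centering are all assumptions made for illustration. The original model is assumed to be a ridge model with a known parameter `lam_old`.

```python
import numpy as np


def ridge_coefficients(X, y, lam):
    """Closed-form ridge solution: solve (X^T X + lam * I) beta = X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)


def select_lambda_mca(X_old, y_old, X_new, y_new, lam_old, e,
                      weight=1.0, lam_grid=None):
    """Pick the ridge parameter whose updated coefficient norm is closest
    to e times the norm of the original model's coefficients.

    Assumptions (not taken from the paper): candidate parameters come from
    a log-spaced grid, and new samples are weighted by scaling their rows
    by sqrt(weight), which weights their squared residuals by `weight`.
    """
    if lam_grid is None:
        lam_grid = np.logspace(-6, 6, 200)  # hypothetical search grid

    # Original model, fitted on the old calibration samples only.
    beta_old = ridge_coefficients(X_old, y_old, lam_old)
    target_norm = e * np.linalg.norm(beta_old)

    # Augmented calibration set: old samples plus weighted new samples.
    w = np.sqrt(weight)
    X_aug = np.vstack([X_old, w * X_new])
    y_aug = np.concatenate([y_old, w * y_new])

    # Evaluate | ||beta*||_2 - e * ||beta||_2 | for every candidate.
    gaps = [
        abs(np.linalg.norm(ridge_coefficients(X_aug, y_aug, lam)) - target_norm)
        for lam in lam_grid
    ]
    lam_star = lam_grid[int(np.argmin(gaps))]
    beta_star = ridge_coefficients(X_aug, y_aug, lam_star)
    return lam_star, beta_star, beta_old


if __name__ == "__main__":
    # Synthetic data standing in for spectra; purely illustrative.
    rng = np.random.default_rng(0)
    beta_true = rng.normal(size=10)
    X_old = rng.normal(size=(60, 10))
    y_old = X_old @ beta_true + 0.05 * rng.normal(size=60)
    X_new = rng.normal(size=(5, 10)) + 0.3  # shifted "new" samples
    y_new = X_new @ beta_true + 0.05 * rng.normal(size=5)

    lam_star, beta_star, beta_old = select_lambda_mca(
        X_old, y_old, X_new, y_new, lam_old=1.0, e=0.5, weight=4.0
    )
    print(lam_star, np.linalg.norm(beta_star) / np.linalg.norm(beta_old))
```

Because the 2-norm of the ridge coefficient vector decreases monotonically as the ridge parameter grows, the grid search has a single best match for a given $e$, up to the grid resolution. The value `e=0.5` in the demo is arbitrary and does not reflect a recommendation from the paper.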