A new methodology is proposed for monitoring multi- and megavariate systems whose variables present significant levels of autocorrelation. The new monitoring statistics are derived after the preliminary generation of decorrelated residuals from a dynamic principal component analysis (DPCA) model. The proposed methodology leads to monitoring statistics with low levels of serial dependency, a feature that the original DPCA formulation lacks and whose absence has seriously hindered its adoption in practice, favoring other, more complex monitoring approaches. The performance of the proposed method is compared with that of a variety of current monitoring methodologies for large-scale systems, under different dynamic scenarios and for different types of process upsets and fault magnitudes. The results clearly indicate that the statistics based on decorrelated residuals from DPCA (DPCA-DR) consistently outperform the alternatives in detection ability and decorrelation power, while remaining robust and efficient to compute.
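To make the idea concrete, the following minimal Python sketch illustrates a DPCA-DR-style scheme on synthetic autocorrelated data: a time-lagged data matrix is modeled with PCA, and the scores are passed through a one-step-ahead predictor whose residuals are approximately serially independent and can then be monitored with a Hotelling T² statistic. The lag order, the number of retained components, and the simple VAR(1) filter are illustrative assumptions, not the paper's exact estimator.

```python
# Sketch of monitoring with decorrelated residuals from a DPCA model.
# Assumed choices (not from the paper): lag order l, retained components k,
# and a one-step-ahead VAR(1) filter on the scores.
import numpy as np
from sklearn.decomposition import PCA

def lagged_matrix(X, l):
    """Stack each observation with its l previous ones (DPCA augmentation)."""
    n = X.shape[0]
    return np.hstack([X[i : n - l + i] for i in range(l + 1)])

rng = np.random.default_rng(0)
# Synthetic autocorrelated data: AR(1) process on 5 variables.
n, m = 500, 5
X = np.zeros((n, m))
for t in range(1, n):
    X[t] = 0.8 * X[t - 1] + rng.normal(size=m)

l, k = 2, 3                      # lag order and retained components (assumed)
Z = lagged_matrix(X, l)          # time-lagged data matrix
Z = (Z - Z.mean(0)) / Z.std(0)   # autoscale

pca = PCA(n_components=k).fit(Z)
T = pca.transform(Z)             # DPCA scores (still serially correlated)

# Decorrelate the scores with a one-step-ahead VAR(1) predictor:
# e_t = t_t - A t_{t-1}, with A estimated by least squares.
A, *_ = np.linalg.lstsq(T[:-1], T[1:], rcond=None)
E = T[1:] - T[:-1] @ A           # approximately serially independent residuals

# Hotelling T^2 on the decorrelated residuals (the DPCA-DR-style statistic).
S_inv = np.linalg.inv(np.cov(E, rowvar=False))
T2 = np.einsum("ij,jk,ik->i", E, S_inv, E)
print("mean T2:", T2.mean())     # close to k for in-control data
```

In a monitoring application, the model and the control limit for T² would be estimated from in-control data only, and the statistic computed recursively for each new observation.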
Data collected in Industry 4.0 scenarios present a variety of data structures, reflecting the evolution of industrial processes, measurement systems, and IT infrastructures (indeed, "variety" is one of the four V's of Big Data, so its existence is widely recognized). Data analytics platforms must adapt to this context and keep pace with its evolution in order to continue providing practitioners with effective solutions for the large data resources now available. In this context, one prevalent feature of industrial data has been largely overlooked: its multiresolution nature. Multiresolution is directly connected to the granularity of the data in the time domain, an aspect that induces inner dependencies that current frameworks cannot handle in a consistent and rigorous way. Furthermore, multiresolution has often been mistaken for a simple multirate scenario, when in fact the meaning of the observations is completely different: multirate data are point values collected at different sampling rates, whereas multiresolution data are aggregates over supports of different lengths. In this paper, we highlight these differences and discuss current multiresolution frameworks for effectively handling industrial data sets.
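The distinction can be shown in a few lines of Python on a synthetic signal: the two coarse series below have the same length, but one is a subsample (multirate) while the other is a sequence of interval averages (multiresolution). The 60-sample aggregation window is an illustrative assumption.

```python
# Contrast between multirate and multiresolution views of the same signal.
import numpy as np

rng = np.random.default_rng(1)
x = np.cumsum(rng.normal(size=600))   # high-resolution signal, e.g. 1 sample/min

# Multirate: a slower sensor still reports instantaneous values, just less often.
x_multirate = x[::60]                 # one point reading per hour

# Multiresolution: each coarse value is an AGGREGATE over its support interval,
# e.g. an hourly average reported by a lab analysis or a data historian.
x_multires = x.reshape(-1, 60).mean(axis=1)

# Same length, different meaning: point values vs. interval averages.
print(x_multirate[:3])
print(x_multires[:3])
```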
The overwhelming majority of processes in semiconductor manufacturing operate in a batch mode, imposing time-varying conditions on the products in a cyclic, repetitive fashion. These conditions make process monitoring a very challenging task, especially in high-volume production plants. Among the state-of-the-art approaches proposed to deal with this problem, the so-called multiway methods incorporate the batch dynamics into a normal-operation model at the expense of estimating a large number of parameters, which makes them prone to overfitting and instability. Moreover, batch trajectories must be well aligned for these methods to deliver the expected performance. To overcome these and other limitations of conventional process-monitoring methodologies in semiconductor manufacturing, we propose translation-invariant multiscale energy-based principal component analysis, an approach that requires far fewer estimated parameters. It is free of trajectory alignment requirements, and thus easier to implement and maintain, while still providing useful information for fault detection and root cause analysis. The approach applies a translation-invariant wavelet decomposition along the time-series profile of each variable in each batch. The normal operation signatures in the time-frequency domain are extracted, modeled, and then used for process monitoring, allowing prompt detection of process abnormalities. The proposed procedure was tested on real industrial data, where it effectively detected the existing faults and provided reliable indications of their underlying root causes.
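A minimal Python sketch of the central computation follows: each batch is decomposed with the stationary (undecimated, hence translation-invariant) wavelet transform, per-scale energies form a compact signature, and a PCA model of the in-control signatures supports T²-based monitoring. The wavelet ('db4'), the number of decomposition levels, and the synthetic batch data are illustrative assumptions, not the settings used in the paper.

```python
# Energy features from a translation-invariant wavelet decomposition,
# followed by PCA-based monitoring of batch signatures.
import numpy as np
import pywt
from sklearn.decomposition import PCA

def batch_energy_features(batch, wavelet="db4", level=3):
    """Per-variable, per-scale energies from the stationary wavelet
    transform of one batch (rows = time, columns = variables)."""
    feats = []
    for j in range(batch.shape[1]):
        coeffs = pywt.swt(batch[:, j], wavelet, level=level)  # [(cA, cD), ...]
        for cA, cD in coeffs:
            feats.append(np.sum(cD**2))          # detail energy at each scale
        feats.append(np.sum(coeffs[0][0] ** 2))  # coarsest approximation energy
    return np.array(feats)

rng = np.random.default_rng(2)
n_batches, T, m = 40, 256, 4       # T must be divisible by 2**level for swt
F = np.vstack([
    batch_energy_features(np.cumsum(rng.normal(size=(T, m)), axis=0))
    for _ in range(n_batches)
])                                  # one energy signature per batch

F = (F - F.mean(0)) / F.std(0)      # autoscale the signatures
pca = PCA(n_components=3).fit(F)    # normal-operation model in the
scores = pca.transform(F)           # time-frequency domain

# Hotelling T^2 per batch; high values flag abnormal energy signatures.
T2 = np.sum(scores**2 / pca.explained_variance_, axis=1)
print("in-control mean T2:", T2.mean())
```

Because the undecimated transform is shift-invariant, the energy signature of a batch is insensitive to modest shifts of its trajectory in time, which is what removes the alignment requirement; contributions of individual scales and variables to a high T² can then be inspected for root cause analysis.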