How Entropic Regression Beats the Outliers Problem in Nonlinear System Identification

2019 
System identification (SID) is central to science and engineering applications in which a general model form is assumed but the active terms and parameters must be inferred from observations. Most SID methods rely on optimizing a metric-based cost function that describes how well a model fits the observational data. A commonly used cost function employs a Euclidean metric and leads to a least squares estimate, while it has recently become popular to also account for model sparsity, as in compressed sensing and Lasso. Although the effectiveness of these methods has been demonstrated in previous studies, including cases where outliers exist, it remains unclear whether SID can be accomplished under more realistic scenarios in which every observation is subject to non-negligible noise and some observations are contaminated by large noise and outliers. We show that existing sparsity-focused methods such as compressive sensing, when applied in such scenarios, can yield "over-sparse" solutions that are brittle to outliers. In fact, metric-based methods are inherently prone to outliers because outliers, by their nature, exert a disproportionately large influence on the cost. To mitigate these issues of large noise and outliers, we develop an Entropic Regression approach to nonlinear SID, whereby the true model structure is identified based on relevance in reducing information-flow uncertainty, not (just) sparsity. Using information-theoretic measures instead of a metric-based cost function has a unique advantage: owing to the asymptotic equipartition property of probability distributions, outliers and other low-occurrence events are conveniently and intrinsically de-emphasized.
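To make the core idea concrete, below is a minimal sketch of an entropic-regression-style selection loop. It is not the authors' implementation: the candidate library `Theta`, the stopping tolerance `tol`, and the use of scikit-learn's k-NN estimator `mutual_info_regression` as a stand-in for the paper's conditional-mutual-information criterion are all assumptions made for illustration. Candidate terms are added greedily according to how much information they carry about the remaining residual, rather than how much they reduce a Euclidean fit error.

```python
# A minimal sketch of entropic-regression-style term selection (an
# illustration under assumptions, not the authors' implementation).
# Assumed setup: each column of Theta is a candidate basis function
# evaluated on the data, and y holds the observed target (e.g., a
# numerically estimated derivative).
import numpy as np
from sklearn.feature_selection import mutual_info_regression

def entropic_style_selection(Theta, y, tol=1e-2, max_terms=5):
    """Greedily add the candidate term carrying the most information
    about the current residual; stop when no term is informative enough."""
    n_samples, n_terms = Theta.shape
    selected, coef = [], np.zeros(0)
    residual = y.copy()
    for _ in range(max_terms):
        remaining = [j for j in range(n_terms) if j not in selected]
        if not remaining:
            break
        # Mutual information between each unused candidate and the residual;
        # the k-NN estimator is driven by the bulk of the distribution, so
        # rare extreme samples (outliers) contribute little to the criterion.
        mi = mutual_info_regression(Theta[:, remaining], residual)
        if mi.max() < tol:
            break  # no remaining term reduces the residual's uncertainty
        selected.append(remaining[int(np.argmax(mi))])
        # Refit the selected terms; plain least squares is used here only
        # to keep the sketch short.
        coef, *_ = np.linalg.lstsq(Theta[:, selected], y, rcond=None)
        residual = y - Theta[:, selected] @ coef
    return selected, coef

# Toy usage: y depends on columns 1 and 3 only; a few gross outliers are
# injected. The information-based selection should still pick out {1, 3}.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 6))
y = 2.0 * X[:, 1] - 3.0 * X[:, 3] + 0.05 * rng.normal(size=500)
y[rng.choice(500, size=10, replace=False)] += 50.0  # gross outliers
print(entropic_style_selection(X, y))
```

The contrast with a metric-based cost is visible in the selection step: in a squared-error criterion each of the injected outliers would enter quadratically and could pull spurious terms into the model, whereas the mutual-information estimate is largely insensitive to such low-occurrence events, consistent with the abstract's asymptotic-equipartition argument.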