Extending statistical data quality improvement with explicit domain models

2014 
Automatic processing of data to determine operating states and identify faults has become essential for many modern industrial systems. Typical sources of this data include hundreds of sensors mounted on industrial machinery, measuring quantities such as temperature, vibration, and pressure. However, sensors are complex technical devices: they can fail, and their readings may contain noise or imprecise values. Such low-quality data makes it hard to solve the original task of assessing system and process status. We present an approach that brings together several well-known techniques from computer science and statistics and enhances the monitoring of technical systems by improving the detection and correction of data quality issues in sensor data. The application domain and the dependencies between its objects are represented as a knowledge-based model, while statistical methods identify data anomalies, such as outlying or missing values, in sensor measurement data. Combining information from the knowledge-based model with statistical computations makes it possible to validate and improve data analysis results. We demonstrate the proposed approach on a real-world industrial use case from the power generation domain. Our evaluation shows that the combined solution improves precision while maintaining high accuracy and recall.
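
The combination described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the statistical detector or the form of the knowledge-based model, so the modified z-score test, the `DOMAIN_MODEL` dictionary, the sensor names, the readings, and the validation rule (an outlier counts as a data quality issue only if no related sensor deviates at the same time) are all assumptions made for illustration.

```python
from statistics import median

# Hypothetical domain model: each sensor maps to sensors that, according to
# domain knowledge, should react to the same physical process changes.
DOMAIN_MODEL = {
    "temp_bearing_1": ["temp_bearing_2"],
    "temp_bearing_2": ["temp_bearing_1"],
}


def detect_anomalies(values, threshold=3.5):
    """Return (missing, outliers) index sets for one sensor series.

    Missing readings are represented as None. Outliers are found with the
    modified z-score (median / MAD), an assumed choice of detector.
    """
    missing = {i for i, v in enumerate(values) if v is None}
    present = [v for v in values if v is not None]
    med = median(present)
    mad = median([abs(v - med) for v in present])
    outliers = set()
    if mad > 0:  # with zero spread the score is undefined, so flag nothing
        for i, v in enumerate(values):
            if v is not None and 0.6745 * abs(v - med) / mad > threshold:
                outliers.add(i)
    return missing, outliers


def validate_with_model(readings, model, threshold=3.5):
    """Keep a statistical outlier only if no related sensor deviates too.

    A simultaneous deviation in related sensors is read as a real process
    event rather than a sensor fault. Missing values are always kept.
    """
    stats = {name: detect_anomalies(vals, threshold)
             for name, vals in readings.items()}
    confirmed = {}
    for name, (missing, outliers) in stats.items():
        related = model.get(name, [])
        isolated = {
            i for i in outliers
            if not any(i in stats[r][1] for r in related if r in stats)
        }
        confirmed[name] = missing | isolated
    return confirmed


# Made-up readings: index 4 is an isolated spike in sensor 1, index 6 is a
# shared rise in both sensors, index 7 is a missing value in sensor 2.
readings = {
    "temp_bearing_1": [70.0, 71.0, 69.5, 70.5, 120.0, 70.2, 95.0, 70.1],
    "temp_bearing_2": [68.0, 68.5, 67.9, 68.2, 68.1, 68.3, 93.0, None],
}
confirmed = validate_with_model(readings, DOMAIN_MODEL)
print(confirmed)
```

In this sketch the purely statistical step flags both index 4 and index 6 for the first sensor, while the model-based check discards index 6 because the related sensor shows the same rise. Filtering such false alarms is one plausible way a domain model could raise precision without lowering recall.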