Piecewise linear approximation of streaming time series data with max-error guarantees

2015 
Given a time series S = ((x 1 , y 1 ), (x 2 , y 2 ), …) and a prescribed error bound e, the piecewise linear approximation (PLA) problem with max-error guarantees is to construct a piecewise linear function f such that |f(x i )-y i | ≤ e for all i. In addition, we would like to have an online algorithm that takes the time series as the records arrive in a streaming fashion, and outputs the pieces of f on-the-fly. This problem has applications wherever time series data is being continuously collected, but the data collection device has limited local buffer space and communication bandwidth, so that the data has to be compressed and sent back during the collection process. Prior work addressed two versions of the problem, where either f consists of disjoint segments, or f is required to be a continuous piecewise linear function. In both cases, existing algorithms can produce a function f that has the minimum number of pieces while meeting the prescribed error bound e. However, we observe that neither minimizes the true representation size of f, i.e., the number of parameters required to represent f. In this paper, we design an online algorithm that generates the optimal PLA in terms of representation size while meeting the prescribed max-error guarantee. Our experiments on many real-world data sets show that our algorithm can reduce the representation size of f by around 15% on average compared with the current best methods, while still requiring O(1) processing time per data record and small space.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    26
    References
    39
    Citations
    NaN
    KQI
    []