NARPCA: Neural Accumulate-Retract PCA for Low-Latency High-Throughput Processing on Datastreams

2019 
The increasingly interconnected and instrumented world, provides a deluge of data generated by multiple sensors in the form of continuous streams. Efficient stream processing needs control over the number of useful variables. This is because maintaining data structure in reduced sub-spaces, given that data is generated at high frequencies and is typically follows non-stationary distributions, brings new challenges for dimensionality reduction algorithms. In this work we introduce NARPCA, a neural network streaming PCA algorithm capable to explain the variance-covariance structure of a set of variables in a stream through linear combinations. The essentially neural-based algorithm is leveraged by a novel incremental computation method and system operating on data streams and capable of achieving low-latency and high-throughput when learning from data streams, while maintaining resource usage guarantees. We evaluate NARPCA in real-world data experiments and demonstrate low-latency (millisecond level) and high-throughput (thousands events/second) for simultaneous eigenvalues and eigenvectors estimation in a multi-class classification task.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    37
    References
    0
    Citations
    NaN
    KQI
    []