Patterns of life in temporal data: indexing and hashing for fast and relevant data retrieval
2014
As datasets of time-series records, such as computer logs or financial transactions, grow larger, indexing
solutions are needed that can efficiently filter out irrelevant records while retrieving most of the relevant ones.
These methods must capture the essential temporal properties present in the data and provide a scalable way to
generate the index and update it as new records arrive. Current time-series analysis and indexing
methods are insufficient because the fixed features they rely on capture only limited periodicity in time-series
data and become brittle when the time series encode heterogeneous temporal behaviors and are noisy or
incomplete. New indexing solutions must not only cluster the data but also infer its meaningful
characteristics and present them to users to improve their understanding of the data.
In this paper, we develop an indexing procedure based on typical latent behaviors within the time series. Our
method (1) converts the data to a quantized format, (2) learns the characteristic behaviors that generate the data, and (3)
produces an index for the time series based on these behaviors. The method is found to outperform standard
approaches to time-series indexing in terms of recall and precision across varying degrees of data noise.
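
The three steps above can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does not specify the quantization scheme or the behavior model, so this sketch assumes equal-width binning for step (1), substitutes plain k-means over symbol histograms as a stand-in for behavior learning in step (2), and uses a simple behavior-to-series lookup table for step (3). Note that a histogram discards the order of observations, so this stand-in captures only the value distribution, not true temporal structure. All function names and parameters (`quantize`, `learn_behaviors`, `build_index`, `n_bins`, `k`) are hypothetical.

```python
from collections import defaultdict


def quantize(series, n_bins, lo, hi):
    """Step 1 (assumed scheme): map each value to an equal-width bin id."""
    width = (hi - lo) / n_bins
    return [min(n_bins - 1, max(0, int((x - lo) / width))) for x in series]


def histogram(symbols, n_bins):
    """Normalized symbol frequencies; order of symbols is ignored."""
    counts = [0] * n_bins
    for s in symbols:
        counts[s] += 1
    return [c / len(symbols) for c in counts]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(profile, centroids):
    """Id of the centroid closest to the given profile."""
    return min(range(len(centroids)), key=lambda j: sq_dist(profile, centroids[j]))


def learn_behaviors(profiles, k, n_iter=20):
    """Step 2 (stand-in): plain k-means, initialized from the first k profiles."""
    centroids = [list(p) for p in profiles[:k]]
    for _ in range(n_iter):
        groups = [[] for _ in range(k)]
        for p in profiles:
            groups[nearest(p, centroids)].append(p)
        for j, members in enumerate(groups):
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def build_index(profiles, centroids):
    """Step 3: map each behavior id to the ids of the series it best explains."""
    index = defaultdict(list)
    for series_id, p in enumerate(profiles):
        index[nearest(p, centroids)].append(series_id)
    return dict(index)


def query(series, centroids, index, n_bins, lo, hi):
    """Retrieve ids of indexed series that share the query's nearest behavior."""
    profile = histogram(quantize(series, n_bins, lo, hi), n_bins)
    return index.get(nearest(profile, centroids), [])


if __name__ == "__main__":
    # Toy data (made up for illustration): two low-valued and two high-valued series.
    data = [
        [0.10, 0.20, 0.10, 0.15],
        [0.90, 0.80, 0.95, 0.85],
        [0.05, 0.10, 0.20, 0.10],
        [0.80, 0.90, 0.90, 0.70],
    ]
    profiles = [histogram(quantize(s, 4, 0.0, 1.0), 4) for s in data]
    centroids = learn_behaviors(profiles, k=2)
    index = build_index(profiles, centroids)
    print(index)
    print(query([0.12, 0.08, 0.20, 0.05], centroids, index, 4, 0.0, 1.0))
```

At query time only the series filed under the query's nearest behavior are returned, which is the filtering role an index plays; a model that preserves ordering (for example, histograms over consecutive symbol pairs) would be a closer match to "temporal behaviors" but is beyond this sketch.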