Efficient Longest Streak Discovery in Multidimensional Sequence Data

Wentao Wang, Bo Tang, Min Zhu

2018

This paper studies the problem of discovering the longest streak in a multidimensional sequence dataset. Given such a dataset, the contextual longest streak is the longest run of consecutive tuples in a context subspace that satisfy a specific measure constraint. The problem has various applications, including social network analysis and computational journalism. Its challenges are (i) a huge search space and (ii) the non-monotonicity of streak lengths. We propose a novel computation framework with a suite of optimization techniques to address them. Our solutions outperform the baseline solution by two orders of magnitude on both real and synthetic datasets. In addition, we validate the effectiveness of our proposal through a real-world case study.
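To make the definition concrete, the following is a minimal brute-force sketch of a contextual longest streak, not the framework proposed in the paper. It assumes that a context is a set of dimension-value equalities, that "consecutive" means adjacent within the subsequence of tuples matching the context, and that the measure constraint is a boolean predicate on a tuple. The function name `longest_streak`, the field names, and the sample data are all hypothetical.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


def longest_streak(rows: List[Row],
                   context: Dict[str, Any],
                   constraint: Callable[[Row], bool]) -> int:
    """Length of the longest run of consecutive tuples, within the
    subsequence matching `context`, that all satisfy `constraint`.

    Assumption: consecutiveness is judged after filtering by context.
    """
    best = 0
    run = 0
    for row in rows:
        # Skip tuples outside the context subspace; they do not break a run.
        if any(row.get(dim) != val for dim, val in context.items()):
            continue
        if constraint(row):
            run += 1
            best = max(best, run)
        else:
            run = 0  # a failing tuple inside the context ends the streak
    return best


# Hypothetical game log, ordered by time.
rows = [
    {"player": "A", "venue": "home", "points": 25},
    {"player": "A", "venue": "away", "points": 10},
    {"player": "A", "venue": "home", "points": 30},
    {"player": "A", "venue": "away", "points": 12},
    {"player": "A", "venue": "home", "points": 22},
]


def scored_20_plus(row: Row) -> bool:
    return row["points"] >= 20


general = longest_streak(rows, {"player": "A"}, scored_20_plus)
specific = longest_streak(rows, {"player": "A", "venue": "home"}, scored_20_plus)
print(general, specific)
```

On this toy data the narrower context (home games only) has a longer streak than the broader one, which illustrates the non-monotonicity challenge: streak lengths cannot be bounded by those of a parent or child context. Running this scan for every combination of dimension values also hints at the size of the search space.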

Keywords:

Machine learning
Social network analysis
Computer science
Computation
Artificial intelligence
Computational journalism
Order of magnitude
Streak
Subspace topology
Tuple
Algorithm
Data sequences
Data mining
Suite
