Pattern Discovery from Stock Time Series Using Self-Organizing Maps

2001 
† This work was supported by the RGC CERG project PolyU 5065/98E and the Departmental Grant H-ZJ84.
‡ Corresponding author.

ABSTRACT

Pattern discovery from time series is of fundamental importance. Particularly when patterns derived by domain experts do not exist or are incomplete, an algorithm that automatically discovers specific patterns or shapes from the time series data is necessary. Such an algorithm is noteworthy in that it does not assume prior knowledge of the number of interesting structures, nor does it require an exhaustive explanation of the patterns being described. In this paper, a clustering approach is proposed for pattern discovery from time series. In view of its popularity and superior clustering performance, the self-organizing map (SOM) is adopted for pattern discovery in temporal data sequences. It is a special type of clustering algorithm that imposes a topological structure on the data. To prepare for the SOM algorithm, data sequences are segmented from the numerical time series using a continuous sliding window. Similar temporal patterns are then grouped together by the SOM into clusters, which may subsequently be used to represent different structures of the data or temporal patterns. Attempts have been made to tackle the problem of representing patterns in a multi-resolution manner. As the number of data points in the patterns (the length of the patterns) increases, the time needed for the discovery process increases exponentially. To address this problem, we propose to compress the input patterns with a perceptually important point (PIP) identification algorithm. The idea is to replace the original data segment by its PIPs so that the dimensionality of the input pattern can be reduced. Encouraging results are observed and reported for the application of the proposed methods to time series collected from the Hong Kong stock market.
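
The segmentation and compression steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`sliding_windows`, `pip_indices`, `compress_windows`) are hypothetical, and the use of vertical distance to the line joining adjacent PIPs is an assumption, since the abstract does not state which distance measure is used. The SOM clustering step is omitted; the compressed windows would serve as its input vectors.

```python
import numpy as np


def sliding_windows(series, width, step=1):
    """Cut a 1-D series into overlapping segments of length `width`."""
    series = np.asarray(series, dtype=float)
    if width > len(series):
        raise ValueError("window width exceeds series length")
    starts = range(0, len(series) - width + 1, step)
    return np.stack([series[s:s + width] for s in starts])


def pip_indices(segment, n_pips):
    """Return the sorted indices of `n_pips` perceptually important points.

    Assumption: importance is the vertical distance between a point and
    the straight line joining its two neighbouring PIPs.
    """
    segment = np.asarray(segment, dtype=float)
    n = len(segment)
    if not 2 <= n_pips <= n:
        raise ValueError("n_pips must lie between 2 and the segment length")
    chosen = [0, n - 1]  # the first and last points are always PIPs
    while len(chosen) < n_pips:
        best_idx, best_dist = None, -1.0
        for left, right in zip(chosen[:-1], chosen[1:]):
            for i in range(left + 1, right):
                # Height of the line joining the adjacent PIPs at position i.
                slope = (segment[right] - segment[left]) / (right - left)
                y_line = segment[left] + slope * (i - left)
                dist = abs(segment[i] - y_line)
                if dist > best_dist:
                    best_idx, best_dist = i, dist
        chosen.append(best_idx)
        chosen.sort()
    return chosen


def compress_windows(windows, n_pips):
    """Replace every window by the values at its PIPs."""
    return np.stack([w[pip_indices(w, n_pips)] for w in windows])


if __name__ == "__main__":
    prices = np.array([0.0, 1.0, 5.0, 1.0, 0.0, -4.0, 0.0, 2.0, 3.0, 1.0])
    windows = sliding_windows(prices, width=7)
    print(compress_windows(windows, n_pips=4))
```

Each compressed window has `n_pips` values regardless of the window width, which is how the dimensionality of the SOM input is reduced.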