    Abstract:
    Traffic prediction is an important component of intelligent transportation systems. Because traffic data is typical spatiotemporal data with both spatial and temporal attributes, integrating information from the temporal and spatial dimensions when modeling traffic data is an important way to improve prediction accuracy. For temporal modeling, most existing research uses RNN-based methods, which cannot effectively capture long-term sequence features. For spatial modeling, GCN models are used to model a static spatial structure, which cannot accurately reflect the dynamic relationships between the nodes in the graph; in addition, in a multi-layer structure, the prediction error of each layer easily propagates through the gradient and accumulates. To address these deficiencies, we propose a traffic prediction model based on dynamic temporal graph convolutional networks. For temporal modeling, dilated causal convolution is used to construct temporal relationships, and the influence of global temporal features on the extraction of temporal relationships is taken into account. For spatial modeling, a dynamic adjacency matrix is obtained by learning the relationships between the nodes in the graph through an attention mechanism, so that the model can capture the dynamic relationships between nodes. In addition, a Translate module is added between spatiotemporal layers to reduce the propagation of prediction errors from one layer's spatiotemporal module to the next. Experimental results on the METR-LA and XIAN-TAXI datasets show that our model achieves better prediction performance than other mainstream traffic prediction methods.
    Keywords:
    Adjacency matrix
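    The two building blocks named in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the function names, the filter weights, and the scaled dot-product form of the attention scores are assumptions chosen for illustration, since the abstract does not specify them.

```python
import math


def dilated_causal_conv1d(x, weights, dilation=1):
    """Compute y[t] = sum_k weights[k] * x[t - k * dilation].

    Positions before the start of the sequence are treated as zero, so the
    output at time t never depends on inputs later than t (causality).
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


def attention_adjacency(node_features):
    """Build a row-normalized adjacency matrix from node features.

    Assumption: scores are scaled dot products followed by a softmax over
    each row; the paper may use a different attention formulation.
    """
    d = len(node_features[0])
    adjacency = []
    for query in node_features:
        scores = [
            sum(a * b for a, b in zip(query, key)) / math.sqrt(d)
            for key in node_features
        ]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        adjacency.append([e / total for e in exps])
    return adjacency


# Hypothetical speed readings at one sensor over six time steps.
signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
smoothed = dilated_causal_conv1d(signal, weights=[0.5, 0.5], dilation=2)

# Hypothetical 2-dimensional features for three sensors (graph nodes).
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
adjacency = attention_adjacency(features)
```

    Because the adjacency matrix is recomputed from the current node features, it changes as traffic conditions change, which is the sense in which it is "dynamic". Stacking convolutions with growing dilation (1, 2, 4, ...) widens the temporal receptive field without adding recurrence.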
    A brief introduction to the concept of data mining is given, and three algorithms are analyzed and compared: a decision tree algorithm, which has more detailed patterns for each item and allows continuous inputs but does not expand into bigger directories; an association rule algorithm, which can be fast and expandable but is very sensitive to its parameters; and a clustering algorithm, which can group the data according to comparability but requires complex parameters and variables to be set.
    Comparability
    Data set
    Citations (0)
    Because of the huge amount of spatiotemporal data in applications, it is essential for a spatiotemporal database to build effective spatiotemporal indices in order to improve query performance. The paper proposes and develops an index method based on the 3D R-tree and R*-tree algorithms, which shows advantages in query efficiency compared with the 3D R-tree and HR-tree.
    Tree (set theory)
    R-tree
    Citations (0)
    In this paper we focus on the index structure of distributed storage systems for large-scale structured and semi-structured datasets. We propose a hybrid index structure based on a clustering index and a B+ tree, define the input, storage, and query operations on the dataset, and analyze the communication model between the index server and the client. Because a record's primary keyword and its other attributes are handled by different policies, our architecture obtains good performance under different retrieval conditions.
    Inverted index
    Tree (set theory)
    Citations (0)
    Since many current IDSs (Intrusion Detection Systems) are constructed by manually encoding expert knowledge, updating them is slow and expensive. Frequent patterns mined from audit data can be used as reliable intrusion detection models. To address this problem, this paper proposes an efficient parallel method that extracts an extensive set of features describing each network connection and learns frequent patterns that accurately capture the behavior of intrusions and normal activities. These patterns are used to simplify model construction and incremental updates.
    Citations (0)
    Most association rule algorithms are based on the Apriori algorithm. However, these algorithms generate many useless itemsets and redundant rules, which reduces their efficiency and validity. In addition, their support for interaction with the operator is only mediocre. The SQL-ⅡAR algorithm is therefore developed in this paper.
    Operator (biology)
    Association (psychology)
    Citations (0)
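    For context on the baseline that the abstract above criticizes, here is a minimal sketch of the classic Apriori frequent-itemset search. It is not the SQL-ⅡAR algorithm, which the abstract does not describe; the function name, the sample baskets, and the support threshold are hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support_count} for every frequent itemset.

    An itemset is frequent when at least `min_support` transactions
    contain all of its items.
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {
            a | b for a, b in combinations(list(current), 2) if len(a | b) == k
        }
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        # Count step: one pass over the transactions per level.
        current = {}
        for c in candidates:
            support = sum(1 for t in transactions if c <= t)
            if support >= min_support:
                current[c] = support
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical market-basket data.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
]
result = apriori(baskets, min_support=2)
```

    Each level generates candidates before counting them, so a low support threshold produces many itemsets that are later discarded. This is the inefficiency the abstract refers to.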
    Neural machine translation (NMT) has become the mainstream technology in machine translation. Supervised NMT models are trained on abundant sentence-level parallel corpora, but for low-resource languages or dialects with no such corpora available, it is difficult to achieve good performance. Researchers have therefore begun to focus on unsupervised neural machine translation (UNMT), which uses monolingual corpora as training data. UNMT needs to construct a language model (LM) that learns semantic information from the monolingual corpus. This paper focuses on LM pre-training in unsupervised machine translation and proposes a pre-training method, NER-MLM (named entity recognition masked language model). By performing NER, the proposed method obtains better semantic information and language model parameters, with better training results. In the unsupervised machine translation task, the BLEU scores on the WMT’16 English–French and English–German data sets are 35.30 and 27.30, respectively. To the best of our knowledge, these are the highest results reported so far in the field of UNMT.
    Named Entity Recognition
    Citations (12)
    The process industry has entered a period of rapid development, and industrial safety is an enduring theme of Industry 4.0. With the development of Industry 4.0, the large amount of process data being generated provides a research basis for process monitoring. Data-driven process monitoring algorithms detect abnormal process values by analyzing data from normal operating conditions. Unlike the traditional alarm method and the principal component analysis (PCA) contribution graph method, the Two-Dimensional Online Root-cause Contribution Graph (2D-ORCG) proposed in this paper builds on the traditional PCA method and captures information through an online root-cause variable visualization strategy. Contribution analysis based on dynamic data can more effectively capture the key variables that affect the stability of the system, and can also accurately capture different variables at different times. The effectiveness of the 2D-ORCG method, compared with the traditional Two-Dimensional Contribution Graph method, is demonstrated on the Tennessee Eastman Process (TEP).
    Root Cause Analysis
    Process industry
    Among requirements elicitation activities, stakeholder analysis is the main source of requirements. In this article, we propose a new model for data-driven stakeholder analysis, named SIG (Stakeholder Intention Graph), a semantic extension of the property graph model that can represent stakeholders' intentions and their relationships. To elicit stakeholders' intentions from speech data recorded during meetings, we developed a structural analysis system and a method for generating SIGs from speech data. Based on graph theory, we also propose a methodology for analyzing stakeholders' intentions and their structure with both global and local graph analyses. We implemented a speech data-driven stakeholder analysis system on the graph database Neo4j. As its output, the analysis system automatically generates the stakeholder matrix from the meeting speech data. We applied the analysis method and system to speech data from actual development meetings on public service systems and demonstrated the effectiveness of the proposed method.
    Stakeholder Analysis
    The objective of privacy-preserving data mining (PPDM) is to find a way to manipulate the dataset so that sensitive information cannot be disclosed during data mining. Many algorithms have been proposed in recent years; by privacy-preserving technique, they can be classified as heuristic-based, secure multiparty, and reconstruction-based techniques. This paper gives an overview of PPDM in terms of this classification and presents some evaluation standards. It also outlines future work on association rule hiding.
    Citations (3)