The result of a U-Topk query consists simply of k tuples, which is unsatisfactory in many cases for two main reasons. First, the probability of the result is often so small that users find it hard to accept. Second, it discards the relations between the tuples and their corresponding entities, so it cannot fully reflect the real state of the monitored entities. To address this shortcoming of the tuple-oriented semantics of the U-Topk query, this paper proposes an entity-oriented U-Topk query, named EoU-Topk, together with a query processing algorithm. The basic idea of the algorithm is to convert the tuple-oriented probabilistic database into an entity-oriented probabilistic database; during this conversion, exclusive tuples that meet pre-defined rules are merged. The EoU-Topk algorithm has two advantages: first, it can greatly reduce the size of the probabilistic database; second, it reflects the whole state of each entity and avoids the one-sidedness of the tuple-oriented U-Topk query. Finally, the efficiency and quality of the proposed EoU-Topk query are verified by experiments on real data.
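The conversion step can be illustrated with a minimal sketch. The abstract does not state the merge rules, so the rule used here is purely an assumption: tuples of the same entity are treated as mutually exclusive alternatives, and alternatives whose scores lie within a hypothetical threshold `eps` of the first tuple in a group are merged, adding their probabilities and taking the probability-weighted mean score. The function name and the tuple layout are likewise hypothetical.

```python
from collections import defaultdict


def merge_exclusive_tuples(tuples, eps):
    """Group (entity_id, score, prob) tuples by entity and merge close scores.

    Assumption (not from the paper): tuples of one entity are mutually
    exclusive, so the probability of a merged group is the sum of its parts.
    All probabilities are assumed to be strictly positive.
    """
    by_entity = defaultdict(list)
    for entity_id, score, prob in tuples:
        by_entity[entity_id].append((score, prob))

    merged = {}
    for entity_id, alternatives in by_entity.items():
        alternatives.sort()
        groups = []  # each group: (anchor_score, weighted_score_sum, prob_sum)
        for score, prob in alternatives:
            if groups and score - groups[-1][0] <= eps:
                anchor, weighted, total = groups[-1]
                groups[-1] = (anchor, weighted + score * prob, total + prob)
            else:
                groups.append((score, score * prob, prob))
        # Represent each merged group by its weighted mean score.
        merged[entity_id] = [(w / p, p) for _, w, p in groups]
    return merged


readings = [
    ("e1", 10.0, 0.25),
    ("e1", 10.5, 0.25),
    ("e1", 20.0, 0.5),
    ("e2", 7.0, 1.0),
]
entities = merge_exclusive_tuples(readings, eps=1.0)
# Four tuples shrink to three entity-level alternatives.
```

Under this assumed rule the database shrinks whenever an entity has several near-identical alternatives, which is the size reduction the abstract describes; the actual rules in the paper may differ.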
Targeting highly critical applications, the concepts of deadline, slack, and criticality from real-time systems are introduced into data stream management systems (DSMS) and adapted to the characteristics of continuous queries. Based on these adapted concepts, a priority-based real-time scheduling strategy is proposed. In this strategy, the earlier the deadline, the shorter the slack, or the more critical the query, the higher its priority. A priority tree structure is also proposed to produce a unique execution sequence for the prioritized queries. The experimental results indicate that the strategy raises both the hit value ratio (HVR) and the success ratio of continuous query scheduling.
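A minimal sketch of such a priority ordering follows. The abstract does not say how deadline, slack, and criticality are combined, nor how the priority tree is laid out, so two assumptions are made here: the three criteria are compared lexicographically (deadline first, then slack, then criticality), and a binary heap stands in for the priority tree. Slack is taken as deadline minus current time minus estimated execution time, the usual real-time definition. All names are hypothetical.

```python
import heapq


def priority_key(deadline, exec_time, criticality, now=0.0):
    """Smaller key means higher priority (assumed lexicographic combination)."""
    slack = deadline - now - exec_time
    # Negate criticality so that more critical queries sort first.
    return (deadline, slack, -criticality)


def schedule(queries, now=0.0):
    """Return query names in execution order; ties are broken by arrival index."""
    heap = []
    for index, q in enumerate(queries):
        key = priority_key(q["deadline"], q["exec_time"], q["criticality"], now)
        heapq.heappush(heap, (key, index, q["name"]))
    order = []
    while heap:
        _, _, name = heapq.heappop(heap)
        order.append(name)
    return order


queries = [
    {"name": "q1", "deadline": 10, "exec_time": 2, "criticality": 1},
    {"name": "q2", "deadline": 5, "exec_time": 1, "criticality": 1},
    {"name": "q3", "deadline": 5, "exec_time": 3, "criticality": 2},
]
print(schedule(queries))  # ['q3', 'q2', 'q1']
```

Including the arrival index in each heap entry guarantees a total order, which mirrors the stated goal of a unique execution sequence.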
To bridge the gap between the high-level semantics and the low-level features of images, a new method based on the co-occurrence matrix is proposed for extracting image texture features. Combining an image's statistical features in the frequency domain with its spatial distribution attributes, the method extracts local frequency information from the image by means of the wavelet transform. The image's global structural characteristics are then integrated with its wavelet properties to construct a wavelet-gray co-occurrence matrix, from which texture features are extracted for medical image retrieval. Comparative tests show that the wavelet-gray co-occurrence matrix outperforms approaches in which the co-occurrence matrix and the wavelet features are applied separately.
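The co-occurrence step on its own can be sketched as follows. This is the standard gray-level co-occurrence matrix computed on an already quantized image, not the wavelet-gray variant of the paper; how the wavelet coefficients enter the matrix is not specified in the abstract and is left out. The function names and the choice of contrast as the example texture feature are assumptions.

```python
def glcm(image, levels, dx=1, dy=0):
    """Count how often gray level i occurs at offset (dx, dy) from gray level j.

    `image` is a list of rows of integers in the range [0, levels).
    """
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


def contrast(matrix):
    """Contrast feature: mean squared gray-level difference over all pairs."""
    total = sum(sum(row) for row in matrix)
    weighted = sum(
        (i - j) ** 2 * count
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
    )
    return weighted / total


image = [
    [0, 0, 1],
    [1, 2, 2],
]
m = glcm(image, levels=3)  # horizontal neighbour pairs
print(m)            # [[1, 1, 0], [0, 0, 1], [0, 0, 1]]
print(contrast(m))  # 0.5
```

In a wavelet-based variant one would presumably apply the same counting to quantized subband coefficients, but that step is an assumption rather than something the abstract confirms.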
Because data streams are continuous, unbounded, and real-time, continuous queries over them are generally window-based. In most DSMSs, the windows placed on data streams are maintained by the query operators themselves. However, some operators cannot maintain their windows properly, and considerable redundancy and inconsistency may arise because tuples are copied heavily and operators interfere with one another. In this paper, we discuss window-based query processing in depth and analyze the window semantics of a query statement, then propose a query processing approach, MullayerQuery, that extracts the stream window and the operator windows from the query window. A strategy and several algorithms are given to keep the windows in a query consistent. The experiments show that MullayerQuery runs effectively and efficiently.
A hash tag is an important piece of metadata in micro blogs, used to mark topics or index messages. However, statistics show that most micro blogs carry no hash tag, which poses great challenges to the retrieval and analysis of these tagless micro blogs. In this paper, we summarize the similarities between micro blogs and short news messages, and then propose an algorithm named 5WTAG for detecting micro blog topics based on the 5W (When, Where, Who, What, How) model. Because the 5W attributes are the core components of an event description, there is a theoretical basis for expecting 5WTAG to extract the semantics of micro blogs properly. We describe the detailed procedure of 5WTAG, including candidate hash tag construction and recommendation computation. Finally, we verify the semantic correctness of the candidate hash tags and the effectiveness of the recommendation computation using a real data set from Sina Weibo.
To satisfy the multi-level requirements of information processing and to provide decision-makers and analysts in an enterprise with an up-to-the-second and accurate global view of the data, a three-level architecture, DB-ODS-DW, is put forward, in which the ODS serves as the connecting link between the level before it and the level after it. This paper studies the data update strategy, an essential technique of the ODS. In contrast to traditional ODS data update strategies, a new strategy for WAN-based, heterogeneous environments is proposed, in which XML is used to implement delta data updates across heterogeneous databases and to transmit the delta data file efficiently. The strategy has been applied in practice and has achieved very good results.
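The idea of shipping delta data as XML can be sketched as follows. The abstract gives no schema, so the element and attribute names (`delta`, `row`, `col`, `op`, `key`, `name`) and both function names are hypothetical, and an in-memory dictionary stands in for the target ODS table. Only Python's standard `xml.etree.ElementTree` module is used.

```python
import xml.etree.ElementTree as ET


def build_delta(changes):
    """Serialize (op, key, values) changes into a hypothetical XML delta file."""
    root = ET.Element("delta")
    for op, key, values in changes:
        row = ET.SubElement(root, "row", {"op": op, "key": str(key)})
        for column, value in (values or {}).items():
            ET.SubElement(row, "col", {"name": column}).text = str(value)
    return ET.tostring(root, encoding="unicode")


def apply_delta(xml_text, table):
    """Apply a delta file to `table`, a dict mapping key -> column dict."""
    for row in ET.fromstring(xml_text):
        op, key = row.get("op"), row.get("key")
        if op == "delete":
            table.pop(key, None)
        else:  # "insert" and "update" are both handled as an upsert
            table.setdefault(key, {}).update(
                {col.get("name"): col.text for col in row}
            )
    return table


delta = build_delta([
    ("insert", 1, {"name": "a"}),
    ("update", 2, {"qty": 5}),
    ("delete", 3, None),
])
ods = {"2": {"qty": "1", "name": "b"}, "3": {"name": "c"}}
apply_delta(delta, ods)
# ods now holds rows "1" and "2"; row "3" has been removed.
```

Because the delta is plain text with string-typed values, the same file can be consumed by databases from different vendors, which is the property a heterogeneous environment needs; type mapping and transport over the WAN are outside this sketch.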
Clustered routing protocols in wireless sensor networks (WSN) offer significant advantages for energy saving and data reprocessing. However, the reliability and security of communication under these protocols depend heavily on the cluster heads, where severe security problems are likely to arise. Therefore, a clustered routing protocol with distributed intrusion detection (DID) is proposed, in which all nodes in the WSN participate, using preset cipher keys, in detecting intrusive cluster heads in a distributed way. Experimental results demonstrate that, compared with classic cluster-based routing protocols in WSN, the protocol with DID consumes considerably less energy, and that its energy consumption has a linear relationship with the tolerance of intrusive nodes.