Mining frequent itemsets from data streams under the sliding-window model has been extensively studied. This paper presents an algorithm, AFPCFI-DS, for mining frequent itemsets from data streams. The algorithm detects the frequent itemsets using an FP-tree in each sliding window. When processing a new window, the algorithm first updates the header table and then modifies the FP-tree according to the changed items in the header table. The algorithm also adopts a local updating strategy to avoid the time-consuming operation of searching the whole tree when adding or deleting transactions. Our experimental results show that the algorithm is more efficient and has lower time and memory costs than the algorithms Moment and FPCFI-DS.
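The abstract does not specify the AFPCFI-DS data structures in detail, so the following is only a minimal sketch of the bookkeeping it describes: a header table of item supports maintained over a sliding window, with updates touching only the path of the transaction being added or removed. The class and method names (SlidingWindowFPTree, add_transaction) are illustrative assumptions, not the authors' implementation.

```python
from collections import defaultdict, deque

class FPNode:
    """A node of a simplified FP-tree: an item, a count, and child links."""
    def __init__(self, item=None, parent=None):
        self.item, self.parent = item, parent
        self.count = 0
        self.children = {}

class SlidingWindowFPTree:
    """Sketch: keep an FP-tree over the last `window_size` transactions.

    The header table maps each item to its support in the window; the
    local-update idea is approximated by adjusting only the path of the
    transaction being added or removed, never scanning the whole tree.
    """
    def __init__(self, window_size):
        self.window_size = window_size
        self.window = deque()
        self.header = defaultdict(int)   # item -> support in current window
        self.root = FPNode()

    def _update_path(self, transaction, delta):
        # Walk (or build) the path for one transaction and adjust counts.
        node = self.root
        for item in sorted(transaction):
            node = node.children.setdefault(item, FPNode(item, node))
            node.count += delta
            self.header[item] += delta

    def add_transaction(self, transaction):
        self.window.append(transaction)
        self._update_path(transaction, +1)
        if len(self.window) > self.window_size:   # slide the window
            old = self.window.popleft()
            self._update_path(old, -1)

    def frequent_items(self, min_support):
        return {i: c for i, c in self.header.items() if c >= min_support}

tree = SlidingWindowFPTree(window_size=3)
for t in [{"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]:
    tree.add_transaction(t)
print(tree.frequent_items(min_support=2))
```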
Mining frequent closed itemsets from data streams is an important topic. In this paper, we propose an algorithm for mining frequent closed itemsets from data streams based on a time-fading model. By dynamically constructing a pattern tree, the algorithm calculates the densities of the itemsets in the pattern tree using a fading factor. The algorithm deletes truly infrequent itemsets from the pattern tree so as to reduce memory cost. A density threshold function is designed to identify the truly infrequent itemsets that should be deleted; with this density threshold function, deleting the infrequent itemsets does not affect the result of frequent itemset detection. The algorithm modifies the pattern tree and detects the frequent closed itemsets at fixed time intervals so as to reduce computation time. We also analyse the error caused by deleting the infrequent itemsets. The experimental results indicate that our algorithm achieves higher accuracy and needs less memory and computation time than comparable algorithms.
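A worked sketch of the fading-factor density update described above is given below. The decay constant, the form of the pruning threshold, and the per-itemset table are illustrative assumptions; the paper's exact density threshold function is not given in the abstract.

```python
class FadingDensityTable:
    """Sketch: per-itemset density under a time-fading factor.

    When an itemset is observed at time t, its density is decayed from its
    last update time and then incremented:
        d(t) = lambda_ ** (t - t_last) * d(t_last) + 1
    Itemsets whose decayed density falls below `min_density` are pruned;
    this fixed threshold stands in for the paper's threshold function.
    """
    def __init__(self, lambda_=0.9, min_density=1.0):
        self.lambda_ = lambda_
        self.min_density = min_density
        self.table = {}          # frozenset(itemset) -> (density, last_time)

    def observe(self, itemset, t):
        key = frozenset(itemset)
        density, last = self.table.get(key, (0.0, t))
        self.table[key] = (density * self.lambda_ ** (t - last) + 1.0, t)

    def prune(self, t):
        """Delete itemsets whose decayed density is below the threshold."""
        dead = [k for k, (d, last) in self.table.items()
                if d * self.lambda_ ** (t - last) < self.min_density]
        for k in dead:
            del self.table[k]

    def frequent(self, t, min_support):
        return {k for k, (d, last) in self.table.items()
                if d * self.lambda_ ** (t - last) >= min_support}

table = FadingDensityTable(lambda_=0.9, min_density=1.0)
for t, itemset in enumerate([{"a"}, {"a"}, {"a", "b"}, {"a"}]):
    table.observe(itemset, t)
table.prune(t=4)
print(table.frequent(t=4, min_support=1.5))
```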
Text mining and analysis of power grid safety hidden danger records, together with in-depth study of the standards and norms for grid hidden danger investigation, can help power grid enterprises carry out hidden danger management efficiently and conveniently. First, the study explains the scope of the hidden danger problems faced by power grid enterprises and the direction of the research, summarizes the general process and methods of text mining, and reviews the current state of research on text mining for grid hidden danger investigation. Second, we examine the textual characteristics of grid hidden danger investigation records and, by text mining 412 existing grid hidden danger texts, obtain visualization results as well as the main manifestations of grid hidden dangers. Finally, we discuss the difficulties of text mining for grid hidden danger investigation and possible directions for future development.
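As a toy illustration of the frequency-counting step that typically underlies such keyword visualizations, the sketch below counts term occurrences across a few placeholder reports. The sample texts and stopword list are invented for the example; real grid hidden danger reports are Chinese and would first be segmented with a tokenizer such as jieba.

```python
from collections import Counter
import re

def keyword_frequencies(documents, stopwords=frozenset()):
    """Count keyword occurrences across hidden-danger reports.

    Whitespace/word splitting is used here only for the English placeholders;
    Chinese reports would need word segmentation before counting.
    """
    counts = Counter()
    for doc in documents:
        tokens = re.findall(r"\w+", doc.lower())
        counts.update(t for t in tokens if t not in stopwords)
    return counts

reports = [
    "transmission line insulator crack near tower 12",
    "vegetation too close to transmission line corridor",
    "insulator contamination found during inspection",
]
print(keyword_frequencies(reports, stopwords={"to", "too", "near", "the"}).most_common(5))
```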
Background: The essential proteins in protein networks play an important role in complex cellular functions and in protein evolution. Therefore, identifying essential proteins in a network can help explain the structure, function, and dynamics of basic cellular networks. Existing dynamic protein networks treat the protein components as identical at all time points; however, the role of a protein can vary over time. Results: To improve the accuracy of identifying essential proteins, an improved h-index algorithm based on an attenuation coefficient method is proposed in this paper. This method incorporates previously neglected node information to improve the accuracy of the essential protein search. It maintains the accuracy of the identified proteins while identifying more essential proteins. Conclusions: The described experiments show that this method is more effective than other similar methods in identifying essential proteins in dynamic protein networks. This study can better explain the mechanisms of life activities and provide a theoretical basis for the research and development of targeted drugs.
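A minimal sketch of the idea of a node-level h-index combined with an attenuation coefficient across network snapshots follows. The exponential weighting, the snapshot representation, and the function names are assumptions for illustration, not the paper's exact formulation.

```python
def h_index(values):
    """Classic h-index: the largest h such that at least h values are >= h."""
    values = sorted(values, reverse=True)
    h = 0
    for i, v in enumerate(values, start=1):
        if v >= i:
            h = i
        else:
            break
    return h

def attenuated_h_index(snapshots, protein, alpha=0.8):
    """Illustrative score: the protein's h-index in each network snapshot,
    weighted by an attenuation coefficient so older snapshots count less.

    `snapshots` is a list (oldest first) of adjacency dicts:
        {protein: set_of_neighbours}.
    The exponential weighting with `alpha` is an assumption for the sketch.
    """
    T = len(snapshots)
    score = 0.0
    for t, adj in enumerate(snapshots):
        neighbours = adj.get(protein, set())
        degrees = [len(adj.get(n, set())) for n in neighbours]
        score += (alpha ** (T - 1 - t)) * h_index(degrees)
    return score

# Toy dynamic network with two time points.
g1 = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B"}}
g2 = {"A": {"B"}, "B": {"A", "C", "D"}, "C": {"B"}, "D": {"B"}}
print(attenuated_h_index([g1, g2], "B"))
```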
Mining frequent itemsets from data streams is an important task in stream data mining. This paper presents an algorithm, Stream_FCI, for mining frequent closed itemsets from data streams under the sliding-window model. The algorithm detects the frequent closed itemsets in each sliding window using a DFP-tree with a header table. When processing a new transaction, the algorithm updates the header table and modifies the DFP-tree according to the changed items in the header table. The algorithm also uses a table to store the frequent closed itemsets so as to avoid the time-consuming operation of searching the whole DFP-tree when adding or deleting transactions. Our experimental results show that our algorithm is more efficient and has lower time and memory costs than the similar algorithms Moment and FPCFI-DS.
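For reference, the sketch below computes the frequent closed itemsets of one window by brute force, using the standard definition that an itemset is closed if no proper superset has the same support. It is not the DFP-tree or the closed-itemset table of Stream_FCI; it only illustrates what the algorithm maintains incrementally.

```python
from itertools import combinations
from collections import Counter, deque

def closed_frequent_itemsets(window, min_support):
    """Naive reference: enumerate itemsets in the current window, keep the
    frequent ones, and retain only those with no proper superset of equal
    support (the closedness condition)."""
    support = Counter()
    for transaction in window:
        items = sorted(transaction)
        for k in range(1, len(items) + 1):
            for combo in combinations(items, k):
                support[frozenset(combo)] += 1
    frequent = {s: c for s, c in support.items() if c >= min_support}
    closed = {}
    for s, c in frequent.items():
        if not any(s < other and c == frequent[other] for other in frequent):
            closed[s] = c
    return closed

window = deque([{"a", "b"}, {"a", "b", "c"}, {"a", "c"}], maxlen=3)
print(closed_frequent_itemsets(window, min_support=2))
```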
Influenced by a combination of factors such as diet, living habits, and the external environment, lung cancer has become a high-incidence, high-risk disease worldwide. To support the prevention and treatment of lung cancer, a question-answering system based on a knowledge graph is built to provide intelligent auxiliary diagnosis and treatment, quickly and accurately answering the questions raised by patients. In this article, lung cancer medical case data are used to construct a lung cancer knowledge graph, and a question dataset is built from templates and augmented to increase its volume and diversity. We then build a multi-task learning model that performs question intent recognition and question entity recognition at the same time; the model learns the relationships between the tasks, improves the performance of both, and shortens training and inference time, effectively reducing training cost. Finally, the model analyses each question to obtain its entities and relations, and the answer is retrieved from the knowledge graph.
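The abstract does not describe the architecture, so the following is only a minimal sketch of a shared-encoder multi-task model with an intent-classification head and a token-level entity-tagging head. The BiLSTM encoder, layer sizes, and label counts are placeholder assumptions.

```python
import torch
import torch.nn as nn

class MultiTaskQAModel(nn.Module):
    """Shared BiLSTM encoder with two task heads: one classifies the intent
    of the whole question, the other tags each token for entity recognition.
    All sizes and label counts are placeholders for the sketch."""
    def __init__(self, vocab_size, n_intents, n_entity_tags,
                 emb_dim=128, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.encoder = nn.LSTM(emb_dim, hidden, batch_first=True,
                               bidirectional=True)
        self.intent_head = nn.Linear(2 * hidden, n_intents)      # sentence level
        self.entity_head = nn.Linear(2 * hidden, n_entity_tags)  # token level

    def forward(self, token_ids):
        x = self.embed(token_ids)                  # (batch, seq, emb)
        enc, _ = self.encoder(x)                   # (batch, seq, 2*hidden)
        intent_logits = self.intent_head(enc.mean(dim=1))  # (batch, n_intents)
        entity_logits = self.entity_head(enc)      # (batch, seq, n_entity_tags)
        return intent_logits, entity_logits

model = MultiTaskQAModel(vocab_size=5000, n_intents=6, n_entity_tags=9)
tokens = torch.randint(1, 5000, (2, 12))           # a dummy batch of questions
intent_logits, entity_logits = model(tokens)
# Joint training would sum a cross-entropy loss on the intent with a
# cross-entropy loss over the per-token entity tags.
```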
We tackle the problem of answering maximum probabilistic top-k tuple set queries. We use a sliding-window model on uncertain data streams and present an efficient algorithm for processing sliding-window queries on uncertain streams. In each sliding window, the algorithm forms candidate sets from different numbers of the highest-scoring tuples and, within each set, selects the k tuples with the highest probabilities. It then computes the existential probability of each candidate top-k set and chooses the set with the highest probability as the top-k query result. We theoretically prove the correctness of the algorithm. Our experimental results show that our algorithm requires less time and space than other existing algorithms.
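A sketch of this candidate-set search over one window is given below. The abstract does not define the existential-probability formula, so the product form used here (the chosen k tuples exist while the other tuples in the score prefix do not) is an assumption, as are the tuple representation and function name.

```python
from collections import deque
from math import prod

def best_topk_tuple_set(window, k):
    """Sketch: each tuple is (score, probability).  For every prefix of the
    m highest-scoring tuples (m = k, k+1, ...), pick the k tuples with the
    highest probabilities, score the candidate set by an assumed existential
    probability, and return the best candidate set."""
    ranked = sorted(window, key=lambda t: t[0], reverse=True)
    best_set, best_prob = None, -1.0
    for m in range(k, len(ranked) + 1):
        prefix = ranked[:m]
        chosen = sorted(prefix, key=lambda t: t[1], reverse=True)[:k]
        rest = [t for t in prefix if t not in chosen]
        prob = prod(p for _, p in chosen) * prod(1 - p for _, p in rest)
        if prob > best_prob:
            best_set, best_prob = chosen, prob
    return best_set, best_prob

window = deque([(9.0, 0.3), (8.5, 0.9), (7.0, 0.8), (6.5, 0.6)], maxlen=4)
print(best_topk_tuple_set(window, k=2))
```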