Dawei Zhou

Virginia Tech

TrustLOG: The First Workshop on Trustworthy Learning on Graphs

Proceedings of the 31st ACM International Conference on Information & Knowledge Management (2022)

Jian Kang Shuaicheng Zhang Bo Li Jingrui He Jian Pei

Learning on graphs (LOG) plays a pivotal role in various high-impact application domains. Over the past decades, researchers have developed a wealth of theories, algorithms, and open-source systems for answering what/who questions on graphs. However, recent studies reveal that state-of-the-art LOG techniques are often not trustworthy in practice with respect to several social aspects (e.g., fairness, transparency, security). A natural research question is: how can we make learning algorithms on graphs trustworthy? To answer this question, we propose a paradigm shift from answering what and who LOG questions to understanding how and why LOG questions. The TrustLOG workshop provides a venue for presenting, discussing, and promoting frontier research on trustworthy learning on graphs. Moreover, TrustLOG will serve as an impetus for the LOG community to identify novel research problems and shed new light on future directions.

Trustworthiness

10.1145/3511808.3557497

Citations (1)

Discovering rare categories from graph streams

Data Mining and Knowledge Discovery (2016)

Dawei Zhou Arun Kumar Karthikeyan Kangyang Wang Nan Cao Jingrui He

Streaming Data

10.1007/s10618-016-0478-6

Citations (16)

UnifiedGT: Towards a Universal Framework of Transformers in Large-Scale Graph Learning

2024 IEEE International Conference on Big Data (Big Data) (2024)

Junhong Lin Xiaojie Guo Shuaicheng Zhang Dawei Zhou Yada Zhu

10.1109/bigdata62323.2024.10825327

Citations (0)

Bi-Level Rare Temporal Pattern Detection

IEEE Conference Proceedings (2016)

Dawei Zhou Jingrui He Yu Cao Seo Jae-Sun

Rare events

Pattern detection

Citations (0)

Fairness-Aware Clique-Preserving Spectral Clustering of Temporal Graphs

Proceedings of the ACM Web Conference 2023 (2023)

Dongqi Fu Dawei Zhou Ross Maciejewski Arie Croitoru Marcus A. Boyd

With the widespread development of algorithmic fairness, there has been a surge of research interest in generalizing fairness notions from attributed data to relational data (graphs). The vast majority of existing work considers the fairness measure in terms of low-order connectivity patterns (e.g., edges), while overlooking higher-order patterns (e.g., k-cliques) and the dynamic nature of real-world graphs. For example, preserving triangles from graph cuts during clustering is the key to detecting compact communities; however, if the clustering algorithm only pays attention to triangle-based compactness, then the returned communities lose the fairness guarantee for each group in the graph. Furthermore, in practice, when the topology of a graph (e.g., a social network) constantly changes over time, one natural question is how we can efficiently ensure compactness and demographic parity at each timestamp. To address these problems, we start from the static setting and propose a spectral method that simultaneously preserves clique connections and incorporates demographic fairness constraints in the returned clusters. To adapt this static method to the dynamic setting, we propose two core techniques, Laplacian Update via Edge Filtering and Searching and Eigen-Pairs Update with Singularity Avoided. Finally, all proposed components are combined into an end-to-end clustering framework named F-SEGA, and we conduct extensive experiments to demonstrate the effectiveness, efficiency, and robustness of F-SEGA.

Robustness

Clique

10.1145/3543507.3583423

Citations (6)

Augmenting Knowledge Transfer across Graphs

2022 IEEE International Conference on Data Mining (ICDM) (2022)

Yuzhen Mao Jianhui Sun Dawei Zhou

Given a resource-rich source graph and a resource-scarce target graph, how can we effectively transfer knowledge across graphs and ensure good generalization performance? In many high-impact domains (e.g., brain networks and molecular graphs), collecting and annotating data is prohibitively expensive and time-consuming, which makes domain adaptation an attractive option for alleviating the label scarcity issue. In light of this, state-of-the-art methods focus on deriving domain-invariant graph representations that minimize the domain discrepancy. However, it has recently been shown that a small domain discrepancy loss may not always guarantee good generalization performance, especially in the presence of disparate graph structures and label distribution shifts. In this paper, we present TRANSNET, a generic learning framework for augmenting knowledge transfer across graphs. In particular, we introduce a novel notion named trinity signal that can naturally formulate various graph signals at different granularities (e.g., node attributes, edges, and subgraphs). With that, we further propose a domain unification module together with a trinity-signal mixup scheme to jointly minimize the domain discrepancy and augment the knowledge transfer across graphs. Finally, comprehensive empirical results show that TRANSNET outperforms all existing approaches on seven benchmark datasets by a significant margin.

Knowledge Transfer

Domain Adaptation

10.1109/icdm54844.2022.00138

Citations (0)

A systematic evaluation of computational methods for cell segmentation

Briefings in Bioinformatics (2024)

Yuxing Wang Junhan Zhao Hongye Xu Cheng Han Zhiqiang Tao

Cell segmentation is a fundamental task in analyzing biomedical images. Many computational methods have been developed for cell segmentation and instance segmentation, but their performance across different scenarios is not well understood. We systematically evaluated the performance of 18 segmentation methods on cell nuclei and whole cell segmentation using light microscopy and fluorescence staining images. We found that general-purpose methods incorporating the attention mechanism exhibit the best overall performance. We identified various factors influencing segmentation performance, including image channels, choice of training data, and cell morphology, and evaluated the generalizability of methods across image modalities. We also provide guidelines for choosing the optimal segmentation methods in various real application scenarios. We developed Seggal, an online resource for downloading segmentation models already pre-trained with various tissue and cell types, substantially reducing the time and effort for training cell segmentation models.

Segmentation-based object categorization

10.1093/bib/bbae407

Citations (2)

MentorGNN

Proceedings of the 31st ACM International Conference on Information & Knowledge Management (2022)

Dawei Zhou Lecheng Zheng Dongqi Fu Jiawei Han Jingrui He

Graph pre-training strategies have been attracting a surge of attention in the graph mining community, due to their flexibility in parameterizing graph neural networks (GNNs) without any label information. The key idea lies in encoding valuable information into the backbone GNNs by predicting the masked graph signals extracted from the input graphs. To balance the importance of diverse graph signals (e.g., nodes, edges, subgraphs), existing approaches are mostly hand-engineered, introducing hyperparameters to re-weight the importance of graph signals. However, human intervention with sub-optimal hyperparameters often injects additional bias and degrades generalization performance in downstream applications. This paper addresses these limitations from a new perspective, i.e., deriving curriculum for pre-training GNNs. We propose an end-to-end model named MentorGNN that aims to supervise the pre-training process of GNNs across graphs with diverse structures and disparate feature spaces. To comprehend heterogeneous graph signals at different granularities, we propose a curriculum learning paradigm that automatically re-weights graph signals in order to ensure good generalization in the target domain. Moreover, we shed new light on the problem of domain adaptation on relational data (i.e., graphs) by deriving a natural and interpretable upper bound on the generalization error of the pre-trained GNNs. Extensive experiments on a wealth of real graphs validate the performance of MentorGNN.

Hyperparameter

10.1145/3511808.3557393

Citations (5)

Graph of Logic: Enhancing LLM Reasoning with Graphs and Symbolic Logic

2024 IEEE International Conference on Big Data (Big Data) (2024)

Fatimah Alotaibi Adithya Kulkarni Dawei Zhou

10.1109/bigdata62323.2024.10825450

Citations (0)

Fairgen: Towards Fair Graph Generation

2024 IEEE 40th International Conference on Data Engineering (ICDE) (2024)

Lecheng Zheng Dawei Zhou Hanghang Tong Jiejun Xu Yada Zhu

10.1109/icde60146.2024.00181

Citations (0)