To address the shortcomings of the ant colony algorithm in complex environments, such as its tendency to fall into local optima and the difficulty of guaranteeing real-time path planning for robots, this paper proposes a dynamic window approach based on an improved ant colony algorithm (IACO-DWA). To avoid blind search by the ants in the early stage, the method designs an adaptive distance heuristic factor and combines it with the max-min ant system (MMAS) to improve the pheromone update rule and prevent convergence to local optima. The probability transfer rule is improved by constructing a corner suppression factor to reduce the number of path inflection points, and the DWA tracks the global path points generated by the ant colony through a newly constructed position evaluation function, yielding a smooth planned trajectory. Simulation results show that the proposed method strengthens global path optimization while achieving local dynamic obstacle avoidance.
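The MMAS-style pheromone update credited above with avoiding local optima can be sketched as follows. This is a minimal illustration, not the paper's implementation; the parameter names (`rho`, `tau_min`, `tau_max`) and the deposit rule are generic MMAS conventions, assumed for exposition.

```python
import numpy as np

def mmas_update(tau, best_path, best_len, rho=0.1, tau_min=0.01, tau_max=1.0):
    """One MMAS pheromone update: evaporate on all edges, deposit only on the
    iteration-best path, then clamp to [tau_min, tau_max] to avoid stagnation."""
    tau = (1.0 - rho) * tau                          # global evaporation
    for i, j in zip(best_path[:-1], best_path[1:]):
        tau[i, j] += 1.0 / best_len                  # deposit on best path only
    return np.clip(tau, tau_min, tau_max)            # MMAS pheromone bounds

tau = np.full((4, 4), 0.5)                           # toy 4-node pheromone matrix
tau = mmas_update(tau, best_path=[0, 1, 3], best_len=2.0)
```

Clamping to `[tau_min, tau_max]` keeps the best edges from dominating the transition probabilities completely, which is the mechanism MMAS uses to escape local optima.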
Optical cryptanalysis based on deep learning (DL) has attracted increasing attention. However, most DL methods are purely data-driven and lack relevant physical priors, which restrains their generalization ability and limits practical applications. In this paper, we demonstrate that double-random phase encoding (DRPE)-based optical cryptosystems are susceptible to a preprocessing ciphertext-only attack (pCOA) based on DL strategies, which achieves high prediction fidelity for complex targets using only one random phase mask (RPM) for training. After preprocessing the ciphertext to extract substantial intrinsic information, a DL method incorporating physical priors is exploited to further learn the statistical invariants across different ciphertexts. As a result, generalization ability is significantly improved by increasing the number of training RPMs. The method also breaks the image-size limitation of traditional COA methods. Optical experiments demonstrate the feasibility and effectiveness of the proposed learning-based pCOA method.
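For readers unfamiliar with the target cryptosystem, the standard 4f DRPE forward model can be sketched in a few lines. This is the textbook encoding, not the attack itself; array sizes and the seeded generator are illustrative assumptions.

```python
import numpy as np

def drpe_encrypt(img, rpm1, rpm2):
    """Double-random phase encoding (4f setup): one random phase mask in the
    spatial domain, a second in the Fourier domain; output is a complex field."""
    field = img * np.exp(2j * np.pi * rpm1)                      # input-plane RPM
    spectrum = np.fft.fft2(field) * np.exp(2j * np.pi * rpm2)    # Fourier-plane RPM
    return np.fft.ifft2(spectrum)                                # ciphertext

rng = np.random.default_rng(0)
img = rng.random((8, 8))                 # toy real-valued plaintext
m1, m2 = rng.random((8, 8)), rng.random((8, 8))
c = drpe_encrypt(img, m1, m2)
```

With both masks known, decryption simply undoes the Fourier-plane mask and takes the modulus; the pCOA setting is harder because the attacker sees only `c`.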
Vision-language models have been widely explored across a wide range of tasks and achieve satisfactory performance. However, it remains under-explored how to consolidate entity understanding from a varying number of images and align it with pre-trained language models for generative tasks. In this paper, we propose MIVC, a general multiple-instance visual component that bridges the gap between variable image inputs and off-the-shelf vision-language models by aggregating visual representations in a permutation-invariant fashion through a neural network. We show that MIVC can be plugged into vision-language models to consistently improve performance on visual question answering, classification, and captioning tasks on a publicly available e-commerce dataset with multiple images per product. Furthermore, we show that the component provides insight into the contribution of each image to the downstream tasks.
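Permutation-invariant aggregation of a variable number of image features can be illustrated with a generic attention-style multiple-instance pooling. This is a sketch of the general idea, not MIVC's actual architecture; the weight shapes and the gating form are assumptions.

```python
import numpy as np

def attention_pool(feats, w, v):
    """Attention pooling over n instance features (n x d). Each score depends
    only on its own instance, so the weighted sum is permutation-invariant,
    and the per-instance weights indicate each image's contribution."""
    scores = np.tanh(feats @ v) @ w                  # (n,) unnormalised scores
    alpha = np.exp(scores) / np.exp(scores).sum()    # softmax attention weights
    return alpha @ feats                             # single pooled vector (d,)

rng = np.random.default_rng(1)
feats = rng.standard_normal((5, 16))     # e.g. 5 product images, 16-d features
v, w = rng.standard_normal((16, 8)), rng.standard_normal(8)
pooled = attention_pool(feats, w, v)
```

Because the output is identical under any reordering of the images, the component can accept however many images a product happens to have.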
Detecting forged videos is highly desirable due to the abuse of deepfakes. Existing detection approaches explore specific artifacts in deepfake videos and fit well on certain data. However, evolving forgery techniques keep challenging the robustness of these artifact-based detectors, and progress on their generalizability has consequently stalled. To address this issue, given the empirical observations that the identities behind voices and faces are often mismatched in deepfake videos, and that voices and faces are homogeneous to some extent, we propose in this paper to perform deepfake detection from a previously unexplored voice-face matching view. To this end, a voice-face matching method is devised to measure the matching degree between the two modalities. Nevertheless, training on specific deepfake datasets makes the model overfit to certain traits of deepfake algorithms. We instead advocate a method that quickly adapts to unseen forgeries via a pre-training then fine-tuning paradigm. Specifically, we first pre-train the model on a generic audio-visual dataset and then fine-tune it on downstream deepfake data. We conduct extensive experiments on three widely used deepfake datasets: DFDC, FakeAVCeleb, and DeepfakeTIMIT. Our method obtains significant performance gains over other state-of-the-art competitors. It is also worth noting that our method achieves competitive results when fine-tuned on limited deepfake data.
Xinya Du, Luheng He, Qi Li, Dian Yu, Panupong Pasupat, Yuan Zhang. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers). 2021.
Detecting forged videos is highly desirable due to the abuse of deepfakes. Existing detection approaches explore specific artifacts in deepfake videos and fit well on certain data. However, evolving forgery techniques keep challenging the robustness of these artifact-based detectors, and their development has consequently stalled. In this article, we propose to perform deepfake detection from a previously unexplored voice-face matching view. Our approach rests on two supporting points: first, there is a high degree of homogeneity between the voice and face of an individual (i.e., they are highly correlated), and second, deepfake videos often involve mismatched identities between voice and face due to face-swapping techniques. To this end, we develop a voice-face matching method that measures the matching degree between these two modalities to identify deepfake videos. Nevertheless, training on specific deepfake datasets makes the model overfit to certain traits of deepfake algorithms. We instead advocate a method that quickly adapts to unseen forgeries via a pre-training then fine-tuning paradigm. Specifically, we first pre-train the model on a generic audio-visual dataset and then fine-tune it on downstream deepfake data. We conduct extensive experiments on three widely used deepfake datasets: DFDC, FakeAVCeleb, and DeepfakeTIMIT. Our method obtains significant performance gains over other state-of-the-art competitors. For instance, it outperforms the baselines by nearly 2%, achieving an AUC of 86.11% on FakeAVCeleb. It is also worth noting that our method achieves competitive results when fine-tuned on limited deepfake data.
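The matching-degree idea behind both deepfake abstracts above can be illustrated with a simple embedding-similarity check. This is only a conceptual sketch: the embedding extractors, the similarity measure, and the threshold value are all assumptions, not the papers' trained models.

```python
import numpy as np

def matching_score(voice_emb, face_emb):
    """Cosine similarity between L2-normalised voice and face embeddings.
    A high score means the two modalities plausibly share an identity."""
    v = voice_emb / np.linalg.norm(voice_emb)
    f = face_emb / np.linalg.norm(face_emb)
    return float(v @ f)

def is_suspect(voice_emb, face_emb, threshold=0.5):
    """Flag a clip as a possible deepfake when voice and face disagree.
    The threshold here is a hypothetical placeholder, not a tuned value."""
    return matching_score(voice_emb, face_emb) < threshold
```

In the papers' setting the embeddings come from a model pre-trained on generic audio-visual data, so the detector does not depend on the artifacts of any particular forgery algorithm.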
Computational ghost imaging (CGI), in which an image is retrieved from the known speckle patterns that illuminate the object and the total transmitted intensity, has advanced greatly because of its advantages and potential applications at all wavelengths. However, achieving high-quality imaging with short acquisition times has proven challenging, especially in color CGI. In this paper, we present a new color CGI method that reconstructs high-fidelity images at a relatively low sampling rate (0.0625) using the plug-and-play generalized alternating projection (PnP-GAP) algorithm. The spatial distribution and color information of the object are simultaneously encoded into a one-dimensional light-intensity sequence, measured by a single-pixel detector, by combining randomly distributed speckle patterns with a Bayer color mask as modulation patterns. A pre-trained deep denoising network is utilized in the PnP-GAP algorithm to achieve better results. Furthermore, a joint reconstruction and demosaicking method is developed to restore the target's color information more realistically. Simulations and optical experiments verify the feasibility and superiority of the proposed scheme in comparison with other classical reconstruction algorithms. This new color CGI scheme will enable CGI to acquire information from real scenes more effectively and further promote its practical applications.
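The CGI measurement model underlying this scheme can be sketched with a basic differential correlation reconstruction. This illustrates only the single-pixel forward model and the classical baseline, not PnP-GAP itself (which replaces the correlation step with projections plus a learned denoiser); the toy object and pattern count are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 16
obj = np.zeros((n, n))
obj[4:12, 4:12] = 1.0                      # toy binary object

# each speckle pattern illuminates the object; a single-pixel ("bucket")
# detector records only the total transmitted intensity per pattern
patterns = rng.random((400, n, n))
y = (patterns * obj).sum(axis=(1, 2))      # 1-D intensity sequence

# classical differential ghost-imaging estimate: correlate the fluctuations
# of the bucket signal with the fluctuations of the patterns
recon = ((y - y.mean())[:, None, None] * (patterns - patterns.mean(axis=0))).mean(axis=0)
```

The correlation estimate recovers the object up to scale and noise; PnP-GAP-style methods get far better quality at low sampling rates by iterating a data-consistency projection with a pre-trained denoiser.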
Given a query graph, subgraph matching is the process of finding all isomorphic subgraphs in a large data graph. It is one of the fundamental steps of many graph-based applications, including recommendation systems, information retrieval, and social network analysis. In this paper, we investigate the problem of subgraph matching over a power grid knowledge graph. Since a knowledge graph is modelled as a directed, labelled graph with multiple edges, it brings new challenges for subgraph matching. One challenge is that the complexity of computing matching candidates grows as the number of edges increases. Another is that the search space of isomorphic subgraphs for a given region is huge, requiring more system resources to prune unpromising candidates. To address these challenges, we propose a subgraph index to accelerate the processing of subgraph queries. We use domain-specific information to build an index over the power grid knowledge graph and maintain only a small portion of candidates in the search space. Experimental studies on a real knowledge graph and synthetic graphs demonstrate that the proposed techniques are efficient compared with their counterparts.
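The candidate-pruning role of such an index can be sketched with a simple label-plus-degree filter. This is a generic illustration of index-based pruning, not the paper's index; the power-grid labels and the adjacency structure are hypothetical.

```python
from collections import defaultdict

def build_index(vertex_labels):
    """Label -> vertex-set index over the data graph."""
    idx = defaultdict(set)
    for v, lbl in vertex_labels.items():
        idx[lbl].add(v)
    return idx

def candidates(q_label, q_degree, idx, adjacency):
    """Only data vertices with the same label and at least the query vertex's
    degree can match, so everything else is pruned before enumeration."""
    return {v for v in idx.get(q_label, set()) if len(adjacency[v]) >= q_degree}

# toy power-grid-style data graph (labels and topology are made up)
labels = {1: "bus", 2: "bus", 3: "line", 4: "transformer"}
adj = {1: {3}, 2: {3, 4}, 3: {1, 2}, 4: {2}}
idx = build_index(labels)
```

Restricting enumeration to these candidate sets is what keeps the isomorphism search space small; real systems layer further domain-specific filters on top.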