The advent of Large Language Models (LLMs) has shown their potential to improve relevance and provide direct answers in web search. However, challenges arise in validating the reliability of generated results and the credibility of contributing sources, owing to the limitations of traditional information retrieval algorithms and the LLM hallucination problem. Aiming to create a "PageRank" for the LLM era, we strive to transform LLMs into relevant, responsible, and trustworthy searchers. We propose a novel generative retrieval framework that leverages the knowledge of LLMs to foster a direct link between queries and online sources. This framework consists of three core modules: a Generator, a Validator, and an Optimizer, which generate trustworthy online sources, verify source reliability, and refine unreliable sources, respectively. Extensive experiments and evaluations highlight our method's superior relevance, responsibility, and trustworthiness against various SOTA methods.
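A minimal sketch of how such a generate-validate-optimize loop could be wired together is shown below; the `llm` callable, prompts, and module interfaces are hypothetical placeholders, not the paper's actual API.

```python
# Hypothetical sketch of a generate-validate-optimize retrieval loop.
# The llm() helper and all prompts are illustrative placeholders.

def generate_sources(llm, query, k=5):
    """Ask the LLM to propose candidate online sources for a query."""
    prompt = f"List {k} authoritative web sources that answer: {query}"
    return [s for s in llm(prompt).splitlines() if s.strip()]

def validate_source(llm, query, source):
    """Ask the LLM to judge whether a source reliably supports the query."""
    prompt = f"Does the source {source} reliably answer '{query}'? Reply YES or NO."
    return llm(prompt).strip().upper().startswith("YES")

def optimize_source(llm, query, source):
    """Ask the LLM to replace an unreliable source with a better one."""
    prompt = f"{source} is unreliable for '{query}'. Suggest a more trustworthy source."
    return llm(prompt).strip()

def retrieve(llm, query, max_rounds=3):
    """Generate candidate sources, then iteratively validate and refine them."""
    sources = generate_sources(llm, query)
    for _ in range(max_rounds):
        sources = [s if validate_source(llm, query, s)
                   else optimize_source(llm, query, s) for s in sources]
    return sources
```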
Recent neural language models have taken a significant step forward in producing remarkably controllable, fluent, and grammatical text. Although studies have found that crowd-sourcing workers cannot reliably distinguish AI-generated text from human-written text, AI-generated text still contains errors that are even subtler and harder to spot. We primarily focus on the scenario in which a scientific AI writing assistant is deeply involved. First, we construct a feature description framework, grounded in human evaluation, to distinguish AI-generated text from human-written text at the levels of syntax, semantics, and pragmatics. Then we use the features from the proposed framework, i.e., writing style, coherence, consistency, and argument logic, to analyze the two types of content. Finally, we adopt several publicly available AI-generated scientific text detection models to investigate the gap between AI-generated and human-written scientific text. The results suggest that while AI has the potential to generate scientific content that is as accurate as human-written content, there is still a gap in depth and overall quality, and AI-generated scientific content is more likely to contain factual errors. We also find a "writing style" gap between AI-generated and human-written scientific text. Based on these analyses, we summarize a series of model-agnostic and distribution-agnostic features for detection tasks in other domains. The findings in this paper help guide the optimization of AI models to produce high-quality content and address related ethical and security concerns.
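As a rough illustration of feature-based detection, the sketch below trains a simple classifier on a few surface proxies for writing style; these features and the scikit-learn setup are stand-ins chosen for illustration, not the framework's actual feature set.

```python
# Illustrative feature-based detector: the features below are crude proxies for
# style/coherence signals, not the paper's actual feature description framework.
import numpy as np
from sklearn.linear_model import LogisticRegression

def extract_features(text: str) -> np.ndarray:
    sentences = [s for s in text.split(".") if s.strip()]
    words = text.split()
    avg_sent_len = len(words) / max(len(sentences), 1)                # style proxy
    type_token_ratio = len(set(w.lower() for w in words)) / max(len(words), 1)
    avg_word_len = sum(len(w) for w in words) / max(len(words), 1)
    return np.array([avg_sent_len, type_token_ratio, avg_word_len])

def train_detector(texts, labels):
    """texts: list[str]; labels: 1 = human-written, 0 = AI-generated."""
    X = np.vstack([extract_features(t) for t in texts])
    return LogisticRegression(max_iter=1000).fit(X, labels)
```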
Deep learning (DL) has shown superior performance in many areas, making the quality assurance of DL-based software particularly important. Adversarial examples are generated by deliberately adding subtle perturbations to input samples and can easily attack less reliable DL models. Most existing works use only a single metric to evaluate the generated adversarial examples, such as attack success rate or structural similarity, so they can neither rule out extreme testing situations nor provide multifaceted evaluation results. This paper presents MetaA, a multi-dimensional evaluation framework for the testing ability of adversarial examples in deep learning, where evaluating testing ability means measuring testing performance in order to make improvements. Specifically, MetaA validates generated adversarial examples comprehensively along two horizontal and five vertical dimensions. We design MetaA according to the definition of adversarial examples and the issue raised in [1] of how to enrich evaluation dimensions rather than merely quantify the improvement of DL software. We conduct several vertical and horizontal analyses and comparative experiments to evaluate the reliability and effectiveness of MetaA. The experimental results show that MetaA avoids one-sided speculation and reconciles different indicators when they give inconsistent results. The detailed and comprehensive analysis of evaluation results can further guide the optimization of adversarial examples and the quality assurance of DL-based software.
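To make the idea of multi-dimensional evaluation concrete, the sketch below reports several metrics for a batch of adversarial examples side by side instead of a single score; the specific metrics and the classifier's `predict` interface are illustrative assumptions, not MetaA's actual horizontal and vertical dimensions.

```python
# Illustrative multi-metric evaluation of adversarial examples; attack success
# rate, mean L2 perturbation, and SSIM are stand-in dimensions for illustration.
import numpy as np
from skimage.metrics import structural_similarity as ssim

def evaluate_adversarial(model, clean, adv, labels):
    """clean, adv: float arrays of shape (n, H, W); labels: true class labels."""
    preds = np.asarray(model.predict(adv))
    report = {
        "attack_success_rate": float(np.mean(preds != np.asarray(labels))),
        "mean_l2_perturbation": float(np.mean(
            np.linalg.norm((adv - clean).reshape(len(adv), -1), axis=1))),
        "mean_ssim": float(np.mean(
            [ssim(c, a, data_range=c.max() - c.min()) for c, a in zip(clean, adv)])),
    }
    return report  # several dimensions reported together rather than one number
```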
Large-scale language models (LLMs) have demonstrated impressive performance, but their deployment is challenging due to their significant memory usage. This issue can be alleviated through quantization. In this paper, we identify that the difficulty of quantizing activations in LLMs arises from varying ranges across channels, rather than solely from the presence of outliers. To address this challenge, we introduce RPTQ, a reorder-based quantization method. By rearranging the channels and quantizing them in clusters, RPTQ effectively mitigates the impact of range differences between channels. To minimize the overhead of the reorder operation, we fuse it into the layer norm operation and the weights of linear layers. In our experiments, RPTQ achieved a significant breakthrough by utilizing 3-bit activations in LLMs for the first time, resulting in a substantial reduction in memory usage. For instance, quantizing OPT-175B can reduce memory consumption by up to 80%.
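A minimal sketch of the per-cluster idea follows: channels are grouped by their activation ranges and each group gets its own scale and zero-point. The use of KMeans on per-channel (min, max) pairs is an assumption for illustration, and the sketch omits fusing the reorder into layer norm and linear weights.

```python
# Minimal sketch of reorder-based activation quantization: cluster channels by
# their (min, max) ranges, then quantize each cluster with its own scale/zero-point.
import numpy as np
from sklearn.cluster import KMeans

def reorder_quantize(acts: np.ndarray, n_clusters: int = 4, n_bits: int = 3):
    """acts: activations of shape (tokens, channels). Returns dequantized acts."""
    ranges = np.stack([acts.min(axis=0), acts.max(axis=0)], axis=1)  # per-channel (min, max)
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(ranges)
    out = np.empty_like(acts, dtype=np.float64)
    qmax = 2 ** n_bits - 1
    for c in range(n_clusters):
        cols = labels == c
        lo, hi = acts[:, cols].min(), acts[:, cols].max()
        scale = (hi - lo) / qmax if hi > lo else 1.0
        q = np.clip(np.round((acts[:, cols] - lo) / scale), 0, qmax)
        out[:, cols] = q * scale + lo          # dequantize to inspect quantization error
    return out, labels                         # labels: cluster assignment used for reordering
```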
This research is mainly concerned with establishing the vocabulary learning needs and goals of Engineering students from Southeast Asia studying at British universities. The research was motivated by the need to enhance the reading skills of these students. Subtechnical and technical vocabulary are the focus of this investigation. The research is based on data derived from a 536,051-word corpus of text from recommended Engineering textbooks. The relative frequency and range of lexis within the corpus were found to be good criteria for identifying subtechnical and technical vocabulary. The students proved to have a better receptive knowledge of subtechnical than of technical vocabulary. The research suggests that there is a need for collaborative work between ESP teachers and subject teachers to help the students with technical vocabulary.
The thesis is divided into nine chapters. Chapter One reviews the literature relevant to the research: it clarifies various definitions and concepts, describes the research approach, and provides a framework for the thesis. Chapter Two investigates my subjects' overall vocabulary knowledge. Chapter Three introduces preliminary data that contrasts with received opinions in ESP regarding technical and subtechnical vocabulary. For further investigation of these two types of vocabulary, Chapter Four describes the data on which the empirical studies are based. Chapter Five analyses the data. Chapter Six presents the empirical studies and concludes that the students' receptive knowledge of subtechnical vocabulary is better than their knowledge of technical vocabulary. Chapter Seven examines the reasons why technical vocabulary was problematic. Chapter Eight summarises the research findings and proposes pedagogical implications for the teaching of subtechnical and technical vocabulary to the specified group of learners. Chapter Nine draws conclusions, discusses the limitations of the research, and makes recommendations for future research.
Retrieval-Augmented Generation (RAG) is applied to mitigate the hallucination problems and real-time constraints of large language models, but it also introduces vulnerabilities to retrieval corruption attacks. Existing research mainly explores the unreliability of RAG in white-box settings and closed-domain QA tasks. In this paper, we aim to reveal the vulnerabilities of RAG models when faced with black-box attacks for opinion manipulation, and we explore the impact of such attacks on user cognition and decision-making, providing new insights for enhancing the reliability and security of RAG models. We manipulate the ranking results of the retrieval model in RAG with instructions and use these results as data to train a surrogate model. By applying adversarial retrieval attack methods to the surrogate model, we then realize black-box transfer attacks on RAG. Experiments conducted on opinion datasets across multiple topics show that the proposed attack strategy can significantly alter the opinion polarity of the content generated by RAG. This demonstrates the model's vulnerability and, more importantly, reveals the potential negative impact on user cognition and decision-making, making it easier to mislead users into accepting incorrect or biased information.
Chlorogenic acid (CGA), a dietary natural phenolic acid, has been widely reported to regulate glucose and lipid metabolism. However, the protective effects and underlying mechanisms of CGA against glucagon-induced hepatic glucose production remain largely uncharacterized. Herein, we investigated the efficacy of CGA on hepatic gluconeogenesis both in vivo and in vitro. The elevated endogenous glucose production induced by infusion of glucagon or pyruvate was lowered in mice administered CGA. Furthermore, chronic CGA treatment ameliorated the accumulation of glucose and ceramide in high-fat diet (HFD)-fed mice. CGA also attenuated the HFD-induced inflammatory response. The protective effect of CGA on glucose production was further confirmed in primary mouse hepatocytes, where it inhibited ceramide accumulation and p38 MAPK expression. Moreover, CGA administration in HFD-fed mice restored the decreased phosphorylation of Akt in the liver, resulting in the inhibition of FoxO1 activation and, ultimately, of hepatic gluconeogenesis. However, these protective effects were significantly attenuated by the addition of C2 ceramide. These results suggest that CGA inhibits ceramide accumulation to restrain the hepatic glucagon response.