Perceptual-IQ: Visual Commonsense Reasoning about Perceptual Imagination
1 Citation · 15 References · 10 Related Papers
Abstract:
In this paper, we present Perceptual Imagination: Question-Answering (Perceptual-IQ), a new dataset for evaluating the commonsense reasoning ability of visual systems confronted with perceptual changes. In our dataset, a machine is given a question that describes a perceptual change to an image and must predict the human response to that change. Perceptual-IQ consists of 3.7K manually annotated QA pairs over 1.6K curated images and covers various types of perceptual changes. By evaluating vision-language models on Perceptual-IQ, we identify a performance gap of roughly 25% relative to human performance.
Keywords:
Commonsense reasoning
Commonsense knowledge
Perceptual system
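To make the evaluation protocol concrete, here is a minimal sketch of how a vision-language model might be scored against human responses on a Perceptual-IQ-style example. The dataset fields and the model's predict interface are illustrative assumptions, not the authors' released API.

```python
# Minimal sketch of the evaluation protocol described in the abstract.
# Dataset fields and the model interface are assumptions for illustration.

from dataclasses import dataclass

@dataclass
class PerceptualIQExample:
    image_path: str       # curated source image
    question: str         # describes a perceptual change over the image
    choices: list[str]    # candidate human responses
    human_answer: int     # index of the majority human response

def evaluate(model, examples: list[PerceptualIQExample]) -> float:
    """Accuracy of a vision-language model at predicting human responses."""
    correct = 0
    for ex in examples:
        pred = model.predict(ex.image_path, ex.question, ex.choices)  # assumed interface
        correct += int(pred == ex.human_answer)
    return correct / len(examples)

# The reported ~25% gap is then simply human accuracy minus model
# accuracy computed under the same protocol.
```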
Leveraging external knowledge to enhance reasoning ability is crucial for commonsense question answering. However, existing knowledge bases rely heavily on manual annotation, which inevitably limits their coverage of world knowledge; as a result, they are not flexible enough to support reasoning over diverse questions. Recently, large language models (LLMs) have become dramatically better at capturing and leveraging knowledge, which opens up a new way to elicit knowledge directly from language models. We propose a Unified Facts Obtaining (UFO) approach. UFO turns LLMs into knowledge sources and produces relevant facts (knowledge statements) for a given question. We first develop a unified prompt consisting of demonstrations that cover different aspects of commonsense and different question styles. On this basis, we instruct the LLMs via prompting to generate question-related supporting facts for various commonsense questions. After fact generation, we apply a dense retrieval-based fact selection strategy to choose the best-matched fact, which is fed into the answer inference model along with the question. Notably, thanks to the design of the unified prompt, UFO supports reasoning across commonsense aspects, including general, scientific, and social commonsense. Extensive experiments on the CommonsenseQA 2.0, OpenBookQA, QASC, and Social IQA benchmarks show that UFO significantly improves the performance of the inference model and outperforms manually constructed knowledge sources.
Commonsense knowledge
Commonsense reasoning
General knowledge
Citations (0)
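The UFO pipeline described above reduces to three steps: elicit candidate facts from an LLM with a unified few-shot prompt, pick the best-matched fact by dense retrieval, and condition the answer-inference model on it. The sketch below illustrates that flow; the llm, retriever, and inference_model objects and their methods are hypothetical placeholders, and the demonstration shown is invented for illustration.

```python
# Sketch of the three-step UFO pipeline: (1) prompt an LLM to generate
# candidate facts, (2) select the best-matched fact by dense retrieval,
# (3) feed question + fact to the answer-inference model.
# All helper objects and methods are hypothetical placeholders.

import numpy as np

UNIFIED_PROMPT = (
    "Write a short commonsense fact that helps answer the question.\n"
    "Q: Can a camel fit through the eye of a needle?\n"
    "Fact: A camel is far larger than the eye of a needle.\n"
    # ... demonstrations covering other commonsense aspects and styles ...
)

def ufo_answer(question: str, llm, retriever, inference_model, k: int = 10) -> str:
    # Step 1: elicit k candidate facts via prompting (sampled with
    # temperature > 0 so the candidates differ -- an assumption here).
    facts = [llm.generate(UNIFIED_PROMPT + f"Q: {question}\nFact:") for _ in range(k)]

    # Step 2: dense retrieval-based selection -- keep the fact whose
    # embedding is closest to the question embedding.
    q_vec = retriever.embed(question)
    scores = [np.dot(q_vec, retriever.embed(f)) for f in facts]
    best_fact = facts[int(np.argmax(scores))]

    # Step 3: answer inference conditioned on the selected fact.
    return inference_model.infer_answer(question=question, fact=best_fact)
```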
Lack of commonsense is one of the most challenging problems in the field of conversational AI. Despite recent significant progress in NLP driven by pre-trained language models, commonsense reasoning remains out of reach. We propose an approach to evaluating conversational commonsense usage and use it to evaluate the conversational skills of a socialbot during interaction with users. Analysis of data with joint manual and automatic annotations allowed us to identify automatic metrics tied to commonsense. We also develop two commonsense conversational skills that combine the commonsense knowledge graph completion model COMeT [6] with a template-based approach.
Commonsense knowledge
Commonsense reasoning
Citations (3)
AI has seen great advances of many kinds recently, but there is one critical area where progress has been extremely slow: ordinary commonsense.
Commonsense knowledge
Commonsense reasoning
Citations (385)
Commonsense knowledge is fundamental to making machines reach human-level intelligence. However, conventional methods of commonsense extraction generally do not work well because commonsense, by its nature, is usually not stated explicitly in texts or other data. Moreover, commonsense knowledge graphs built in advance struggle to cover all the knowledge required for practical tasks, because such graphs are inherently incomplete. In this paper, we propose an online commonsense oracle for knowledge reasoning. Specifically, we focus on the on-demand inference of specific commonsense propositions, using the capableOf relation as an example because of its notable significance in daily life. For more effective capableOf reasoning, we exploit informative supporting features derived from an existing commonsense knowledge graph and a web search engine. Extensive experiments demonstrate the effectiveness of our approach.
Commonsense knowledge
Commonsense reasoning
Knowledge graph
Citations (1)
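As a rough illustration of the on-demand oracle described above, the sketch below scores a (subject, capableOf, action) proposition from two evidence sources, a commonsense knowledge graph and a web search engine, and passes the resulting features to a binary classifier. All the interfaces shown (kg, search, classifier) are assumptions; the paper does not specify them.

```python
# Illustrative sketch of an on-demand capableOf oracle. Feature sources
# follow the abstract (KG + web search), but every API here is assumed.

def capableof_features(subject: str, action: str, kg, search) -> list[float]:
    """Build a feature vector for the proposition (subject, capableOf, action)."""
    return [
        # Direct and neighbourhood evidence from the knowledge graph.
        float(kg.has_edge(subject, "capableOf", action)),   # assumed KG API
        kg.path_similarity(subject, action),                # assumed KG API
        # Web co-occurrence evidence, e.g. normalised hit counts.
        search.hit_count(f'"{subject}" "{action}"')
            / max(1, search.hit_count(f'"{subject}"')),     # assumed search API
    ]

def capableof_oracle(subject, action, kg, search, classifier) -> bool:
    """On-demand inference: True if the proposition is judged plausible."""
    x = capableof_features(subject, action, kg, search)
    return bool(classifier.predict([x])[0])  # sklearn-style classifier assumed
```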
The field of Computational Argumentation is well suited to commonsense reasoning because of its ability to model contradictory information. In this paper, we present preliminary work on how an argumentation framework can explicitly model commonsense knowledge, both at a logically structured level and at an abstract level. We discuss connections with current research and present interesting future directions.
Commonsense reasoning
Commonsense knowledge
Citations (6)
Commonsense is a challenge not only for representation and reasoning but also for the large-scale knowledge engineering required to capture the breadth of our everyday world. One approach to knowledge engineering is to outsource the effort to the public through games that generate structured commonsense knowledge from user play. To date, such games have focused on symbolic and textual knowledge. However, an effective commonsense reasoning system will require spatial and physical reasoning capabilities. In this paper, I propose a tool for gathering commonsense information from ordinary people: a user-friendly 3D sculpting tool for modeling and annotating physical objects and spaces.
Commonsense reasoning
Commonsense knowledge
User interface
Spatial intelligence
Representation
Crowdsourcing
Citations (0)
More than one hundred benchmarks have been developed to test the commonsense knowledge and commonsense reasoning abilities of artificial intelligence (AI) systems. However, these benchmarks are often flawed, and many aspects of common sense remain untested. Consequently, we do not currently have any reliable way of measuring to what extent existing AI systems have achieved these abilities. This paper surveys the development and uses of AI commonsense benchmarks. We discuss the nature of common sense; the role of common sense in AI; the goals served by constructing commonsense benchmarks; and desirable features of such benchmarks. We analyze common flaws in benchmarks, and we argue that it is worthwhile to invest the work needed to ensure that benchmark examples are consistently of high quality. We survey the various methods of constructing commonsense benchmarks. We enumerate 139 commonsense benchmarks that have been developed: 102 text-based, 18 image-based, 12 video-based, and 7 simulated physical environments. We discuss the gaps in the existing benchmarks and aspects of commonsense reasoning that are not addressed by any existing benchmark. We conclude with a number of recommendations for the future development of commonsense AI benchmarks.
Commonsense reasoning
Commonsense knowledge
Benchmark (computing)
Common sense
Citations (4)
A long-standing dream of artificial intelligence has been to put commonsense knowledge into computers -- enabling machines to reason about everyday life. Some projects, such as Cyc, have begun to amass large collections of such knowledge. However, it is widely assumed that the use of common sense in interactive applications will remain impractical for years, until these collections can be considered sufficiently complete and commonsense reasoning sufficiently robust. Recently, at the Massachusetts Institute of Technology's Media Laboratory, we have had some success in applying commonsense knowledge in a number of intelligent interface agents, despite the admittedly spotty coverage and unreliable inference of today's commonsense knowledge systems. This article surveys several of these applications and reflects on interface design principles that enable successful use of commonsense knowledge.
Commonsense knowledge
Commonsense reasoning
Common sense
User interface
Citations (151)
Current commonsense reasoning research focuses on developing models that use commonsense knowledge to answer multiple-choice questions. However, systems designed to answer multiple-choice questions may not be useful in applications that do not provide a small list of candidate answers to choose from. As a step towards making commonsense reasoning research more realistic, we propose to study open-ended commonsense reasoning (OpenCSR) -- the task of answering a commonsense question without any pre-defined choices -- using as a resource only a corpus of commonsense facts written in natural language. OpenCSR is challenging due to a large decision space, and because many questions require implicit multi-hop reasoning. As an approach to OpenCSR, we propose DrFact, an efficient Differentiable model for multi-hop Reasoning over knowledge Facts. To evaluate OpenCSR methods, we adapt several popular commonsense reasoning benchmarks, and collect multiple new answers for each test question via crowd-sourcing. Experiments show that DrFact outperforms strong baseline methods by a large margin.
Commonsense reasoning
Commonsense knowledge
Citations (0)
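To give a feel for multi-hop reasoning over a fact corpus, here is a deliberately simplified, non-differentiable sketch: facts are linked when they share a concept, and initial question-to-fact retrieval scores are propagated along those links for a fixed number of hops. DrFact itself makes these hops differentiable and learns the scoring end to end; everything below is an illustrative approximation, not the authors' model.

```python
# Simplified sketch of hop-based reasoning over a fact corpus: one fact
# leads to another through shared concepts. The real DrFact model makes
# these hops differentiable; this version is purely illustrative.

import numpy as np
from scipy import sparse

def build_fact_graph(fact_concepts: list[set[str]]) -> sparse.csr_matrix:
    """F[i, j] = 1 if fact i and fact j share at least one concept."""
    n = len(fact_concepts)
    rows, cols = [], []
    for i in range(n):
        for j in range(n):
            if i != j and fact_concepts[i] & fact_concepts[j]:
                rows.append(i)
                cols.append(j)
    data = np.ones(len(rows))
    return sparse.csr_matrix((data, (rows, cols)), shape=(n, n))

def multi_hop_scores(question_scores: np.ndarray,
                     fact_graph: sparse.csr_matrix,
                     hops: int = 2) -> np.ndarray:
    """Propagate initial question-to-fact retrieval scores along the graph."""
    scores = question_scores.copy()
    for _ in range(hops):
        scores = scores + fact_graph.T @ scores   # follow shared-concept links
        scores = scores / (scores.sum() + 1e-9)   # keep a distribution over facts
    return scores
```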