    Abstract:
    This article describes our novel approach to the automated detection and analysis of metaphors in text. We employ robust, quantitative language processing to implement a system prototype combined with sound social science methods for validation. We show results in 4 different languages and discuss how our methods are a significant step forward from previously established techniques of metaphor identification. We use Topical Structure and Tracking, an Imageability score, and innovative methods to build an effective metaphor identification system that is fully automated and performs well over baseline.
    Keywords:
    Identification
    In this paper we present TroFi (Trope Finder), a system for automatically classifying literal and nonliteral usages of verbs through nearly unsupervised word-sense disambiguation and clustering techniques. TroFi uses sentential context instead of selectional constraint violations or paths in semantic hierarchies. It also uses literal and nonliteral seed sets acquired and cleaned without human supervision in order to bootstrap learning. We adapt a word-sense disambiguation algorithm to our task and augment it with multiple seed set learners, a voting schema, and additional features like SuperTags and extrasentential context. Detailed experiments on hand-annotated data show that our enhanced algorithm outperforms the baseline by 24.4%. Using the TroFi algorithm, we also build the TroFi Example Base, an extensible resource of annotated literal/nonliteral examples which is freely available to the NLP research community.
    Literal (mathematical logic)
    Schema (genetic algorithms)
    Citations (198)
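The core idea in the TroFi abstract above, attracting a sentence to whichever seed set its sentential context resembles more and then combining several learners by vote, can be sketched as follows. This is a minimal illustration, not the TroFi algorithm itself: the word-overlap similarity, the tie handling, and the names `tokenize`, `overlap`, `classify`, and `vote` are assumptions, and the seed sentences are invented toy data.

```python
from collections import Counter

PUNCT = ".,;:!?\"'()"


def tokenize(sentence):
    """Lowercase a sentence and return its set of word types."""
    words = (w.strip(PUNCT).lower() for w in sentence.split())
    return {w for w in words if w}


def overlap(sentence, seed_sentences):
    """Largest number of word types shared with any single seed sentence."""
    context = tokenize(sentence)
    return max((len(context & tokenize(s)) for s in seed_sentences), default=0)


def classify(sentence, literal_seeds, nonliteral_seeds):
    """Label a sentence by the seed set its context overlaps with more."""
    lit = overlap(sentence, literal_seeds)
    non = overlap(sentence, nonliteral_seeds)
    if lit == non:
        return "unknown"
    return "literal" if lit > non else "nonliteral"


def vote(labels):
    """Majority vote over learner outputs; ties and all-unknown stay unknown."""
    counts = Counter(label for label in labels if label != "unknown")
    ranked = counts.most_common()
    if not ranked:
        return "unknown"
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return "unknown"
    return ranked[0][0]


# Invented toy seed sets for the verb "grasp" (assumption, not TroFi data).
LITERAL_SEEDS = ["He grasped the rope with both hands."]
NONLITERAL_SEEDS = ["She grasped the idea immediately."]

if __name__ == "__main__":
    print(classify("The student grasped the idea of recursion.",
                   LITERAL_SEEDS, NONLITERAL_SEEDS))
    print(vote(["literal", "nonliteral", "nonliteral"]))
```

A real system would weight or filter function words such as "the", which inflate raw overlap counts in this toy version.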
    In this paper we propose algorithms to automatically classify sentences into metaphoric or normal usages. Our algorithms need only WordNet and bigram counts, and do not require training. We present empirical results on a test set derived from the Master Metaphor List. We also discuss issues that make classification of metaphors a tough problem in general.
    Bigram
    Training set
    Empirical Research
    Citations (138)
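The abstract above names its two resources, WordNet and bigram counts, but not the decision rules built on them. As a rough sketch of how such resources could drive a training-free check, the code below flags an "X is a Y" sentence when Y is not a known hypernym of X, and flags a verb-noun pair when its bigram count falls below a threshold. Everything here is an assumption made for illustration: the toy `HYPERNYMS` table stands in for WordNet, the `BIGRAM_COUNTS` values are invented, and the threshold is arbitrary.

```python
# Toy stand-in for WordNet hypernym lookups (assumption, not real WordNet data).
HYPERNYMS = {
    "dog": {"animal", "mammal"},
    "lawyer": {"person", "professional"},
}

# Invented bigram counts for (verb, object) pairs (assumption).
BIGRAM_COUNTS = {
    ("drink", "water"): 950,
    ("drink", "juice"): 400,
    ("drink", "knowledge"): 2,
}


def is_a_metaphoric(subject, predicate_noun):
    """Flag 'subject is a predicate_noun' when no hypernym link is known."""
    return predicate_noun not in HYPERNYMS.get(subject, set())


def verb_noun_metaphoric(verb, noun, threshold=10):
    """Flag a verb-object pair whose bigram count is below the threshold."""
    return BIGRAM_COUNTS.get((verb, noun), 0) < threshold


if __name__ == "__main__":
    print(is_a_metaphoric("lawyer", "shark"))        # "My lawyer is a shark"
    print(is_a_metaphoric("dog", "animal"))          # "A dog is an animal"
    print(verb_noun_metaphoric("drink", "knowledge"))
```

Note that a missing table entry is treated as "no hypernym link", so coverage gaps in the lexical resource show up as false metaphor flags in this sketch.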
    We present the Common Semantic Features (CSF) method for metaphor detection. This method has two distinguishing characteristics: it is cross-lingual and it does not rely on the availability of extensive manually-compiled lexical resources in target languages other than English. A metaphor detecting classifier is trained on English samples and then applied to the target language. The method includes procedures for obtaining semantic features from sentences in the target language. Our experiments with Russian and English sentences show comparable results, supporting our hypothesis that a CSF-based classifier can be applied across languages. We obtain state-of-the-art performance in both languages.
    Semantic role labeling
    Citations (70)
    People use metaphors every time they speak. Some of those metaphors are literary - devices for making thoughts more vivid or entertaining. But most are much more basic than that - they're "metaphors we live by", metaphors we use without even realizing we're using them. In this book, George Lakoff and Mark Johnson suggest that these basic metaphors not only affect the way we communicate ideas, but actually structure our perceptions and understandings from the beginning. Bringing together the perspectives of linguistics and philosophy, Lakoff and Johnson offer an intriguing and surprising guide to some of the most common metaphors and what they can tell us about the human mind. And for this new edition, they supply an afterword both extending their arguments and offering a fascinating overview of the current state of thinking on the subject of the metaphor.
    George (robot)
    Affect
    Citations (17,049)
    Metaphors are ubiquitous in language and developing methods to identify and deal with metaphors is an open problem in Natural Language Processing (NLP). In this paper we describe results from using a maximum entropy (ME) classifier to identify metaphors. Using the Wall Street Journal (WSJ) corpus, we annotated all the verbal targets associated with a set of frames which includes frames of spatial motion, manipulation, and health. One surprising finding was that over 90% of annotated targets from these frames are used metaphorically, underscoring the importance of processing figurative language. We then used this labeled data and each verbal target's PropBank annotation to train a maximum entropy classifier to make this literal vs. metaphoric distinction. Using the classifier, we reduce the final error in the test set by 5% over the verb-specific majority class baseline and 31% over the corpus-wide majority class baseline.
    Training set
    Citations (98)
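For a two-way literal vs. metaphoric decision, a maximum entropy classifier like the one in the abstract above is equivalent to logistic regression. The sketch below trains one with plain stochastic gradient ascent on the log-likelihood. It is only an illustration of the model family: the feature names (`arg1_abstract`, `arg1_physical`), the toy examples, and the hyperparameters are assumptions, not the PropBank-derived features or settings used in the paper.

```python
import math


def sigmoid(z):
    """Logistic function mapping a score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def score(weights, feats):
    """Dot product between the weight vector and a sparse feature dict."""
    return sum(weights.get(name, 0.0) * value for name, value in feats.items())


def train(examples, epochs=200, lr=0.5):
    """Fit binary maxent weights; label 1 = metaphoric, 0 = literal."""
    weights = {}
    for _ in range(epochs):
        for feats, label in examples:
            error = label - sigmoid(score(weights, feats))
            for name, value in feats.items():
                weights[name] = weights.get(name, 0.0) + lr * error * value
    return weights


def predict(weights, feats):
    """Probability that the verbal target is used metaphorically."""
    return sigmoid(score(weights, feats))


# Hypothetical features for the verb "fall": an abstract first argument
# ("prices fell") vs. a physical one ("the ball fell").
TOY_EXAMPLES = [
    ({"bias": 1.0, "arg1_abstract": 1.0}, 1),
    ({"bias": 1.0, "arg1_physical": 1.0}, 0),
    ({"bias": 1.0, "arg1_abstract": 1.0}, 1),
    ({"bias": 1.0, "arg1_physical": 1.0}, 0),
]

if __name__ == "__main__":
    w = train(TOY_EXAMPLES)
    print(round(predict(w, {"bias": 1.0, "arg1_abstract": 1.0}), 3))
```

In practice a library implementation with regularization would replace this hand-written loop; the loop is kept here only to make the update rule visible.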
    We present a novel approach to automatic metaphor identification in unrestricted text. Starting from a small seed set of manually annotated metaphorical expressions, the system is capable of harvesting a large number of metaphors of similar syntactic structure from a corpus. Our method is distinguished from previous work in that it does not employ any hand-crafted knowledge, other than the initial seed set, but, in contrast, captures metaphoricity by means of verb and noun clustering. Being the first to employ unsupervised methods for metaphor identification, our system operates with a precision of 0.79.
    Identification
    Nominalization
    Citations (158)
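The seed-and-cluster idea in the abstract above can be sketched as a simple expansion step: a corpus verb-noun pair is harvested when its verb shares a cluster with a seed verb and its noun shares a cluster with the same seed's noun. This is a simplified reading under stated assumptions. The clusters are given by hand here rather than learned, the function name `harvest` is hypothetical, and all words and pairs are invented toy data.

```python
def cluster_of(word, clusters):
    """Return the cluster containing word, or a singleton if none does."""
    for cluster in clusters:
        if word in cluster:
            return cluster
    return {word}


def harvest(seeds, verb_clusters, noun_clusters, corpus_pairs):
    """Collect corpus (verb, noun) pairs that match a seed via both clusters."""
    seed_set = set(seeds)
    found = set()
    for seed_verb, seed_noun in seed_set:
        verbs = cluster_of(seed_verb, verb_clusters)
        nouns = cluster_of(seed_noun, noun_clusters)
        for verb, noun in corpus_pairs:
            if verb in verbs and noun in nouns and (verb, noun) not in seed_set:
                found.add((verb, noun))
    return found


# Invented toy data (assumption): one seed, hand-written clusters.
SEEDS = [("stir", "excitement")]
VERB_CLUSTERS = [{"stir", "ignite", "spark"}]
NOUN_CLUSTERS = [{"excitement", "anger", "enthusiasm"}]
CORPUS_PAIRS = [
    ("ignite", "anger"),
    ("stir", "soup"),
    ("spark", "enthusiasm"),
    ("stir", "excitement"),
]

if __name__ == "__main__":
    print(sorted(harvest(SEEDS, VERB_CLUSTERS, NOUN_CLUSTERS, CORPUS_PAIRS)))
```

The literal pair ("stir", "soup") is rejected because "soup" falls outside the seed noun's cluster, which is how clustering substitutes for hand-crafted knowledge in this sketch.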
    We show that it is possible to reliably discriminate whether a syntactic construction is meant literally or metaphorically using lexical semantic features of the words that participate in the construction. Our model is constructed using English resources, and we obtain state-of-the-art performance relative to previous work in this language. Using a model transfer approach by pivoting through a bilingual dictionary, we show our model can identify metaphoric expressions in other languages. We provide results on three new test sets in Spanish, Farsi, and Russian. The results support the hypothesis that metaphors are conceptual, rather than lexical, in nature.
    Citations (225)
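Pivoting through a bilingual dictionary, as the abstract above describes, means that features are looked up for the English translations of a target-language word so that an English-trained model can consume them. The sketch below averages a feature over all translations that have an entry. The dictionary entries, the `abstractness` feature, its values, and the function name `pivot_features` are all assumptions chosen for illustration.

```python
# Toy Spanish-to-English dictionary (assumption).
BILINGUAL_DICT = {
    "banco": ["bank", "bench"],
    "idea": ["idea"],
}

# Invented English lexical-semantic feature values (assumption).
EN_FEATURES = {
    "bank": {"abstractness": 0.4},
    "bench": {"abstractness": 0.1},
    "idea": {"abstractness": 0.9},
}


def pivot_features(word, dictionary, en_features):
    """Average English feature vectors over a word's known translations."""
    vectors = [en_features[t] for t in dictionary.get(word, []) if t in en_features]
    if not vectors:
        return {}
    names = set().union(*vectors)
    return {
        name: sum(vec.get(name, 0.0) for vec in vectors) / len(vectors)
        for name in names
    }


if __name__ == "__main__":
    print(pivot_features("banco", BILINGUAL_DICT, EN_FEATURES))
```

Averaging is one simple way to handle ambiguous translations; a word with no dictionary entry yields an empty feature dict, which the downstream classifier would have to treat as missing.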
    We aim to investigate cross-cultural patterns of thought through cross-linguistic investigation of the use of metaphor. As a first step, we produce a system for locating instances of metaphor in English and Spanish text. In contrast to previous work which relies on resources like syntactic parsing and WordNet, our system is based on LDA topic modeling, enabling its application even to low-resource languages, and requires no labeled data. We achieve an F-score of 59% for English.
    Citations (49)
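The F-score reported in the abstract above is, in its usual balanced form, the harmonic mean of precision and recall. The sketch below computes all three from counts of true positives, false positives, and false negatives. The counts in the usage example are invented, and the abstract does not say whether a balanced or weighted F-score was used, so the balanced form is an assumption.

```python
def precision_recall_f1(true_pos, false_pos, false_neg):
    """Return (precision, recall, balanced F-score) from raw counts."""
    predicted = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Invented counts: 3 metaphors found correctly, 1 false alarm, 3 missed.
    p, r, f = precision_recall_f1(true_pos=3, false_pos=1, false_neg=3)
    print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```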
    This book presents a complete method for the identification of metaphor in language at the level of word use. It is based on extensive methodological and empirical corpus-linguistic research in two languages, English and Dutch. The method is formulated as an explicit manual of instructions covering one chapter, the method being a development and refinement of the popular MIP procedure presented by the Pragglejaz Group in 2007. The extended version is called MIPVU, as it was developed at VU University Amsterdam. Its application is demonstrated in five case studies addressing metaphor in English news texts, conversations, fiction, and academic texts, and Dutch news texts and conversations. Two methodological chapters follow reporting a series of successful reliability tests and a series of post hoc troubleshooting exercises. The final chapter presents a first empirical analysis of the findings, and shows what this type of methodological attention can mean for research and theory.
    Identification
    Trouble shooting
    Empirical Research
    Citations (1,090)