Corpus-based identification and disambiguation of reading indicators for German nominalizations
Citations: 6; References: 0; Related papers: 20
Keywords:
Nominalization
Identification
1. Introduction 1.1 Book outline 1.2 How to use the DVD
2. Corpus linguistics and translation studies 2.1 A typology of translation-driven corpora 2.2 Corpus-based translation research 2.2.1 Regularities of translations 2.2.1.1 Simplification 2.2.1.2 Explicitation 2.2.1.3 Standardization 2.2.1.4 Translation of unique items 2.2.1.5 Untypical collocations 2.2.1.6 Interference 2.2.2 Regularities of translators 2.2.3 Regularities of languages 2.2.4 Learner translation corpora 2.2.5 Interpreting and multimodal corpora 2.3 Corpus-based translation teaching and learning 2.4 Computer-assisted translation and computational linguistics 2.5 Tasks 2.5.1 Experimenting with the TEC 2.5.2 Experimenting with COMPARA 2.5.3 Experimenting with the LTC 2.6 Further reading
3. Corpus design and acquisition 3.1 Corpus design 3.1.1 Size 3.1.2 Composition 3.1.3 Representativeness and comparability 3.1.4 Case study: the CEXI corpus 3.2 Corpus acquisition and copyright 3.3 Web corpora 3.3.1 The Web as corpus 3.3.2 The Web as a source of corpora 3.3.2.1 General Web corpora 3.3.2.2 Specialized Web corpora 3.4 Conclusions 3.5 Tasks 3.5.1 Corpus building project outline 3.5.2 Manual creation of a DIY monolingual corpus 3.5.3 Automatic creation of a DIY bilingual comparable corpus 3.6 Further reading
4. Corpus encoding and annotation 4.1 Corpus-based translation studies and corpus annotation 4.2 Annotation for descriptive translation studies 4.2.1 Documentary information 4.2.2 Structural information 4.2.3 Text-linguistic information 4.3 Stand-off annotation 4.4 Conclusions 4.5 Tasks 4.5.1 Creating an XML TEI document 4.5.2 Adding a simple header 4.5.3 Marking-up text structure 4.5.4 Adding linguistic annotation 4.5.5 Indexing the corpus 4.5.6 Searching the corpus 4.6 Further reading
5. Corpus tools and corpus analysis 5.1 Corpus creation and analysis tools 5.1.1 Text acquisition 5.1.2 Annotation 5.1.3 Corpus management and query systems 5.1.4 Data retrieval and display 5.2 Analysis of corpus data 5.2.1 Wordlists and basic statistics 5.2.2 Concordances 5.2.3 Collocations, clusters and clouds 5.2.4 Colligations and word profiles 5.2.5 Semantic associations 5.3 Conclusions 5.4 Tasks 5.4.1 Wordlists 5.4.2 Lists of lemmas 5.4.3 Keywords 5.4.4 Concordances 5.4.5 Collocations and clusters 5.4.6 Word profiles 5.5 Further reading and software
6. Creating multilingual corpora 6.1 Corpus acquisition 6.1.1 Comparable corpora 6.1.2 Parallel corpora 6.2 Alignment 6.2.1 Paragraphs and sentences 6.2.2 Approaches and tools 6.3 Case study: the OPUS corpus 6.4 Parallel corpora and translation memories 6.5 Alignment below sentence level 6.5.1 Alignment of comparable corpora 6.5.2 Word alignment 6.6 Tasks 6.6.1 Aligning a text pair 6.6.2 A parallel corpus of literary texts 6.6.3 Corpus creation checklist 6.7 Further reading and software
7. Using multilingual corpora 7.1 Comparable and parallel corpora 7.2 Display and analysis of parallel corpora 7.3 Case study: The Rushdie English-Italian parallel corpus 7.4 Case study: the OPUS Word alignment database 7.5 Multilingual corpora in translator training and practice 7.6 Tasks 7.6.1 Searching a parallel corpus of literary texts 7.6.2 Exploring the Europarl multilingual corpus 7.7 Further reading
8. Conclusions
Corpus Linguistics
Text corpus
Comparability
Computational linguistics
Citations (163)
This article presents an annotated corpus of Turkish comment texts gathered from employees. Special attention is given to the neutrality of paragraphs in the corpus and to the quality of the annotation. Labels are assigned by majority vote among the annotators. We describe the dataset, the annotation methodology, and experiments with basic methods to investigate the corpus. The corpus has three classes: positive, negative, and neutral.
Neutrality
Corpus Linguistics
Semantic annotation
Text corpus
Citations (0)
Text corpus
Corpus Linguistics
Citations (1)
The use of large, computerized bodies of text for linguistic analysis and description has emerged in recent years as one of the most significant and rapidly-developing fields of activity in the study of language. This book provides a comprehensive introduction and guide to Corpus Linguistics. All aspects of the field are explored, from the various types of electronic corpora that are available to instructions on how to design and compile a corpus. Graeme Kennedy surveys the development of corpora for use in linguistic research, looking back to the pre-electronic age as well as to the massive growth of computer corpora in the electronic age.
Corpus Linguistics
Citations (703)
The Corpus of Modern Greek appeared in 2011. All texts are morphologically annotated. Owing to certain peculiarities of Modern Greek morphology, the majority of forms have more than one grammatical interpretation. In this presentation we describe the types of homonyms found in the Corpus and discuss possible patterns for automatic disambiguation. Finally, we mention a number of problematic cases that cannot currently be resolved or that require a manual approach.
Presentation (obstetrics)
Modern Greek
Citations (0)
Lemma (botany)
Identification
Treebank
Citations (2)
Corpus Linguistics
European Portuguese
Agreement
Realization (probability)
Text corpus
Citations (1)
Computational linguistics
Word Sense Disambiguation
Relevance
Citations (10)
The Norwegian Spanish Parallel Corpus (NSPC) was created at the University of Bergen, Norway. The corpus was constructed primarily for research in Translation Studies and is built to be roughly comparable to the Spanish-English P-ACTRES corpus. The NSPC is a parallel, unidirectional translation corpus of contemporary written Norwegian texts, published between 2000 and 2009, together with their Spanish translations. It contains fiction and non-fiction, and each text is classified according to genre, the author's gender, and the gender and mother tongue of the translator.
Norwegian
Parallel corpora
Corpus Linguistics
Translation studies
Citations (1)
Lexicography
Corpus Linguistics
Citations (0)