MedLex+: An Integrated Corpus-Lexicon Medical Workbench for Swedish
2008
This paper reports on the development of MedLex+, a medical corpus-lexicon
workbench for Swedish. The project, which is still under active development,
has been running for some years at the Department of Swedish Language at
Göteborg University. At the moment, the workbench incorporates:
- an annotated collection of medical texts, comprising 20 million tokens and 45,000
documents,
- a number of language processing software programs, including tools for collocation
extraction, compound segmentation and thesaurus-based semantic annotation, and
- a lexical database of medical terms, containing 5,000 entries.

MedLex+ is a multifunctional lexical resource, thanks to a structural design and
content that can be easily queried. The medical workbench is intended to support
lexicographers compiling lexicons, as well as lexicon users with varying degrees of
familiarity with the medical domain. MedLex+ can also assist researchers working on
either lexical semantics or natural language processing (NLP) applications with a
focus on medical language.
The linguistically and semantically annotated medical texts in combination with a
set of smart queries turn the corpora into a rich repository of semasiological and
onomasiological knowledge about medical terms and their linguistic, lexical and
pragmatic properties. These properties are recorded in the lexical database with a
cognitive profile. The MedLex+ workbench appears to offer constructive support for
many different lexical tasks.