Computational Tools and Resources for Linguistic Studies

1997 
This paper presents several useful computational tools and available resources to facilitate linguistic studies. For each computational tool, we demonstrate why it is useful and how can it be used for research. In addition, linguistic examples are given for illustration. First, a very useful searching engine, Key Word in Context (KWIC), is introduced. This tool can automatically extract linguistically significant patterns from large corpora and help linguists discover syntagmatic generalizations. Second, Dynamic Clustering and Hierarchical Clustering are introduced for identifying natural clusters of words or phrases in distribution. Third, statistical measures which could be used to measure the degree of cohesion and correlation among linguistic units are presented. These tools can help linguists identify the boundaries of lexical units. Fourth, alignment tools for aligning parallel texts at the word, sentence and structure levels are presented for linguists who do comparative studies of different languages. Fifth, we introduce Sequential Forward Selection (SFS) and Classification and Regression Tree (CART) for automatic rule ordering. Finally, some available electronic Chinese resources are described to provide reference purposes for those who are interested.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    34
    References
    0
    Citations
    NaN
    KQI
    []