Feature Extraction of English Books on Tourism Using Data Mining

2021 
According to the White Paper on Tourism for 2020, in 2019, before the outbreak of COVID-19, 20.08 million Japanese people traveled abroad and 31.88 million foreign visitors came to Japan for sightseeing. It was, in effect, an age of tourism. Knowledge of tourism has become increasingly important, and reading materials in English is indispensable. In this paper, several English books on tourism are investigated and compared with journalism in terms of metrical linguistics. Specifically, the frequency characteristics of character and word appearance are investigated using a program written in C++, and these characteristics are approximated by an exponential function. Furthermore, the percentage of Japanese junior high school required vocabulary and American basic vocabulary is calculated to obtain the difficulty level, along with the K-characteristic of each material. The results show that English materials on tourism have a trend in character appearance similar to that of literary writings. In addition, the K-characteristic values for the tourism materials are high, and older publications with a higher degree of specialization are more difficult to read than journalism. Overall, the most recently published book has characteristics close to those of journalism.
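
The abstract does not reproduce the authors' program, so the following is only a minimal C++ sketch of the kind of computation it describes. It assumes that the K-characteristic is Yule's characteristic K, computed as 10^4 (Σ f_i² − N) / N² over word frequencies f_i and total token count N, and that the exponential approximation has the form y = c·exp(−b·r) over frequency rank r, fitted here by least squares on ln y. The names countWords, yuleK, ExpFit and fitExponential are hypothetical and do not come from the paper.

```cpp
#include <cassert>
#include <cctype>
#include <cmath>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Count how often each word occurs. Letters are lowercased and every
// non-letter character is treated as a separator (an assumption; the
// paper's tokenization rules are not given in the abstract).
std::map<std::string, long> countWords(const std::string& text) {
    std::map<std::string, long> freq;
    std::string word;
    for (unsigned char ch : text) {
        if (std::isalpha(ch)) {
            word += static_cast<char>(std::tolower(ch));
        } else if (!word.empty()) {
            ++freq[word];
            word.clear();
        }
    }
    if (!word.empty()) ++freq[word];
    return freq;
}

// Yule's characteristic K = 10^4 * (sum of f_i^2 - N) / N^2, where f_i is
// the frequency of word type i and N is the total number of tokens.
// Assumed here to be the "K-characteristic" named in the abstract.
double yuleK(const std::map<std::string, long>& freq) {
    double n = 0.0, sumSquares = 0.0;
    for (const auto& entry : freq) {
        n += static_cast<double>(entry.second);
        sumSquares += static_cast<double>(entry.second) * entry.second;
    }
    if (n == 0.0) return 0.0;
    return 1e4 * (sumSquares - n) / (n * n);
}

// Parameters of the assumed model y = c * exp(-b * r), r = 1, 2, ...
struct ExpFit {
    double c;
    double b;
};

// Fit the model by ordinary least squares on ln(y) against rank r.
// y[i] is the value at rank i + 1; all values must be positive and
// at least two points are required.
ExpFit fitExponential(const std::vector<double>& y) {
    assert(y.size() >= 2);
    const double n = static_cast<double>(y.size());
    double sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0;
    for (std::size_t i = 0; i < y.size(); ++i) {
        const double x = static_cast<double>(i + 1);
        const double ly = std::log(y[i]);
        sx += x;
        sy += ly;
        sxx += x * x;
        sxy += x * ly;
    }
    const double slope = (n * sxy - sx * sy) / (n * sxx - sx * sx);
    const double intercept = (sy - slope * sx) / n;
    return ExpFit{std::exp(intercept), -slope};
}
```

In use, one would pass the text of each book to countWords, feed the result to yuleK, and pass the word (or character) frequencies sorted in descending order to fitExponential. A higher K indicates that a small set of words is repeated more heavily, which is one way to compare the tourism books with journalism.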