Assigning NDLSH Headings to People on the Web

2018 
We investigate a method that assigns National Diet Library Subject Headings (NDLSH) to the results of web people searches to help users select and understand people on the web. NDLSH is a controlled subject vocabulary compiled and maintained by the National Diet Library (NDL) as a subject access tool. Assigning NDLSH headings to people gives each person well-formed keywords and makes exploratory searches using related terms possible. We examined combinations of the following factors: (a) web-page rank (the number of top-ranked pages used), (b) position inside the HTML, (c) synonyms, and (d) document frequency. We report experimental results for 405 combination patterns (\(5 \times 9 \times 3 \times 3\)) on our 80-person dataset. Under our experimental settings, the best overall combination was (a) the top ten pages, (b) 100 characters before and after a person’s name (i.e., 200 characters), (c) half weight for synonyms, and (d) document frequency divided by the number of web pages.
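The best-performing combination can be illustrated with a minimal sketch. The abstract does not give implementation details, so the following are assumptions made only for illustration: the function names `context_window` and `score_headings`, the `vocabulary` mapping from a heading to its synonyms, plain substring matching, and counting a page once per heading (full weight if the heading itself appears near the name, half weight if only a synonym does).

```python
from typing import Dict, List


def context_window(text: str, name: str, width: int = 100) -> List[str]:
    """Return, for each occurrence of `name`, the name plus `width`
    characters before and after it (assumed reading of factor (b))."""
    windows = []
    start = text.find(name)
    while start != -1:
        end = start + len(name)
        windows.append(text[max(0, start - width):end + width])
        start = text.find(name, end)
    return windows


def score_headings(
    pages: List[str],
    name: str,
    vocabulary: Dict[str, List[str]],
    top_n: int = 10,
    width: int = 100,
    synonym_weight: float = 0.5,
) -> Dict[str, float]:
    """Score each heading by weighted document frequency over the
    top-ranked pages, divided by the number of pages used."""
    pages = pages[:top_n]  # factor (a): keep only the top-ranked pages
    if not pages:
        return {}
    scores = {}
    for heading, synonyms in vocabulary.items():
        total = 0.0
        for text in pages:
            windows = context_window(text, name, width)
            if any(heading in w for w in windows):
                total += 1.0  # the heading itself appears near the name
            elif any(s in w for w in windows for s in synonyms):
                total += synonym_weight  # factor (c): synonym match only
        scores[heading] = total / len(pages)  # factor (d): normalize
    return scores


# Hypothetical toy data; real NDLSH headings are Japanese terms.
example_pages = [
    "Taro Yamada is a physicist who studies physics.",
    "Taro Yamada, researcher in natural philosophy.",
    "An unrelated page.",
]
example_vocabulary = {"physics": ["natural philosophy"], "music": ["song"]}
example_scores = score_headings(example_pages, "Taro Yamada", example_vocabulary)
```

Headings can then be ranked by score, and the highest-scoring ones assigned to the person. Whether the original experiments count a page once or count every match is not stated in the abstract, so treat the per-page counting here as one plausible reading.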