Development of an Automatic Trend Exploration System using the MuST Data Collection

2006 
The automatic extraction of trend information from text documents such as newspaper articles would be useful for exploring and examining trends. To enable this, we used data sets provided by a workshop on multimodal summarization for trend information (the MuST Workshop) to construct an automatic trend exploration system. This system first extracts units, temporals, and item expressions from newspaper articles, then it extracts sets of expressions as trend information, and finally it arranges the sets and displays them in graphs. For example, when documents concerning the politics are given, the system extracts "%" and "Cabinet approval rating" as a unit and an item expression including temporal expressions. It next extracts values related to "%". Finally, it makes a graph where temporal expressions are used for the horizontal axis and the value of percentage is shown on the vertical axis. This graph indicates the trend of Cabinet approval rating and is useful for investigating Cabinet approval rating. Graphs are obviously easy to recognize and useful for understanding information described in documents. In experiments, when we judged the extraction of a correct graph as the top output to be correct, the system accuracy was 0.2500 in evaluation A and 0.3334 in evaluation B. (In evaluation A, a graph where 75% or more of the points were correct was judged to be correct; in evaluation B, a graph where 50% or more of the points were correct was judged to be correct.) When we judged the extraction of a correct graph in the top five outputs to be correct, accuracy rose to 0.4167 in evaluation A and 0.6250 in evaluation B. Our system is convenient and effective because it can output a graph that includes trend information at these levels of accuracy when given only a set of documents as input.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    6
    References
    3
    Citations
    NaN
    KQI
    []