An Interactive Independent Topic Analysis for a Mass Document Review Service

2018 
In this paper, we propose an interactive constrained independent topic analysis for text data mining. Independent topic analysis (ITA) is a method for extracting independent topics from document data using independent component analysis. ITA extracts the set of topics that are most independent of one another. Extracting independent topics makes it easier for document access support systems and document management systems to handle large volumes of text data. However, the topics extracted by ITA often differ from the topics a user wants. For the system to be of service to users, it must be interactive and reflect the user's requests. We therefore propose an interactive ITA that responds to those requests. For example, given three topics A, B, and C, a user who wants the content of topics A and B together can merge them into one topic D. Likewise, a user who wants to analyze topic A in more detail can separate it into topics E and F. To that end, we define two kinds of user request: Merge Link constraints and Separate Link constraints. A Merge Link constraint merges two topics into one topic. A Separate Link constraint separates one topic into two topics. We propose a method for extracting highly independent topics that satisfy these constraints. We conducted evaluation experiments on the proposed method, and the results show the effectiveness of our approach.
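The topic A, B, C example can be made concrete with a minimal sketch of how the two constraint types might be recorded and applied. It is not the paper's method: the abstract does not describe the constrained extraction itself, so the sketch only covers the bookkeeping of user requests. The names `MergeLink`, `SeparateLink`, `apply_merge`, and `apply_separate` are hypothetical, topics are assumed to be sets of document ids, and the rule that decides how a topic is split is an assumed user-supplied function.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set

# Assumption: a topic is represented by the ids of the documents assigned to it.
Topics = Dict[str, Set[int]]


@dataclass(frozen=True)
class MergeLink:
    """Hypothetical record of a request to merge two topics into one."""
    first: str
    second: str
    merged: str


@dataclass(frozen=True)
class SeparateLink:
    """Hypothetical record of a request to separate one topic into two."""
    source: str
    part_one: str
    part_two: str


def apply_merge(topics: Topics, link: MergeLink) -> Topics:
    # Copy every topic except the two being merged, then add their union.
    result = {name: set(docs) for name, docs in topics.items()
              if name not in (link.first, link.second)}
    result[link.merged] = topics[link.first] | topics[link.second]
    return result


def apply_separate(topics: Topics, link: SeparateLink,
                   goes_to_part_one: Callable[[int], bool]) -> Topics:
    # Copy every topic except the source, then split the source's documents
    # with the assumed user-supplied rule.
    result = {name: set(docs) for name, docs in topics.items()
              if name != link.source}
    result[link.part_one] = {d for d in topics[link.source] if goes_to_part_one(d)}
    result[link.part_two] = topics[link.source] - result[link.part_one]
    return result


topics = {"A": {1, 2}, "B": {3}, "C": {4, 5}}

# Merge topics A and B into topic D.
merged = apply_merge(topics, MergeLink("A", "B", "D"))
print(sorted(merged))     # ['C', 'D']

# Separate topic A into topics E and F (odd ids go to E in this toy rule).
separated = apply_separate(topics, SeparateLink("A", "E", "F"),
                           lambda doc_id: doc_id % 2 == 1)
print(sorted(separated))  # ['B', 'C', 'E', 'F']
```

In the proposed approach the constraints would instead guide the extraction of independent topics, so that the resulting topics remain highly independent; the set operations above only illustrate what each constraint asks for.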