Methods for Intelligent Data Analysis Based on Keywords and Implicit Relations: The Case of "ISTINA" Data Analysis System

2019 
In information analysis systems that work with big data, there is often a need to classify objects and to calculate the degree of thematic proximity between two objects. Keywords attributed to the objects of the system are a natural source of data for solving such problems. This paper describes a model for calculating the degree of thematic proximity between two keywords, as well as between two sets of keywords. The model is based on the contextual proximity between keywords, defined as the number of sets in which the two keywords appear together. When the final proximity coefficient is calculated, keyword properties such as degree of abstractness and thematic affiliation are taken into account. Various ways of applying the developed model to practical tasks are described, using the "ISTINA" scientometric data analysis system at Lomonosov Moscow State University as an example.
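The contextual proximity described in the abstract (the number of keyword sets in which two keywords appear together) can be illustrated with a minimal Python sketch. The function names `cooccurrence_counts` and `contextual_proximity` and the sample keyword sets are hypothetical and do not come from the paper. The sketch covers only the raw co-occurrence count; the abstract does not specify how abstractness degree and thematic belonging enter the final coefficient, so they are not modeled here.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(keyword_sets):
    """For every unordered keyword pair, count the sets that contain both."""
    counts = Counter()
    for keywords in keyword_sets:
        # Deduplicate and sort so each pair is counted once per set
        # and always stored under the same (smaller, larger) key.
        for a, b in combinations(sorted(set(keywords)), 2):
            counts[(a, b)] += 1
    return counts


def contextual_proximity(counts, a, b):
    """Raw contextual proximity: number of sets containing both keywords."""
    key = (a, b) if a <= b else (b, a)
    return counts.get(key, 0)


# Hypothetical keyword sets, e.g. one set per publication (illustration only).
example_sets = [
    {"scientometrics", "keywords", "big data"},
    {"keywords", "big data", "classification"},
    {"scientometrics", "classification"},
]

counts = cooccurrence_counts(example_sets)
print(contextual_proximity(counts, "keywords", "big data"))
print(contextual_proximity(counts, "scientometrics", "keywords"))
```

Storing each pair under a sorted key keeps the measure symmetric, so the order in which two keywords are queried does not matter.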