A Comparison of “Big Data” Literature Analysis Tools: Taking Citespace and SCI2 as Examples

The information explosion has accumulated to the point where it can trigger real change. For academic researchers, the cycle of knowledge renewal keeps shortening, yet massive and diverse data do not amount to an accurate data reserve. Obtaining important academic resources quickly and accurately, and grasping the frontiers, hot topics, and overall structure of a research field, is a challenge shared by researchers in the big data era. Taking Citespace and the Science of Science (SCI2) Tool as examples, this paper summarizes how the two tools complement each other in literature data mining with respect to data sources, data preprocessing, and data visualization, and illustrates the comparison with an earthquake case study. Their complementary strengths are as follows: 1) Data sources. Citespace can analyze both Chinese and English literature, whereas SCI2 is limited to English literature but supports a wider range of English data sources. 2) Data preprocessing. SCI2 automatically merges words with similar spellings, but it sometimes merges terms incorrectly or misses merges; Citespace requires a manually built synonym table, which avoids these errors but involves considerable manual work. 3) Network output. Networks output by SCI2 are highly editable; Citespace networks are comparatively hard to edit, but its pruning (Pathfinder) function makes the research structure of the pruned network clearer. 4) Visualization. In addition to undirected networks, SCI2 can generate directed networks, which better reflect the directional relationships between studies; nodes in a Citespace network convey node centrality as well as occurrence frequency, and the Citespace time-zone view better reflects how research topics change over time.
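The trade-off in point 2 between automatic merging and a manual synonym table can be illustrated with a small sketch. This is not how SCI2 or Citespace implement term merging. The function names, the similarity threshold, and the sample keywords are all hypothetical, and the spelling similarity comes from Python's standard `difflib` module.

```python
from difflib import SequenceMatcher


def merge_by_table(terms, synonym_table):
    """Map each term to a canonical form using a hand-built synonym table.

    Terms missing from the table are kept (lowercased) unchanged.
    """
    return [synonym_table.get(t.lower(), t.lower()) for t in terms]


def merge_by_similarity(terms, threshold=0.85):
    """Greedily map each term to the first earlier term with a similar spelling.

    The threshold is an illustrative assumption, not a value taken from
    either tool.
    """
    canonical = []  # distinct terms kept so far
    merged = []
    for term in terms:
        t = term.lower()
        match = next(
            (c for c in canonical
             if SequenceMatcher(None, t, c).ratio() >= threshold),
            None,
        )
        if match is None:
            canonical.append(t)
            match = t
        merged.append(match)
    return merged


# Hypothetical keywords from an earthquake literature set.
keywords = ["earthquake", "earthquakes", "quake"]
manual_table = {"earthquakes": "earthquake", "quake": "earthquake"}

auto_result = merge_by_similarity(keywords)
table_result = merge_by_table(keywords, manual_table)
print(auto_result)
print(table_result)
```

In this toy input the similarity pass merges "earthquakes" into "earthquake" but leaves "quake" as a separate term, which is a missed merge. The manual table catches all three, at the cost of someone having to write every entry by hand.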
