Computer Science & Technology

LDA Subtopic Detection Algorithm with Background Noise Restraint

  • 1. Institute of Computing Technology // Key Laboratory of Network Data Science and Technology, Chinese Academy of Sciences, Beijing 100190, China; 2. National Computer Network Emergency Response Technical Team Coordination Center of China, Beijing 100029, China
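LI Jing-yuan (born 1982), male, senior engineer, mainly engaged in research on information propagation in online social networks and information security.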

Received date: 2016-12-07

  Online published: 2017-02-02

Supported by

Supported by the National Natural Science Foundation of China (61303244, 61572473, 61572469, 61402442, 61402022, 61370132) and the National 242 Project of Information Security (2015F114)

Abstract

A special article set is a collection of articles that share common background knowledge. To detect subtopics more effectively from a special article set with complex information correlation, an LDA subtopic detection algorithm with background noise restraint, named BLDA, is proposed. It improves the precision of subtopic detection by first extracting the common background knowledge and then reproducing the keywords in each iteration step. A series of experiments on a set of WeChat documents from public accounts shows that the detection results obtained by BLDA are much better than those obtained by LDA, with a topic recall rate increment of about 170%, a Purity index increment of 143%, and an NMI index increment of 160%.
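The abstract reports Purity and NMI without defining them. As a rough illustration only, the sketch below computes both indices from predicted subtopic labels and reference labels using their common textbook definitions. The function names are hypothetical, and the NMI normalization (arithmetic mean of the two entropies) is an assumption; the paper may use a different variant.

```python
from collections import Counter
from math import log


def purity(pred, truth):
    """Share of documents that belong to the majority true class of their cluster."""
    best = {}
    for (cluster, _label), count in Counter(zip(pred, truth)).items():
        best[cluster] = max(best.get(cluster, 0), count)
    return sum(best.values()) / len(pred)


def entropy(labels):
    """Shannon entropy (natural log) of a label assignment."""
    n = len(labels)
    return -sum((k / n) * log(k / n) for k in Counter(labels).values())


def nmi(pred, truth):
    """Mutual information normalized by the mean of the two entropies (assumed variant)."""
    n = len(pred)
    pred_sizes, truth_sizes = Counter(pred), Counter(truth)
    mi = 0.0
    for (c, t), k in Counter(zip(pred, truth)).items():
        mi += (k / n) * log(k * n / (pred_sizes[c] * truth_sizes[t]))
    denom = (entropy(pred) + entropy(truth)) / 2
    # Both assignments are a single group: treat them as identical.
    return mi / denom if denom > 0 else 1.0
```

Both scores lie in [0, 1]; a clustering that matches the reference labels exactly scores 1 on each, while a single catch-all cluster gets an NMI of 0.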

Cite this article

LI Jing-yuan, QIU Zhi-jie, LIU Yue, CHENG Xue-qi, REN Yan. LDA Subtopic Detection Algorithm with Background Noise Restraint[J]. Journal of South China University of Technology (Natural Science), 2017, 45(3): 54-60. DOI: 10.3969/j.issn.1000-565X.2017.03.008
