Journal of South China University of Technology (Natural Science Edition) ›› 2017, Vol. 45 ›› Issue (3): 54-60. doi: 10.3969/j.issn.1000-565X.2017.03.008

• Computer Science & Technology •

LDA Subtopic Detection Algorithm with Background Noise Restraint

LI Jing-yuan¹, QIU Zhi-jie¹, LIU Yue¹, CHENG Xue-qi¹, REN Yan²

  1. Institute of Computing Technology / Key Laboratory of Network Data Science and Technology, Chinese Academy of Sciences, Beijing 100190, China; 2. National Computer Network Emergency Response Technical Team Coordination Center of China, Beijing 100029, China
  • Received:2016-12-07 Online:2017-03-25 Published:2017-02-02
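  • Contact: LI Jing-yuan (born 1982), male, senior engineer, mainly engaged in research on information dissemination in online social networks and information security. E-mail: lijingyuan@ict.ac.cn
  • About author: LI Jing-yuan (born 1982), male, senior engineer, mainly engaged in research on information dissemination in online social networks and information security.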
  • Supported by:
    Supported by the National Natural Science Foundation of China (61303244, 61572473, 61572469, 61402442, 61402022, 61370132) and the National 242 Project of Information Security (2015F114)

Abstract: A special article set is a collection of articles that share common background knowledge. To detect subtopics more effectively in special article sets with complex information correlation, an LDA subtopic detection algorithm with background noise restraint, named BLDA, is proposed. It improves the precision of subtopic detection by first extracting the common background knowledge and then regenerating the keywords in each iteration step. A series of experiments on a set of WeChat documents from public accounts shows that the detection results obtained by BLDA are much better than those obtained by LDA, with increases of about 170% in topic recall rate, 143% in the Purity index, and 160% in the NMI index.
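The abstract reports Purity and NMI but does not say how they were computed. As a rough illustration only, the sketch below uses the standard textbook definitions, assuming each document has one predicted subtopic and one reference subtopic, and assuming NMI is normalized by the geometric mean of the two entropies (other normalizations exist and the paper may use a different one). The function names `purity`, `entropy`, and `nmi` are hypothetical and do not come from the paper.

```python
from collections import Counter
from math import log, sqrt


def purity(pred, gold):
    """Fraction of documents that fall in the majority reference class of their cluster."""
    clusters = {}
    for p, g in zip(pred, gold):
        clusters.setdefault(p, Counter())[g] += 1
    return sum(max(c.values()) for c in clusters.values()) / len(gold)


def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def nmi(pred, gold):
    """Mutual information divided by sqrt(H(pred) * H(gold)); this normalization is an assumption."""
    n = len(gold)
    joint = Counter(zip(pred, gold))
    p_cnt, g_cnt = Counter(pred), Counter(gold)
    mi = sum(
        (c / n) * log(c * n / (p_cnt[p] * g_cnt[g]))
        for (p, g), c in joint.items()
    )
    denom = sqrt(entropy(pred) * entropy(gold))
    return mi / denom if denom > 0 else 0.0


if __name__ == "__main__":
    predicted = [0, 0, 1, 1, 1, 2]            # toy cluster assignments
    reference = ["a", "a", "b", "b", "a", "c"]  # toy reference subtopics
    print(purity(predicted, reference), nmi(predicted, reference))
```

Both scores lie in [0, 1]. A clustering that matches the reference labels up to renaming scores 1 on both, while putting every document in a single cluster gives an NMI of 0 under this definition.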

Key words: subtopic mining, latent Dirichlet allocation (LDA), background noise restraint