Journal of South China University of Technology (Natural Science Edition), 2016, Vol. 44, Issue (5): 130-136. doi: 10.3969/j.issn.1000-565X.2016.05.020

• Computer Science & Technology •

OLDA-Based Model for Hot Topic Evolution and Tracking

CHEN Xing-shu, GAO Yue, JIANG Hao, DU Min, WANG Hai-zhou, HE Jian-yun

  1. College of Computer//Network and Trusted Computing Institute, Sichuan University, Chengdu 610065, Sichuan, China
  • Received: 2015-02-11 Revised: 2015-08-21 Online: 2016-05-25 Published: 2016-04-12
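  • Contact: WANG Hai-zhou (born 1986), male, lecturer; research interests: data mining, computer networks, and distributed systems. E-mail: whzh.nc@scu.edu.cn
  • About author: CHEN Xing-shu (born 1969), female, professor and doctoral supervisor; research interests: information security and cloud computing security. E-mail: chenxsh@scu.edu.cn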
  • Supported by:
    Supported by the National Science and Technology Support Program of China (2012BAH18B05) and the National Natural Science Foundation of China (61272447)

Abstract: To detect and track topics of interest in massive forum data, the LDA (Latent Dirichlet Allocation) topic model is first used to reduce the dimensionality of the text data from word space to semantic space. Second, a clustering algorithm is employed to cluster the forum data in semantic space. A detection method is then proposed to obtain hot topics. On this basis, the HTOLDA (Hot-Topic OLDA) topic model is proposed as an extension of the OLDA (Online LDA) topic model: it delivers priors for the selected hot topics and uses the semantic distance between the same topic on adjacent time slices to judge topic status. Experimental results show that the HTOLDA topic model is superior to the OLDA topic model in modeling each time slice, and that it effectively evolves and tracks hot topics in online forums.
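
The idea of judging topic status from the semantic distance between the same topic on adjacent time slices can be illustrated with a minimal Python sketch. The abstract does not say which distance measure, threshold, or status labels the authors use, so the Jensen-Shannon divergence, the `threshold` value, and the labels `"continuing"` and `"drifted"` are all assumptions made for illustration only. Each topic is represented as a word-probability distribution over a shared vocabulary.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in bits; terms where p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2): 0 for identical distributions,
    1 for distributions with disjoint support.
    Assumed stand-in for the paper's 'semantic distance'."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def topic_status(prev_topic, curr_topic, threshold=0.5):
    """Compare one topic's word distribution on two adjacent time slices.
    `threshold` and the returned labels are hypothetical, not from the paper."""
    distance = js_divergence(prev_topic, curr_topic)
    label = "continuing" if distance < threshold else "drifted"
    return label, distance


# Topic-word distributions for the same topic on slices t-1 and t.
slice_prev = [0.7, 0.2, 0.1]
slice_curr = [0.6, 0.3, 0.1]
label, distance = topic_status(slice_prev, slice_curr)
print(label, round(distance, 4))
```

A small distance suggests the topic carries over largely unchanged from one slice to the next, while a large distance suggests its vocabulary has shifted. In practice the cut-off would have to be tuned on real forum data.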

Key words: text processing, LDA topic model, topic evolution, topic tracking, HTOLDA topic model

CLC Number: