Journal of South China University of Technology (Natural Science Edition) ›› 2021, Vol. 49 ›› Issue (1): 18-28. doi: 10.12141/j.issn.1000-565X.200489

Special Issue: 2021 Computer Science and Technology

• Computer Science & Technology •

Rumor Identification in Major Sudden Epidemic Situation 

LIU Kan¹, HUANG Zheying²

  1. School of Information and Safety Engineering, Zhongnan University of Economics and Law, Wuhan 430073, Hubei, China; 2. School of Business, Nankai University, Tianjin 300071, China
  • Received: 2020-08-14 Revised: 2020-10-09 Online: 2021-01-25 Published: 2021-01-01
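  • Contact: LIU Kan (born 1970), male, Ph.D., professor; his research focuses on machine learning and data mining, social networks, and public opinion analysis. E-mail: liukan@zuel.edu.cn
  • About author: LIU Kan (born 1970), male, Ph.D., professor; his research focuses on machine learning and data mining, social networks, and public opinion analysis.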
  • Supported by: the General Program of the National Natural Science Foundation of China (71573196)

Abstract: Since the outbreak of the COVID-19 epidemic, related rumors have spread rampantly. Traditional rumor identification models struggle with epidemic rumors because the available set of epidemic rumors is too small to train a good classification model. Building a rumor identification model from a small amount of epidemic rumor data is therefore an urgent task. To address the shortage of training data, text enhancement and generative adversarial network (GAN) methods were used to generate a large amount of text similar to epidemic rumors and thereby improve identification performance. First, the textual characteristics of epidemic rumors were analyzed to extract their keywords. Second, an epidemic rumor generation model was constructed based on the idea of GAN: historical rumors that do not contain epidemic rumor features were textually enhanced with an epidemic rumor feature thesaurus, generating a large amount of new rumor data that contain epidemic rumor features. Finally, the newly generated rumor data were combined with the epidemic rumor data to train a more accurate epidemic rumor classification model. Experimental results show that rumor identification performance improves by 3% after the training set is extended with GAN-generated data. The new model clearly outperforms traditional machine learning and deep learning algorithms, and it provides a new approach to rumor identification in public health emergencies.
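
The first two steps of the abstract (keyword extraction and thesaurus-based text enhancement) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names `extract_keywords` and `enhance`, the frequency-ratio scoring, the toy English sentences, and the hand-built thesaurus are not taken from the paper, and the GAN component itself is not shown.

```python
import random
from collections import Counter


def extract_keywords(epidemic_texts, background_texts, top_k=2):
    """Rank words by how much more often they occur in epidemic rumors
    than in background (historical) rumors.

    The smoothed frequency ratio is a stand-in for whatever keyword
    scoring the authors actually used.
    """
    epi = Counter(w for t in epidemic_texts for w in t.lower().split())
    bg = Counter(w for t in background_texts for w in t.lower().split())
    scores = {w: c / (1 + bg[w]) for w, c in epi.items()}
    # Sort by descending score, then alphabetically to break ties.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [w for w, _ in ranked[:top_k]]


def enhance(text, thesaurus, rng):
    """Replace every word that has a thesaurus entry with a randomly
    chosen epidemic feature word; leave all other words unchanged."""
    out = []
    for word in text.split():
        choices = thesaurus.get(word.lower())
        out.append(rng.choice(choices) if choices else word)
    return " ".join(out)


# Toy data, invented for this example only.
epidemic_rumors = [
    "the virus spreads through masks",
    "the virus is cured by garlic",
    "masks cause the virus",
]
historical_rumors = [
    "the flu is cured by salt",
    "eating salt prevents the flu",
]

keywords = extract_keywords(epidemic_rumors, historical_rumors)

# Hypothetical feature thesaurus mapping generic terms to epidemic terms.
thesaurus = {"flu": ["virus"], "salt": ["garlic"]}
enhanced = enhance(historical_rumors[0], thesaurus, random.Random(0))
```

In a pipeline like the one described, sentences such as `enhanced` would be added to the small epidemic rumor set before the classifier is trained; how the generator and discriminator interact with this step is not specified in the abstract.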

Key words: COVID-19 epidemic, rumor identification, generation model, text enhancement
