Journal of South China University of Technology (Natural Science Edition) ›› 2011, Vol. 39 ›› Issue (4): 21-25. doi: 10.3969/j.issn.1000-565X.2011.04.004

• Computer Science & Technology •

Prediction of Search Data Volume Based on Time-Series Clustering and ARMA Models

Sun Cheng-jie  Liu Feng  Lin Lei  Liu Bing-quan   

  1. School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, Heilongjiang, China
  • Received: 2011-01-10 Online: 2011-04-25 Published: 2011-03-01
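  • Contact: Sun Cheng-jie (born 1980), male, Ph.D., lecturer; main research interest: text mining. E-mail: cjsun@insun.hit.edu.cn
  • About author: Sun Cheng-jie (born 1980), male, Ph.D., lecturer; main research interest: text mining.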
  • Supported by:

    National Natural Science Foundation of China (60973076, 61073127); Fundamental Research Funds for the Central Universities, Harbin Institute of Technology (HIT.NSRIF.2010045)

Abstract:

To guide adjustments to product development and business strategy through the prediction and analysis of search data volume, search-volume data are organized into time series, which are modeled and forecast with autoregressive moving average (ARMA) models. The set of time series is then clustered; each cluster center is modeled with an ARMA model, and the other series in the same cluster are fitted approximately with that model to obtain predicted values. After data preprocessing, similarity analysis, similarity-based clustering, and time-series prediction, the predicted search data volume is compared with the actual volume. Experimental results show that modeling similar time series with the same ARMA model is feasible and accurate. In addition, the clustering results indicate that the search data volumes of products of the same brand tend to fall into the same cluster, which provides a reference for mining relationships among search terms.

Key words: time series, search data volume, ARMA model, dynamic time-warping distance, k-medoid algorithm
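
The keywords name dynamic time-warping (DTW) distance and the k-medoid algorithm as the similarity measure and clustering method. The following is a minimal Python sketch of how the two might be combined for the clustering step, under stated assumptions: the function names `dtw_distance` and `k_medoids`, the absolute-difference local cost, the deterministic initialization with the first k series, and the toy data are all illustrative choices, not details taken from the paper.

```python
def dtw_distance(a, b):
    """Classic DTW distance between two numeric sequences.

    Assumption: local cost is the absolute difference |a[i] - b[j]|,
    with no warping-window constraint.
    """
    inf = float("inf")
    m = len(b)
    prev = [inf] * (m + 1)
    prev[0] = 0.0  # empty prefix aligned with empty prefix
    for i in range(1, len(a) + 1):
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of: insertion, deletion, match
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


def assign(dist, medoids):
    """Label each series with the index of its nearest medoid."""
    return [
        min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
        for i in range(len(dist))
    ]


def k_medoids(series, k, max_iter=100):
    """Alternating k-medoids over a precomputed DTW distance matrix.

    Assumption: medoids are initialized to the first k series so that the
    result is deterministic; a real run would likely try several starts.
    Returns (medoid indices, cluster label per series).
    """
    n = len(series)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = dtw_distance(series[i], series[j])

    medoids = list(range(k))
    for _ in range(max_iter):
        labels = assign(dist, medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:  # keep the old medoid if a cluster empties
                new_medoids.append(medoids[c])
                continue
            # the medoid minimizes total distance to its cluster members
            new_medoids.append(
                min(members, key=lambda cand: sum(dist[cand][x] for x in members))
            )
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(dist, medoids)


# Toy data (hypothetical): rising and falling series of unequal length.
toy_series = [
    [1, 2, 3, 4, 5],
    [5, 4, 3, 2, 1],
    [1, 1, 2, 3, 4, 5],
    [5, 5, 4, 3, 2, 1],
    [1, 2, 3, 4, 5, 5],
]
toy_medoids, toy_labels = k_medoids(toy_series, k=2)
print(toy_medoids, toy_labels)
```

In a pipeline like the one the abstract describes, an ARMA model would then be fitted to each medoid series and reused for the other members of its cluster; that step is omitted here because it depends on a time-series library and on model orders that the abstract does not specify.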