Journal of South China University of Technology (Natural Science Edition) ›› 2017, Vol. 45 ›› Issue (3): 48-53. doi: 10.3969/j.issn.1000-565X.2017.03.007

• Computer Science & Technology •

Resource Selection Algorithm on the Basis of Topic Model

DONG Shou-bin, XIE Yi-fan, YUAN Hua, CHEN Jian-hao

  1. School of Computer Science and Engineering // Computation & Computer Network Laboratory of Guangdong Province, South China University of Technology, Guangzhou 510006, Guangdong, China
  • Received: 2016-11-27 Online: 2017-03-25 Published: 2017-02-02
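  • Contact: YUAN Hua (born 1969), female, Ph.D., associate professor, mainly engaged in information retrieval research. E-mail: hyuan@scut.edu.cn
  • About author: DONG Shou-bin (born 1967), female, Ph.D., professor, mainly engaged in research on information retrieval and high-performance computing. E-mail: sbdong@scut.edu.cn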
  • Supported by:
    The Significant Fundamental Cultivate Project of Guangdong Province Natural Science Foundation (2015A030308017) and the Scientific Research Joint Funds of Ministry of Education of China and China Mobile (MCM20150512)

Abstract: In a federated search environment built on multiple real search engines, the small-document approach cannot accurately estimate the number of documents indexed by each engine during resource description, which may degrade the performance of resource selection methods. To address this problem, a resource library description method based on a topic model is proposed, which uses the LDA topic model to obtain the description words of each resource library. A new resource selection algorithm is then proposed that combines vertical weights and word vectors to compute the relevance between each resource library and the query, and produces the final resource selection results according to that relevance. Experimental results show that the proposed topic-model-based resource selection algorithm improves resource selection performance and can be applied effectively in a federated search environment of distributed search engines.
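The scoring step outlined in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual formula, so everything below is an assumption made for illustration: the description words are supplied by hand rather than learned with LDA, the word vectors are toy two-dimensional values, relevance is taken as the cosine similarity between averaged word vectors, and the vertical weight is applied as a simple multiplier. The names `mean_vector`, `cosine`, `rank_resources`, `sports_engine`, and `finance_engine` are hypothetical.

```python
import math


def mean_vector(words, vectors):
    """Average the vectors of the words that have an embedding (None if none do)."""
    known = [vectors[w] for w in words if w in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_resources(query_words, descriptions, vertical_weights, vectors):
    """Rank resource libraries by (assumed) vertical weight * vector similarity."""
    query_vec = mean_vector(query_words, vectors)
    scores = {}
    for name, words in descriptions.items():
        desc_vec = mean_vector(words, vectors)
        if query_vec is None or desc_vec is None:
            similarity = 0.0
        else:
            similarity = cosine(query_vec, desc_vec)
        # Assumption: the vertical weight scales the similarity multiplicatively.
        scores[name] = vertical_weights.get(name, 1.0) * similarity
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data, invented for illustration only.
vectors = {
    "football": [1.0, 0.0],
    "match": [0.9, 0.1],
    "stock": [0.0, 1.0],
    "market": [0.1, 0.9],
}
# In the paper these description words would come from an LDA topic model.
descriptions = {
    "sports_engine": ["football", "match"],
    "finance_engine": ["stock", "market"],
}
vertical_weights = {"sports_engine": 1.0, "finance_engine": 1.0}

ranking = rank_resources(["football"], descriptions, vertical_weights, vectors)
print(ranking[0][0])
```

With these toy values the query "football" ranks the hypothetical `sports_engine` first. How the paper actually combines the vertical weight with the word-vector similarity may differ from the product used here.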

Key words: distributed search, resource selection, topic model, vertical domain, word vector