Computer Science and Technology

A Keyword-Based Method for Diversification of Web Search Results

  • School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006, Guangdong, China
林古立 (born 1984), male, Ph.D. candidate; his research interests include information retrieval, data mining, and machine learning.

Received: 2010-06-21

Revised: 2010-07-10

Published online: 2011-04-01

Supported by

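Natural Science Foundation of Guangdong Province (07006474, 9451064101003233); Key Science and Technology Project of Guangdong Province (2007B010200044); Fundamental Research Funds for the Central Universities, South China University of Technology (2009ZM0125, 2009ZM0189)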

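Cite this article

林古立, 彭宏, 马千里, 韦佳, 覃姜维. A Keyword-Based Method for Diversification of Web Search Results[J]. Journal of South China University of Technology (Natural Science Edition), 2011, 39(5): 102-107. DOI: 10.3969/j.issn.1000-565X.2011.05.018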

Abstract

Diversification of Web search results has increasingly become an important factor in improving Web search efficiency and user satisfaction. In this paper, the diversification problem is formalized as the maximization of facet coverage, and a keyword-based diversification method named KDM is proposed. KDM first extracts, from the retrieved documents related to the user's query, keywords that describe the facets those documents contain. It then calculates the facet novelty of each document according to the co-occurrence of the keywords and their ability to describe the documents. Finally, it re-ranks the documents by combining novelty and relevance, so as to provide users with diversified search results. Experimental results indicate that KDM outperforms other existing methods in diversification performance.
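The re-ranking idea in the abstract can be illustrated with a minimal Python sketch. It is not the KDM algorithm itself: the abstract gives no formulas, so the novelty measure (weighted share of a document's keywords not yet covered by higher-ranked documents), the linear trade-off parameter `lam`, and the function name `rerank` are all assumptions made for illustration. Keyword extraction and keyword co-occurrence are not modeled; each document simply arrives with a relevance score and a hypothetical keyword-to-weight map standing in for "description ability".

```python
def rerank(docs, lam=0.5):
    """Greedy diversified re-ranking (illustrative sketch, not the paper's KDM).

    docs: list of (doc_id, relevance, {keyword: weight}) tuples.
    lam:  assumed trade-off; 1.0 ranks by relevance only, 0.0 by novelty only.
    """
    covered = set()          # keywords (facets) already shown to the user
    remaining = list(docs)
    ranking = []

    def score(doc):
        _, relevance, keywords = doc
        total = sum(keywords.values())
        # Assumed novelty: weighted fraction of this document's keywords
        # that no higher-ranked document has covered yet.
        new = sum(w for k, w in keywords.items() if k not in covered)
        novelty = new / total if total else 0.0
        return lam * relevance + (1 - lam) * novelty

    while remaining:
        best = max(remaining, key=score)
        ranking.append(best[0])
        covered.update(best[2])   # the chosen document's keywords are now covered
        remaining.remove(best)
    return ranking


# Toy data for the ambiguous query "jaguar" (made up for this example).
docs = [
    ("d1", 0.9, {"jaguar": 1.0, "car": 2.0}),
    ("d2", 0.8, {"car": 2.0, "dealer": 1.0}),
    ("d3", 0.7, {"jaguar": 1.0, "animal": 2.0}),
]
print(rerank(docs, lam=0.5))  # ['d1', 'd3', 'd2']
print(rerank(docs, lam=1.0))  # ['d1', 'd2', 'd3']
```

With `lam=0.5`, the less relevant document `d3` moves ahead of `d2` because it introduces the uncovered "animal" facet, whereas `d2` mostly repeats the "car" facet already covered by `d1`. With `lam=1.0` the ordering falls back to pure relevance.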
