Journal of South China University of Technology (Natural Science Edition), 2009, Vol. 37, Issue (1): 135-139.

• Computer Science & Technology •

Design and Implementation of High-Performance FTP Search Engine

Guo Li-li¹  Zhao Chun-jiang²

  1. College of Engineering, China Agricultural University, Beijing 100083, China; 2. National Engineering Research Center for Information Technology in Agriculture, Beijing 100037, China
  • Received:2007-11-12 Revised:2008-03-04 Online:2009-01-25 Published:2009-01-25
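  • Contact: Zhao Chun-jiang (born 1964), male, research fellow, mainly engaged in research on agricultural information technology and precision agriculture technology systems. E-mail: zhaocj@nercita.org.cn; holypku@163.com
  • About author: Guo Li-li (born 1971), male, Ph.D., mainly engaged in research on large-scale search engines and personalized user services.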
  • Supported by:

    National High-Tech R&D Program of China ("863" Program) (2006AA10Z239); National Science and Technology Support Program of China (2006BAH02A16)

Abstract:

To improve the quality of query results in traditional FTP search engines, which do little to optimize those results, a high-performance intelligent FTP search engine is designed on the basis of a statistical analysis of FTP user query logs. The engine combines three techniques. A double-byte inverted index builds the inverted index table from every two-byte unit of each file name. Automatic classification of query results organizes them into a tree structure according to the search topic. Automatic error correction builds a spelling-mistake dictionary from the high-frequency search keywords in the user query logs. Query results from a real system indicate that the proposed scheme greatly improves the query efficiency and quality of an FTP search engine.

Key words: File Transfer Protocol, search engine, inverted index, automatic classification, automatic error correction
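
The double-byte inverted index mentioned in the abstract can be illustrated with a minimal sketch. The abstract gives no implementation details, so everything here is an assumption made for illustration: the class name `DoubleByteIndex`, the helper `double_byte_units`, the use of overlapping two-byte windows, the UTF-8 encoding, the case folding, and the final substring check that filters false candidates. It is not the authors' implementation.

```python
from collections import defaultdict


def double_byte_units(text: str, encoding: str = "utf-8") -> list[bytes]:
    """Return every overlapping two-byte window of the encoded, lowercased text.

    Assumption: overlapping windows and UTF-8; the paper may use another scheme.
    """
    data = text.lower().encode(encoding)
    return [data[i:i + 2] for i in range(len(data) - 1)]


class DoubleByteIndex:
    """Hypothetical inverted index keyed by two-byte units of file names."""

    def __init__(self) -> None:
        self.names: list[str] = []
        # Maps each two-byte unit to the set of file ids containing it.
        self.postings: dict[bytes, set[int]] = defaultdict(set)

    def add(self, name: str) -> int:
        """Index one file name and return its id."""
        file_id = len(self.names)
        self.names.append(name)
        for unit in double_byte_units(name):
            self.postings[unit].add(file_id)
        return file_id

    def search(self, query: str) -> list[str]:
        """Return the sorted file names that contain the query as a substring."""
        units = double_byte_units(query)
        if units:
            # Intersect the posting sets, smallest first, to get candidates.
            sets = sorted((self.postings.get(u, set()) for u in units), key=len)
            candidates = set.intersection(*sets)
        else:
            # Query shorter than two bytes: fall back to scanning all names.
            candidates = set(range(len(self.names)))
        # Candidates only share the units; confirm the real substring match.
        needle = query.lower()
        return sorted(
            self.names[i] for i in candidates if needle in self.names[i].lower()
        )


index = DoubleByteIndex()
index.add("ubuntu-8.04-desktop.iso")
index.add("Linux_Kernel.tar.gz")
index.add("音乐合集.rar")
print(index.search("linux"))
print(index.search("音乐"))
```

Splitting on byte pairs rather than on words lets the same index serve both ASCII and multi-byte (for example Chinese) file names without a word segmenter, which is one plausible reason for the design described in the abstract.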