Journal of South China University of Technology (Natural Science Edition) ›› 2025, Vol. 53 ›› Issue (9): 59-67. doi: 10.12141/j.issn.1000-565X.240499

• Computer Science & Technology •

Information Retrieval Re-ranking Method Based on Bidirectional Text Expansion

TU Xinhui  GUO Cong  ZONG Yuhang 

  

  1. School of Computer Science, Central China Normal University, Wuhan 430079, Hubei, China
  • Online:2025-09-25 Published:2025-01-17

Abstract: With the rapid development of large language models, text matching and representation techniques in information retrieval have made significant progress. Query expansion and document expansion are two important methods for enhancing text representation in information retrieval, and current mainstream text expansion methods all rely on large language models. However, text generated by large language models differs considerably from human-written text in linguistic diversity and style. This difference may distort the calculation of query-document relevance and ultimately reduce the accuracy of the whole retrieval process. To address this problem, the paper proposes an information retrieval re-ranking method based on bidirectional text expansion (BTE). First, zero-shot prompting is used to have a large language model generate pseudo-queries for documents and pseudo-documents for queries. Next, the semantic similarity between the pseudo-queries and the pseudo-documents is computed. Finally, the original query-document similarity scores and the pseudo-query-pseudo-document similarity scores are combined by weighted fusion to obtain the final document ranking. Experiments on two publicly available datasets, DL19 and DL20, demonstrate that BTE significantly outperforms the baseline model on several evaluation metrics, including NDCG@10, P@10 and MRR@10. The proposed bidirectional text expansion method therefore strengthens relevance matching between queries and documents and improves the performance of the overall information retrieval system.
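
The final fusion step can be illustrated with a minimal sketch. The abstract does not give the fusion formula, so the linear interpolation below, the weight `alpha`, the function name `fuse_scores`, and the toy scores are all assumptions made for illustration, not the authors' implementation. The sketch also assumes that both similarity scores have already been computed for each candidate document and are on comparable scales.

```python
from typing import Dict, List, Tuple


def fuse_scores(
    original: Dict[str, float],
    expanded: Dict[str, float],
    alpha: float = 0.5,
) -> List[Tuple[str, float]]:
    """Re-rank documents by a weighted sum of two relevance scores.

    original: query-document similarity per document id.
    expanded: pseudo-document vs. pseudo-query similarity per document id.
    alpha:    hypothetical weight on the original score (assumed in [0, 1]).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    fused = {
        # A document missing an expanded score contributes 0.0 for that term.
        doc_id: alpha * score + (1.0 - alpha) * expanded.get(doc_id, 0.0)
        for doc_id, score in original.items()
    }
    # Highest fused score first.
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


# Toy scores for three candidate documents (illustrative values only).
original_scores = {"d1": 0.80, "d2": 0.60, "d3": 0.40}
expanded_scores = {"d1": 0.20, "d2": 0.90, "d3": 0.50}
ranking = fuse_scores(original_scores, expanded_scores, alpha=0.5)
```

With these toy values, `d2` moves ahead of `d1` because its expanded-text score is much higher. Setting `alpha=1.0` recovers the ranking given by the original scores alone.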

Key words: information retrieval, large language model, query expansion, document expansion