Journal of South China University of Technology(Natural Science Edition)

• Computer Science & Technology •

Text Semantic Similarity Model Based on Ranking Distillation and Difference Prediction

CAI Xiaodong, TAN Yuanhao

  1. School of Information and Communication, Guilin University of Electronic Technology, Guilin 541004, Guangxi, China
  • Published: 2026-01-23

Abstract:

In text semantic similarity models based on unsupervised contrastive learning, existing approaches often simply divide texts into positive and negative samples, and training attends only to the overall features of each text. This design has two clear limitations: on one hand, it ignores fine-grained ranking relationships between texts, making it difficult to distinguish graded differences in similarity; on the other hand, the model is insensitive to semantic changes between sentences and therefore cannot accurately capture inter-text similarity. To explore fine-grained relationships between samples and enhance the model's ability to perceive semantic changes, this paper proposes a text semantic similarity model based on ranking distillation and difference prediction. First, coarse-grained ranking features are extracted from a pre-trained teacher model and distilled into the student model, enabling the student to capture fine-grained ranking features. Second, a difference prediction auxiliary network is designed: the original text is randomly masked to obtain a masked text, a generator then produces a reconstructed text, and finally a discriminator predicts the differences between the original and reconstructed texts, allowing the model to perceive semantic changes between the original and masked texts. Experimental results on the semantic textual similarity datasets STS12-STS16, STS-B, and SICK-R show that, compared with state-of-the-art models, the Spearman correlation coefficient improves on average by 1.16% with a BERT-base backbone and by 0.82% with a RoBERTa-base backbone, demonstrating the effectiveness of the proposed model.
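The two components described in the abstract can be illustrated with a minimal, framework-free sketch. This is an assumption-laden illustration, not the paper's implementation: it assumes a ListNet-style KL objective for the ranking distillation step and an ELECTRA-style replaced-token labeling scheme for the difference prediction step; the function names and temperature parameter are invented for exposition.

```python
import math

def softmax(scores, temperature=1.0):
    # Turn raw similarity scores into a probability distribution.
    exps = [math.exp(s / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def ranking_distillation_loss(teacher_scores, student_scores, temperature=2.0):
    """Hypothetical ListNet-style distillation objective: KL divergence
    between the teacher's and the student's softmax distributions over
    the similarity scores of the same candidate set, so the student
    inherits the teacher's fine-grained ranking of the candidates."""
    p = softmax(teacher_scores, temperature)
    q = softmax(student_scores, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def difference_labels(original_tokens, reconstructed_tokens):
    """ELECTRA-style targets for the discriminator: 1 at positions
    where the generator's reconstruction differs from the original."""
    return [int(o != r) for o, r in zip(original_tokens, reconstructed_tokens)]

# Toy usage: three candidate texts scored by teacher and student,
# and a reconstruction that changed one token of the original.
teacher = [0.9, 0.6, 0.2]
student = [0.8, 0.7, 0.1]
loss = ranking_distillation_loss(teacher, student)
labels = difference_labels(["the", "cat", "sat"], ["the", "dog", "sat"])
```

In this sketch the loss is zero only when the student reproduces the teacher's ranking distribution exactly, and the 0/1 labels supervise the discriminator to spot where the reconstructed text diverges from the original.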

Key words: deep learning, semantic similarity, contrastive learning, distillation learning