Text Semantic Similarity Model Based on Ranking Distillation and Difference Prediction

doi:10.12141/j.issn.1000-565X.250448

Abstract

Text semantic similarity models based on unsupervised contrastive learning usually just split texts into positive and negative samples, and their training focuses only on the overall features of each text. This design has two clear limitations. First, it ignores the fine-grained ranking features among texts, so the model struggles to distinguish graded degrees of similarity. Second, the model is insensitive to semantic changes between sentences, so it cannot capture the similarity between texts accurately. To mine the fine-grained relationships among samples and strengthen the model's sensitivity to semantic change, this paper proposes a text semantic similarity model based on ranking distillation and difference prediction. First, coarse-grained ranking features are extracted from a pre-trained teacher model and distilled into the student model, enabling the student to capture fine-grained ranking features. Second, a difference prediction auxiliary network is designed: the original text is randomly masked to obtain a masked text, a generator then produces a reconstructed text, and a discriminator finally predicts where the original and reconstructed texts differ, so that the model learns to perceive the semantic change between the original and masked texts. Experiments on the semantic textual similarity datasets STS12-STS16, STS-B, and SICK-R show that, compared with advanced existing models, the average Spearman correlation coefficient improves by 1.16% with BERT-base and by 0.82% with RoBERTa-base, demonstrating the effectiveness of the proposed model.

Key words: deep learning, semantic similarity, contrastive learning, distillation learning
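
The two training signals described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact loss functions, so everything here is an assumption: the ranking distillation term is written as a ListNet-style listwise KL divergence between teacher and student similarity distributions, the temperature value is arbitrary, and the function names (`ranking_distillation_loss`, `difference_labels`) are hypothetical. The second function only builds the per-token targets a discriminator would be trained on; it is not the authors' network.

```python
import math


def log_softmax(scores, temperature=1.0):
    """Log of the softmax distribution over a list of similarity scores."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    log_total = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    return [s - log_total for s in scaled]


def ranking_distillation_loss(teacher_scores, student_scores, temperature=0.05):
    """Listwise KL(teacher || student) over one anchor's candidate sentences.

    Assumption: a ListNet-style loss; the source does not name the loss,
    and the temperature of 0.05 is an illustrative choice.
    """
    log_p = log_softmax(teacher_scores, temperature)
    log_q = log_softmax(student_scores, temperature)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def difference_labels(original_tokens, reconstructed_tokens):
    """Discriminator targets: 1 where the reconstructed token differs, else 0."""
    if len(original_tokens) != len(reconstructed_tokens):
        raise ValueError("token sequences must be aligned")
    return [int(o != r) for o, r in zip(original_tokens, reconstructed_tokens)]


# Hypothetical cosine similarities of one anchor sentence to three candidates.
teacher = [0.9, 0.5, 0.1]
student = [0.6, 0.7, 0.2]  # the student ranks candidates 1 and 2 in the wrong order
loss = ranking_distillation_loss(teacher, student)  # positive when distributions differ

# Hypothetical original sentence and generator output after masking "cat".
labels = difference_labels(["a", "cat", "sat"], ["a", "dog", "sat"])  # [0, 1, 0]
```

Under these assumptions, the distillation term is zero when the student reproduces the teacher's similarity distribution and grows as the two rankings diverge, while the difference labels mark the positions the discriminator must learn to flag.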

CAI Xiaodong, TAN Yuanhao. Text Semantic Similarity Model Based on Ranking Distillation and Difference Prediction[J]. Journal of South China University of Technology (Natural Science Edition), doi: 10.12141/j.issn.1000-565X.250448.

