Journal of South China University of Technology (Natural Science Edition), 2025, Vol. 53, Issue (9): 11-21. doi: 10.12141/j.issn.1000-565X.240088

• Computer Science & Technology •

Deep Reinforcement Learning Recommendation Model Based on Multi-Interest Contrast

LIU Huiting1,2, LIU Shaoxiong1, WANG Jiale3, ZHAO Peng1   

  1. School of Computer Science and Technology, Anhui University, Hefei 230601, Anhui, China
  2. Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei 230088, Anhui, China
  3. Stony Brook Institute, Anhui University, Hefei 230039, Anhui, China
  • Received: 2024-02-27 Online: 2025-09-25 Published: 2025-04-27
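  • About author: LIU Huiting (born 1978), female, Ph.D., associate professor; her research focuses on natural language processing and personalized recommendation. E-mail: htliu@ahu.edu.cn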
  • Supported by:
    the National Natural Science Foundation of China (62576003); the University Synergy Innovation Program of Anhui Province (GXXT-2022-040); the Natural Science Foundation of Anhui Province (2008085MF219); the Provincial Natural Science Foundation of Anhui Higher Education Institution of China (KJ2021-A0040)

Abstract:

Deep Reinforcement Learning (DRL) is widely applied in recommender systems to model user interests dynamically and to maximize cumulative user benefit. However, the sparsity of user feedback remains a significant challenge for DRL-based recommendation algorithms. Contrastive learning, a self-supervised learning method, enhances user interest representations by constructing multiple views of the data, thereby alleviating the sparse-feedback problem. Existing contrastive learning methods typically rely on heuristic augmentation strategies, which often lose key information and fail to make full use of heterogeneous interaction data. To address these issues, this paper proposes a multi-interest oriented contrastive deep reinforcement learning recommendation (MOCIR) model. The model consists of two key modules: a contrastive representation module and a policy network module. The contrastive representation module uses a Heterogeneous Information Network (HIN) to model a user’s local interests from different aspects, and captures the user’s global interests from the raw interaction data. It then treats the global and local interests of the same user as positive pairs, and those of different users as negative pairs, for contrastive learning, which strengthens the user interest representation. The policy network module aggregates user state representations and generates recommendations. The two modules are trained with an alternating update mechanism. Experimental results on three benchmark datasets show that the proposed model outperforms several DRL-based models in recommendation performance and mitigates the problem of sparse user feedback.
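The positive/negative pairing described in the abstract can be illustrated with a small sketch. The abstract does not give the loss function, so the InfoNCE-style form, the cosine similarity, the `temperature` value, and all names below (`contrastive_loss`, `global_reprs`, `local_reprs`) are assumptions made for illustration only, not the paper's implementation. Row i of each list is taken to belong to user i.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def contrastive_loss(global_reprs, local_reprs, temperature=0.1):
    """InfoNCE-style loss (assumed form, not taken from the paper).

    For user i, (global_reprs[i], local_reprs[i]) is the positive pair;
    (global_reprs[i], local_reprs[j]) with j != i are the negative pairs.
    """
    n = len(global_reprs)
    total = 0.0
    for i in range(n):
        logits = [cosine(global_reprs[i], local_reprs[j]) / temperature
                  for j in range(n)]
        # Numerically stable log-sum-exp over all candidates for user i.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability assigned to the positive pair.
        total += log_denom - logits[i]
    return total / n

# Toy data: three users with hand-written 3-dimensional interest vectors.
global_reprs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
aligned_local = [[0.9, 0.1, 0.0], [0.0, 0.9, 0.1], [0.1, 0.0, 0.9]]
# Same local vectors, but assigned to the wrong users.
shuffled_local = [aligned_local[1], aligned_local[2], aligned_local[0]]

loss_aligned = contrastive_loss(global_reprs, aligned_local)
loss_shuffled = contrastive_loss(global_reprs, shuffled_local)
print(loss_aligned < loss_shuffled)
```

In this toy setting the loss is lower when each user's local interest vector is close to that user's own global vector than when the local vectors are assigned to the wrong users, which is the behaviour the pairing scheme is meant to encourage.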

Key words: multi-interest, reinforcement learning, contrastive learning, heterogeneous information network
