Journal of South China University of Technology (Natural Science Edition) ›› 2025, Vol. 53 ›› Issue (9): 11-21. doi: 10.12141/j.issn.1000-565X.240088

• Computer Science & Technology •

Multi-Interest Oriented Contrastive for Deep Reinforcement Learning-based Recommendation

LIU Huiting1,2, LIU Shaoxiong1, WANG Jiale1,3, ZHAO Peng1

  1. School of Computer Science and Technology, Anhui University, Hefei 230601, Anhui, China;

    2. Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei 230088, Anhui, China;

    3. Stony Brook Institute, Anhui University, Hefei 230039, Anhui, China

  • Online: 2025-09-25  Published: 2025-04-27

Abstract:

Deep reinforcement learning (DRL) algorithms have been incorporated into recommender systems to dynamically model users' interests and maximize cumulative rewards. However, data sparsity remains a challenge for most DRL-based interactive recommendation methods. Contrastive learning is a promising way to alleviate data sparsity, but most existing contrastive learning methods rely on heuristic augmentation strategies, which discard critical information and fail to fully exploit heterogeneous information. To address these shortcomings, we propose a multi-interest oriented contrastive approach for deep reinforcement learning-based recommendation (MOCIR). Specifically, we use heterogeneous information networks (HINs) to model different aspects of a user's local interests, and use the original interaction data to model the user's global interests. The proposed method comprises a contrastive learning module and a policy network. The contrastive learning module follows metapaths in the HIN to find aspect-specific neighbors for each item, aggregates them into item representations, and then treats a user's global and local interests as positive pairs for contrastive learning, thereby capturing the user's interests effectively. The policy network makes recommendations from the aggregated user state representation, and the two modules are updated jointly. Experiments on three benchmark datasets demonstrate that the proposed method outperforms its state-of-the-art counterparts.
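
The abstract does not give MOCIR's exact formulation, but the two ingredients it names, aggregating metapath-based neighbors into aspect-specific (local) item representations and contrasting a user's global and local interests as a positive pair, can be illustrated with a minimal PyTorch sketch. All names here are hypothetical, and the mean aggregation and InfoNCE-style objective are assumptions standing in for the paper's actual design choices:

    import torch
    import torch.nn.functional as F

    def aggregate_metapath_neighbors(item_emb: torch.Tensor,
                                     neighbor_ids: torch.Tensor) -> torch.Tensor:
        """Mean-aggregate the embeddings of neighbors reached via one metapath
        into an aspect-specific (local) representation.

        item_emb:     (num_items, d) item embedding table
        neighbor_ids: (batch, k) neighbor indices found along the metapath
        returns:      (batch, d) local representations
        """
        return item_emb[neighbor_ids].mean(dim=1)

    def info_nce(z_global: torch.Tensor, z_local: torch.Tensor,
                 temperature: float = 0.2) -> torch.Tensor:
        """InfoNCE-style loss: each user's (global, local) interest pair is
        the positive; all other in-batch pairings serve as negatives."""
        g = F.normalize(z_global, dim=-1)
        z = F.normalize(z_local, dim=-1)
        logits = g @ z.t() / temperature                    # (batch, batch)
        labels = torch.arange(g.size(0), device=g.device)   # diagonal = positives
        return F.cross_entropy(logits, labels)

    # Toy usage: 4 users, 16-dim interests, 3 metapath neighbors per item.
    item_emb = torch.randn(100, 16)
    neighbors = torch.randint(0, 100, (4, 3))
    z_local = aggregate_metapath_neighbors(item_emb, neighbors)
    z_global = torch.randn(4, 16)   # e.g. output of a sequence encoder
    loss = info_nce(z_global, z_local)

In the joint update the abstract describes, a contrastive loss of this kind would be added to the policy network's reinforcement-learning objective (e.g. total_loss = rl_loss + lambda * loss) and both modules optimized together; the weighting term is, again, an assumed detail rather than one stated in the abstract.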

Key words: multi-interest, reinforcement learning, contrastive learning, heterogeneous information network