Journal of South China University of Technology (Natural Science Edition), 2025, Vol. 53, Issue (9): 11-21. doi: 10.12141/j.issn.1000-565X.240088
• Computer Science & Technology •
LIU Huiting1,2, LIU Shaoxiong1, WANG Jiale3, ZHAO Peng1
Received: 2024-02-27
Online: 2025-09-25
Published: 2025-04-27
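About author: LIU Huiting (born 1978), female, Ph.D., associate professor; her research focuses on natural language processing and personalized recommendation. E-mail: htliu@ahu.edu.cn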
LIU Huiting, LIU Shaoxiong, WANG Jiale, ZHAO Peng. Deep Reinforcement Learning Recommendation Model Based on Multi-Interest Contrast[J]. Journal of South China University of Technology (Natural Science Edition), 2025, 53(9): 11-21.
Table 2
Cumulative precision of models on three datasets
| Model | MovieLens 1M, T = 5 | MovieLens 1M, T = 10 | MovieLens 1M, T = 20 | EachMovie, T = 5 | EachMovie, T = 10 | EachMovie, T = 20 | Amazon, T = 5 | Amazon, T = 10 | Amazon, T = 20 |
|---|---|---|---|---|---|---|---|---|---|
| Random | 0.2150 | 0.4400 | 0.8983 | 0.0600 | 0.1180 | 0.2500 | 0.0400 | 0.0800 | 0.1400 |
| Pop | 0.6205 | 1.0240 | 1.6800 | 1.0605 | 2.2729 | 4.4164 | 0.1940 | 0.3939 | 0.7550 |
| BPR | 1.0442 | 2.2024 | 4.4888 | 0.9851 | 2.1188 | 4.0099 | 0.1773 | 0.3455 | 0.6576 |
| ϵ-Greedy | 2.0099 | 3.8182 | 7.0347 | 1.6044 | 2.7555 | 4.4738 | 0.2642 | 0.4765 | 0.8282 |
| DQNR | 2.0050 | 3.8562 | 7.0595 | 1.5618 | 2.8400 | 4.3266 | 0.2609 | 0.4522 | 0.9109 |
| NICF | 2.0545 | 3.7603 | 6.9636 | 1.5700 | 2.7349 | 4.5900 | 0.2340 | 0.2990 | 0.3290 |
| SGL | 1.6612 | 3.1190 | 4.9686 | 1.3362 | 2.1764 | 2.6685 | 0.1799 | 0.2366 | 0.3371 |
| GreedyRM | 2.0413 | 3.8760 | 7.1322 | 1.6120 | 2.8268 | 4.7181 | 0.3193 | 0.4943 | 0.8525 |
| MOCIR | 2.0911 | 3.9007 | 7.2036 | 1.6295 | 2.8670 | 4.8019 | 0.3323 | 0.5170 | 0.9449 |
Table 3
Cumulative recall of models on three datasets
| Model | MovieLens 1M, T = 5 | MovieLens 1M, T = 10 | MovieLens 1M, T = 20 | EachMovie, T = 5 | EachMovie, T = 10 | EachMovie, T = 20 | Amazon, T = 5 | Amazon, T = 10 | Amazon, T = 20 |
|---|---|---|---|---|---|---|---|---|---|
| Random | 0.0011 | 0.0027 | 0.0051 | 0.0026 | 0.0055 | 0.0130 | 0.0008 | 0.0028 | 0.0056 |
| Pop | 0.0265 | 0.0461 | 0.0822 | 0.1094 | 0.2078 | 0.2265 | 0.0062 | 0.0180 | 0.0357 |
| BPR | 0.0096 | 0.0185 | 0.0366 | 0.0666 | 0.1014 | 0.1537 | 0.0068 | 0.0118 | 0.0203 |
| ϵ-Greedy | 0.0332 | 0.0613 | 0.1028 | 0.1292 | 0.1945 | 0.3355 | 0.0112 | 0.0211 | 0.0379 |
| DQNR | 0.0326 | 0.0592 | 0.1066 | 0.1295 | 0.2093 | 0.3139 | 0.0110 | 0.0216 | 0.0364 |
| NICF | 0.0209 | 0.0383 | 0.0678 | 0.0801 | 0.1260 | 0.1820 | 0.0090 | 0.0162 | 0.0246 |
| SGL | 0.0224 | 0.0413 | 0.0645 | 0.1204 | 0.1643 | 0.1957 | 0.0077 | 0.0098 | 0.0137 |
| GreedyRM | 0.0343 | 0.0615 | 0.1076 | 0.1389 | 0.2172 | 0.3481 | 0.0130 | 0.0212 | 0.0363 |
| MOCIR | 0.0352 | 0.0618 | 0.1100 | 0.1464 | 0.2276 | 0.3544 | 0.0140 | 0.0217 | 0.0393 |