Journal of South China University of Technology (Natural Science Edition) ›› 2024, Vol. 52 ›› Issue (10): 31-40. doi: 10.12141/j.issn.1000-565X.230503

• Computer Science & Technology •

Multi-Task Assisted Driving Policy Learning Method for Autonomous Driving

LUO Yutao, XUE Zhicheng

  1. School of Mechanical and Automotive Engineering / Guangdong Provincial Key Laboratory of Automotive Engineering, South China University of Technology, Guangzhou 510640, Guangdong, China
  • Received: 2023-08-01 Online: 2024-10-25 Published: 2024-01-31
  • About author: LUO Yutao (born 1972), male, Ph.D., professor; his research focuses on autonomous vehicles and new energy vehicles. E-mail: ctytluo@scut.edu.cn
  • Supported by:
    the Special Fund for High-Quality Development of the Manufacturing Industry of the Ministry of Industry and Information Technology (R-ZH-023-QT-001-20221009-001)

Abstract:

With the development of autonomous driving technology, deep reinforcement learning has become an important means of learning driving policies efficiently. However, autonomous driving must cope with complex and changeable traffic scenes, and existing deep reinforcement learning methods adapt poorly beyond a single scene type and converge slowly. To address these issues and improve the scene adaptability and policy learning efficiency of autonomous vehicles, this paper proposes a multi-task assisted driving policy learning method. First, the method constructs an encoder-multi-task decoder module based on a deep residual network, which compresses high-dimensional driving scenes into low-dimensional representations, and uses auxiliary learning of three tasks (semantic segmentation, depth estimation and speed prediction) to enrich the scene information carried by these representations. Then, the low-dimensional representation is used as the state input to a reinforcement-learning-based decision network, and a multi-constraint reward function is designed to guide the learning of driving policies. Finally, simulation experiments are conducted in CARLA. The experimental results show that, compared with classic methods such as DDPG and TD3, the proposed method improves the training process through multi-task assistance and learns better driving policies. It achieves higher task success rates and driving scores in several typical urban driving scenarios such as roundabouts and intersections, demonstrating excellent decision-making capability and scene adaptability.
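
As a rough illustration of the multi-task assistance described in the abstract, the three auxiliary objectives could be combined into a single training loss for the encoder. The sketch below is an assumption, not the paper's formulation: the abstract does not state the individual loss functions or their weights, so the choice of cross-entropy, mean absolute error and squared error, the function names, and the `weights` parameter are all hypothetical.

```python
import math

def segmentation_loss(pred_probs, labels):
    """Mean cross-entropy over pixels (assumed loss for semantic segmentation).

    pred_probs: one list of class probabilities per pixel.
    labels: the ground-truth class index of each pixel.
    """
    total = 0.0
    for probs, label in zip(pred_probs, labels):
        total += -math.log(max(probs[label], 1e-12))  # clamp to avoid log(0)
    return total / len(labels)

def depth_loss(pred, target):
    """Mean absolute error over pixels (assumed loss for depth estimation)."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(target)

def speed_loss(pred, target):
    """Squared error on a scalar speed (assumed loss for speed prediction)."""
    return (pred - target) ** 2

def multi_task_loss(seg, depth, speed, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three auxiliary losses; the weights are hypothetical."""
    w_seg, w_depth, w_speed = weights
    return w_seg * seg + w_depth * depth + w_speed * speed

# Toy example: one pixel with two classes, two depth pixels, one speed value.
seg = segmentation_loss([[0.5, 0.5]], [0])
depth = depth_loss([1.0, 3.0], [2.0, 3.0])
speed = speed_loss(5.0, 4.0)
total = multi_task_loss(seg, depth, speed, weights=(1.0, 2.0, 0.5))
```

In a full implementation, a loss of this kind would be backpropagated through the decoders into the shared encoder, so that the low-dimensional representation retains semantic, geometric and motion information before it is passed to the decision network.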

Key words: end-to-end autonomous driving, reinforcement learning, multi-task learning, driving policy, decision-making
