Three-Dimensional Path Planning for UAV in Complex Wind Fields Based on an Improved Q-Learning Algorithm

WANG Haibo (王海波), WANG Zihao (王子豪), CHENG Huazhen (程华镇), et al.

doi:10.12141/j.issn.1000-565X.260008

Journal of South China University of Technology (Natural Science Edition)


Special Collection: Academic Achievements of Young Editorial Board Members


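1. School of Civil and Transportation Engineering, Guangdong University of Technology, Guangzhou 510006, Guangdong, China;

2. Zhenye Uctrl Technology Corp. Ltd., Zhongshan 528400, Guangdong, China;

3. School of Automation, Guangdong University of Technology, Guangzhou 510006, Guangdong, China

Published online: 2026-02-10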


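Abstract

To address UAV path planning in complex three-dimensional wind fields, this paper proposes an improved Q-Learning path planning algorithm. A hierarchical composite mechanism is built for an integrated reward function; it accounts for wind speed, wind direction, path smoothness, altitude, and boundary limits, guiding the agent toward a globally optimal trade-off between flight distance and environmental cost. The algorithm was validated by simulation in six wind-field scenarios with different topological characteristics. The results show that, compared with conventional algorithms, the planned paths are slightly longer geometrically but incur significantly lower cumulative wind-speed consumption, effectively avoiding strong headwinds and high-wind-speed regions. The planned paths are also markedly smoother, with fewer trajectory inflection points than conventional algorithms, and therefore conform better to UAV kinematic constraints. Convergence analysis shows that the algorithm converges stably after about 2000 iterations. The study confirms that the method can deliver safe, efficient, and adaptive path planning in non-stationary wind fields.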

Keywords: 3D path planning; Q-Learning algorithm; complex wind fields; hierarchical composite mechanism; multi-objective optimization

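Cite this article

WANG Haibo, WANG Zihao, CHENG Huazhen, et al. Three-Dimensional Path Planning for UAV in Complex Wind Fields Based on an Improved Q-Learning Algorithm[J]. Journal of South China University of Technology (Natural Science Edition), 0: 1. DOI: 10.12141/j.issn.1000-565X.260008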

Abstract

This study proposes an improved Q-Learning algorithm for UAV path planning in complex 3D wind fields. A hierarchical composite reward function integrates wind speed and direction, path smoothness, altitude, and boundary constraints, so that the agent balances flight distance against environmental cost. In simulations across six wind field scenarios with different topological characteristics, the proposed method significantly reduces cumulative wind speed consumption and headwind exposure compared with conventional algorithms. Although the geometric path length increases slightly, the trajectories are considerably smoother and contain fewer inflection points, which aligns better with UAV kinematic constraints. The algorithm converges stably after approximately 2,000 iterations. The results confirm that the method supports safe, efficient, and adaptive UAV path planning in non-stationary wind fields.

Keywords: 3D path planning; Q-Learning algorithm; complex wind fields; hierarchical composite mechanism; multi-objective optimization
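
The abstract names the ingredients of the reward (wind speed, wind direction, path smoothness, altitude, boundary limits) but not their exact form. The following is a minimal illustrative sketch only, not the authors' implementation. It assumes a 3D grid world with six axis-aligned moves, reads "hierarchical" as checking hard constraints (boundary, goal) before summing weighted soft costs, and uses a standard tabular Q-Learning update. All names (`composite_reward`, `q_update`), weights, and reward magnitudes are hypothetical.

```python
import numpy as np

# Six axis-aligned unit moves on a 3D grid (an assumed action set).
ACTIONS = np.array([[1, 0, 0], [-1, 0, 0], [0, 1, 0],
                    [0, -1, 0], [0, 0, 1], [0, 0, -1]])

# Illustrative weights and terminal rewards; none come from the paper.
W_STEP, W_WIND, W_HEAD, W_TURN, W_ALT = 1.0, 0.5, 0.5, 0.3, 0.1
R_GOAL, R_OUT = 100.0, -50.0


def composite_reward(pos, action, prev_action, wind, goal, cruise_z):
    """Return (reward, next_pos, done) for one grid move.

    wind has shape (X, Y, Z, 3): one wind vector per grid cell.
    """
    nxt = pos + ACTIONS[action]
    shape = np.array(wind.shape[:3])

    # Tier 1 (assumed): hard constraints are evaluated first.
    if np.any(nxt < 0) or np.any(nxt >= shape):
        return R_OUT, pos, False          # leaving the grid: penalize, stay put
    if np.array_equal(nxt, goal):
        return R_GOAL, nxt, True

    # Tier 2 (assumed): weighted sum of soft costs.
    w = wind[tuple(nxt)]
    speed = float(np.linalg.norm(w))                       # wind speed
    headwind = max(0.0, -float(np.dot(w, ACTIONS[action])))  # opposing component
    turn = 0.0 if prev_action is None or prev_action == action else 1.0
    alt_dev = abs(int(nxt[2]) - cruise_z)                  # altitude deviation
    cost = (W_STEP + W_WIND * speed + W_HEAD * headwind
            + W_TURN * turn + W_ALT * alt_dev)
    return -cost, nxt, False


def q_update(Q, s, a, r, s_next, done, alpha=0.1, gamma=0.95):
    """Standard tabular Q-Learning update; Q has shape (X, Y, Z, n_actions)."""
    target = r if done else r + gamma * Q[s_next].max()
    Q[s + (a,)] += alpha * (target - Q[s + (a,)])
```

In this sketch the step cost discourages long paths while the wind and headwind terms discourage windy cells, so the relative weights set how much extra distance the agent accepts to avoid wind, which is the trade-off the abstract describes.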
