Research on Improved Small-Object Detection Algorithm for UAVs from Low-Altitude Perspectives
1. College of Information Engineering, Ningde Normal University, Ningde 352100, China;
2. School of Civil Engineering and Transportation, Northeast Forestry University, Harbin 150040, China;
3. School of Traffic and Transportation, Lanzhou Jiaotong University, Lanzhou 730070, China
Online published: 2025-11-04
Cite this article: ZHANG Jie, DONG Chuntong, PEI Yulong, et al. Research on Improved Small-Object Detection Algorithm for UAVs from Low-Altitude Perspectives[J]. Journal of South China University of Technology (Natural Science Edition), 0: 1. DOI: 10.12141/j.issn.1000-565X.250327
To meet the demand for high-precision, lightweight object detection by unmanned aerial vehicles (UAVs) in low-altitude traffic scenarios, this study proposes an improved detection model based on the Real-Time Detection Transformer (RT-DETR), namely the Cross-scale Alignment and Positional Encoding-enhanced RT-DETR (CAPE-RT-DETR). First, a feature enhancement module that combines dynamic convolution kernel generation with gated feature selection is designed to strengthen the extraction and filtering of key features. Second, an enhanced position-aware interaction module is built by combining learnable positional encoding with multi-head attention, which improves the model’s perception of spatial structure and its localization accuracy. Finally, a pyramid scene parsing module is introduced to integrate sparse global contextual information, and a dual-path convolution and grid sampling mechanism explicitly compensates for cross-scale feature alignment deviations, avoiding the misalignment caused by upsampling and further improving detection performance. Experiments on the ALU and VisDrone2019 datasets, including comparisons against 18 state-of-the-art detection methods, show that CAPE-RT-DETR outperforms the baseline in parameter count, accuracy, and model size. Ablation studies further validate the effectiveness and complementarity of the three proposed modules. This work provides a high-precision, lightweight methodological foundation and theoretical support for real-time UAV object detection in complex scenarios.
Key words: low-altitude traffic; object detection; unmanned aerial vehicle; RT-DETR
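The cross-scale alignment idea mentioned in the abstract (grid sampling that compensates for misalignment introduced by upsampling) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names `bilinear_sample` and `aligned_upsample` are hypothetical, the feature map is a single-channel 2D list, and the per-pixel offsets are supplied by hand, whereas in a model of the kind described they would presumably be predicted by convolution branches.

```python
def bilinear_sample(feat, y, x):
    """Bilinearly sample a 2D list `feat` at fractional (y, x), clamping to the border."""
    h, w = len(feat), len(feat[0])
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(y), int(x)
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = feat[y0][x0] * (1 - dx) + feat[y0][x1] * dx
    bottom = feat[y1][x0] * (1 - dx) + feat[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def aligned_upsample(low, out_h, out_w, offsets=None):
    """Upsample `low` to (out_h, out_w) by sampling on a regular grid.

    `offsets[i][j] = (dy, dx)` shifts the sampling point of output pixel (i, j),
    in low-resolution pixel units. Assumption: offsets are given by the caller;
    a learned model would predict them from the features being fused.
    """
    h, w = len(low), len(low[0])
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Regular grid: output corners map onto input corners.
            y = i * (h - 1) / (out_h - 1) if out_h > 1 else 0.0
            x = j * (w - 1) / (out_w - 1) if out_w > 1 else 0.0
            if offsets is not None:
                dy, dx = offsets[i][j]
                y, x = y + dy, x + dx  # explicit alignment compensation
            row.append(bilinear_sample(low, y, x))
        out.append(row)
    return out


low = [[0.0, 1.0], [2.0, 3.0]]
plain = aligned_upsample(low, 3, 3)                       # ordinary bilinear upsampling
shift = [[(0.0, 0.5)] * 3 for _ in range(3)]              # hand-made half-pixel shift in x
shifted = aligned_upsample(low, 3, 3, offsets=shift)      # same grid, offset-corrected
```

With zero offsets the function reduces to ordinary bilinear upsampling; non-zero offsets move each sampling location, which is the degree of freedom an alignment mechanism of this kind would use to correct cross-scale misalignment.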