Research on Improved Small-Object Detection Algorithm for UAVs from Low-Altitude Perspectives

Zhang Jie, Dong Chuntong, Pei Yulong, et al.

doi:10.12141/j.issn.1000-565X.250327

Journal of South China University of Technology (Natural Science Edition)

Low-Altitude Traffic Systems

Research on Improved Small-Object Detection Algorithm for UAVs from Low-Altitude Perspectives

1. College of Information Engineering, Ningde Normal University, Ningde 352100, China;

2. School of Civil Engineering and Transportation, Northeast Forestry University, Harbin 150040, China;

3. School of Traffic and Transportation, Lanzhou Jiaotong University, Lanzhou 730070, Gansu, China

Online published: 2025-11-04

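Cite this article

Zhang Jie, Dong Chuntong, Pei Yulong, et al. Research on Improved Small-Object Detection Algorithm for UAVs from Low-Altitude Perspectives[J]. Journal of South China University of Technology (Natural Science Edition), 0: 1. DOI: 10.12141/j.issn.1000-565X.250327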

Abstract

To meet the demand for high-precision, lightweight object detection by unmanned aerial vehicles (UAVs) in low-altitude traffic scenarios, this study proposes an improved detection model based on the Real-Time Detection Transformer (RT-DETR): the Cross-scale Alignment and Positional Encoding-enhanced RT-DETR (CAPE-RT-DETR). First, a feature enhancement module that combines dynamic convolution kernel generation with gated feature selection is designed to strengthen the extraction and filtering of key features. Second, an enhanced position-aware interaction module is built by combining learnable positional encoding with multi-head attention, which improves the model's perception of spatial structure and its localization accuracy. Finally, a pyramid scene parsing module is introduced to integrate sparse global context, and a dual-path convolution and grid sampling mechanism explicitly compensates for cross-scale feature alignment deviations, avoiding the misalignment caused by upsampling and further improving detection performance. Experiments on the ALU and VisDrone2019 datasets, including comparisons against 18 state-of-the-art detection methods, show that CAPE-RT-DETR outperforms the baseline in parameter count, accuracy, and model size. Ablation studies further validate the effectiveness and complementarity of the three proposed modules. This work provides a high-precision, lightweight methodological foundation and theoretical support for real-time UAV object detection in complex scenarios.

Keywords: low-altitude traffic; object detection; unmanned aerial vehicle; RT-DETR
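
The abstract does not give the internals of the cross-scale alignment step, so the following is only a minimal pure-Python sketch of the general idea behind offset-guided grid sampling, not the authors' implementation. The names `bilinear_sample` and `aligned_upsample` and the `offsets` argument are hypothetical. In the described model the offsets would presumably be predicted by the dual-path convolution branch; here they are supplied by hand to show how shifting the sampling grid changes the upsampled result.

```python
import math


def bilinear_sample(feat, y, x):
    """Bilinearly sample a 2D feature map at fractional (y, x), clamping to the border."""
    h, w = len(feat), len(feat[0])
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    wy, wx = y - y0, x - x0
    top = feat[y0][x0] * (1 - wx) + feat[y0][x1] * wx
    bottom = feat[y1][x0] * (1 - wx) + feat[y1][x1] * wx
    return top * (1 - wy) + bottom * wy


def aligned_upsample(low, out_h, out_w, offsets=None):
    """Upsample `low` to (out_h, out_w) by grid sampling.

    offsets[i][j] = (dy, dx), in low-resolution pixel units, shifts the
    sampling location of output pixel (i, j). With offsets=None this is
    ordinary bilinear upsampling (corner-aligned grid).
    """
    h, w = len(low), len(low[0])
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Base grid: map the output index onto low-resolution coordinates.
            y = i * (h - 1) / (out_h - 1) if out_h > 1 else 0.0
            x = j * (w - 1) / (out_w - 1) if out_w > 1 else 0.0
            if offsets is not None:
                # Assumed input: a per-pixel correction to the sampling grid.
                dy, dx = offsets[i][j]
                y, x = y + dy, x + dx
            row.append(bilinear_sample(low, y, x))
        out.append(row)
    return out


low = [[0.0, 1.0],
       [2.0, 3.0]]

# Plain bilinear upsampling from 2x2 to 3x3.
plain = aligned_upsample(low, 3, 3)

# Same upsampling, but every output pixel samples half a pixel to the right.
shift = [[(0.0, 0.5)] * 3 for _ in range(3)]
shifted = aligned_upsample(low, 3, 3, shift)
```

With a zero or missing offset field the function reduces to fixed-grid interpolation, which is the case where upsampled deep features can land slightly off the shallow features they are fused with. A learned offset field lets each output location pull its value from a corrected position instead.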
