Single-Stage Object Detection Algorithm with Enhanced Pillar Feature Encoding

LUO Yutao; MAO Haojie

doi:10.12141/j.issn.1000-565X.240100

Journal of South China University of Technology (Natural Science Edition), 2025, Vol. 53, Issue 3: 1-11

Computer Science and Technology

School of Mechanical and Automotive Engineering / Guangdong Provincial Key Laboratory of Automotive Engineering, South China University of Technology, Guangzhou 510640, Guangdong, China

LUO Yutao (born 1972), male, professor and doctoral supervisor, mainly engaged in research on autonomous vehicles and new energy vehicles. E-mail: ctytluo@scut.edu.cn

Received date: 2024-03-05

Online published: 2024-04-26

Supported by

the Special Fund for High-Quality Development of Manufacturing Industry, the Ministry of Industry and Information Technology of China (R-ZH-023-QT-001-20221009-001); the Guangzhou Science and Technology Plan Project (2023B01J0016)

Cite this article

LUO Yutao, MAO Haojie. Single-Stage Object Detection Algorithm with Enhanced Pillar Feature Encoding[J]. Journal of South China University of Technology (Natural Science Edition), 2025, 53(3): 1-11. DOI: 10.12141/j.issn.1000-565X.240100

Abstract

Pillar-based single-stage 3-dimensional object detection algorithms for point clouds have attracted wide attention and application in industry because of their high runtime efficiency. However, pillar quantization of the point cloud discards fine-grained information in its 3-dimensional features, which weakens the ability of these algorithms to detect small objects in sparse point clouds. Although some studies have proposed remedies, they usually come at the cost of longer detection time or lower detection accuracy for large objects. This paper therefore proposes a pillar-based point cloud object detection algorithm built on improved pillar feature encoding. First, a pillar feature encoding network is constructed that combines the local and global features of the points within each pillar, strengthening the representation capability of the pillar-quantized features. Then, a backbone network that combines 2-dimensional sparse convolution blocks with a feature fusion network is designed to fuse multi-scale high-level abstract semantic features with low-level fine-grained spatial features, so that excessive focus on small-scale features does not degrade detection performance for large objects. Finally, the model is trained and tested on the KITTI autonomous driving dataset, the experimental results are visualized, and ablation studies are conducted. The results show that, at the moderate difficulty level of the KITTI dataset, the proposed algorithm achieves a mean average precision of 63.54% and a mean average orientation similarity of 70.72% across multiple categories, with an average detection frame rate of 31.5 f/s. Compared with PointPillars, TANet, and PiFEnet, its mean average precision is higher by 2.44, 2.05, and 2.38 percentage points, respectively, and its mean average orientation similarity is higher by 4.69, 0.68, and 7.83 percentage points, respectively, indicating potential for engineering application among comparable algorithms.

Keywords: intelligent vehicle; 3-dimensional object detection; point cloud; pillar feature encoding
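
As an illustration of the pillar quantization and the local-plus-global encoding idea summarized in the abstract, the following is a minimal NumPy sketch, not the authors' network. The function names `pillarize` and `encode_pillars`, the grid parameters, and the hand-crafted statistics (per-pillar spread as the local cue, centroid and point count as the global cue) are all assumptions made for illustration. The encoder in the paper is a learned network whose layers are not described in the abstract.

```python
import numpy as np


def pillarize(points, x_min, y_min, pillar_size, n_x, n_y):
    """Bin points of shape (N, 4) = (x, y, z, intensity) into a bird's-eye-view grid.

    Returns the points that fall inside the grid and the flat pillar index of each.
    All grid parameters are illustrative assumptions, not values from the paper.
    """
    ix = np.floor((points[:, 0] - x_min) / pillar_size).astype(np.int64)
    iy = np.floor((points[:, 1] - y_min) / pillar_size).astype(np.int64)
    inside = (ix >= 0) & (ix < n_x) & (iy >= 0) & (iy < n_y)
    # z is ignored here: every point in a vertical column shares one pillar.
    return points[inside], (iy * n_x + ix)[inside]


def encode_pillars(points, pillar_ids):
    """Build one 7-value feature vector per non-empty pillar.

    Hand-crafted stand-in for a learned encoder (an assumption of this sketch):
      local cue  = largest absolute x/y/z offset of any point from the pillar centroid
      global cue = pillar centroid (x, y, z) and point count
    """
    unique_ids, inverse = np.unique(pillar_ids, return_inverse=True)
    inverse = inverse.reshape(-1)
    n_pillars = unique_ids.shape[0]
    counts = np.bincount(inverse, minlength=n_pillars)

    # Per-pillar centroid via scatter-add.
    sums = np.zeros((n_pillars, 3))
    np.add.at(sums, inverse, points[:, :3])
    centroids = sums / counts[:, None]

    # Per-point offsets from the centroid, reduced per pillar with a scatter-max.
    offsets = np.abs(points[:, :3] - centroids[inverse])
    spread = np.zeros((n_pillars, 3))
    np.maximum.at(spread, inverse, offsets)

    features = np.concatenate([spread, centroids, counts[:, None]], axis=1)
    return unique_ids, features


if __name__ == "__main__":
    # Four synthetic points; the last one lies outside the 4 x 4 grid.
    cloud = np.array([
        [0.1, 0.1, 0.0, 0.5],
        [0.3, 0.1, 1.0, 0.5],
        [1.2, 0.2, 0.0, 0.1],
        [9.0, 9.0, 0.0, 0.1],
    ])
    kept, ids = pillarize(cloud, 0.0, 0.0, 0.5, 4, 4)
    pillar_ids, feats = encode_pillars(kept, ids)
    print(kept.shape[0], "points ->", feats.shape[0], "pillar feature vectors")
```

Collapsing each pillar to a single vector is what lets a 2-dimensional backbone process the scene quickly, and it is also where point-level detail is lost. The spread term shows one simple way to keep some of that detail alongside a pillar-level summary.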

