Visual SLAM Algorithm Based on Memory Parking Scene

HU Xizhi; CUI Bofei; WANG Qin; LIU Hong

doi:10.12141/j.issn.1000-565X.230262

Journal of South China University of Technology (Natural Science), 2024, Vol. 52, Issue 6: 1-11


Green & Intelligent Transportation


School of Mechanical and Automotive Engineering, South China University of Technology, Guangzhou 510640, Guangdong, China

Received date: 2023-04-22

Online published: 2023-10-24

Supported by the National Natural Science Foundation of China (51975219)


Abstract

As autonomous driving technology develops, visual simultaneous localization and mapping (SLAM) is attracting growing attention. Memory parking requires a prior map of the parking lot, so that when the vehicle enters the same lot again, visual SLAM can reconstruct the scene and localize the vehicle within it. To improve the robustness, accuracy and efficiency of SLAM mapping, a lightweight deep learning algorithm is first used to overcome the poor robustness of traditional feature extraction algorithms across different scenarios, and depthwise separable convolution replaces the standard convolution structure, which greatly reduces feature extraction time. Next, the Patch-NetVLAD algorithm is improved with a ResNet backbone, and both the improved residual network and the original VGG network are retrained on the MSLS dataset. Image retrieval then provides coarse localization by selecting candidate image frames, and fine localization solves the camera pose to complete global initialization and relocalization. On this basis, the improved bag-of-words algorithm is retrained on images from different parking lot scenes, and all the algorithms are integrated into the OpenVSLAM framework to perform mapping and localization in real scenes. Experimental results show that the proposed visual SLAM system can map a variety of scenes, including aboveground, underground and semi-enclosed parking lots, with an average longitudinal positioning error of 8.42 cm and an average lateral positioning error of 8.30 cm, both of which meet engineering requirements.
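The efficiency gain attributed to depthwise separable convolution can be illustrated with a simple cost count. The sketch below is not taken from the paper: the function names, layer sizes and feature-map resolution are hypothetical values chosen only for illustration, and biases are ignored. It compares the number of weights and multiply-accumulate operations of a standard k x k convolution with those of a depthwise k x k convolution followed by a 1 x 1 pointwise convolution.

```python
def standard_conv_cost(c_in, c_out, k, h, w):
    """Weights and multiply-accumulates of a standard k x k convolution
    that produces an h x w output map (biases ignored)."""
    params = k * k * c_in * c_out
    macs = params * h * w
    return params, macs


def depthwise_separable_cost(c_in, c_out, k, h, w):
    """Weights and multiply-accumulates of a depthwise k x k convolution
    followed by a 1 x 1 pointwise convolution (biases ignored)."""
    depthwise = k * k * c_in      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 convolution that mixes channels
    params = depthwise + pointwise
    macs = params * h * w
    return params, macs


def cost_ratio(c_in, c_out, k, h, w):
    """Separable cost divided by standard cost; equals 1/c_out + 1/k**2."""
    _, standard = standard_conv_cost(c_in, c_out, k, h, w)
    _, separable = depthwise_separable_cost(c_in, c_out, k, h, w)
    return separable / standard


if __name__ == "__main__":
    # Hypothetical layer: 64 -> 128 channels, 3 x 3 kernel, 60 x 80 output map.
    layer = dict(c_in=64, c_out=128, k=3, h=60, w=80)
    print("standard :", standard_conv_cost(**layer))
    print("separable:", depthwise_separable_cost(**layer))
    print("ratio    :", round(cost_ratio(**layer), 4))
```

For a 3 x 3 kernel the ratio approaches 1/9 as the number of output channels grows, so the separable layer needs roughly an order of magnitude fewer operations. Actual speedups depend on the hardware and on the network the authors used.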

Keywords: simultaneous localization and mapping; memory parking; deep learning; feature extraction; image retrieval

Cite this article

HU Xizhi, CUI Bofei, WANG Qin, LIU Hong. Visual SLAM Algorithm Based on Memory Parking Scene[J]. Journal of South China University of Technology (Natural Science), 2024, 52(6): 1-11. DOI: 10.12141/j.issn.1000-565X.230262

