Journal of South China University of Technology (Natural Science Edition) ›› 2024, Vol. 52 ›› Issue (6): 1-11. doi: 10.12141/j.issn.1000-565X.230262

• Green & Intelligent Transportation •

Visual SLAM Algorithm Based on Memory Parking Scene

HU Xizhi, CUI Bofei, WANG Qin, LIU Hong

  1. School of Mechanical and Automotive Engineering, South China University of Technology, Guangzhou 510640, Guangdong, China
  • Received: 2023-04-22 Online: 2024-06-25 Published: 2023-10-27
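  • Contact: CUI Bofei (b. 1998), male, master's student, mainly engaged in research on intelligent driving and new energy vehicles. E-mail: klysxc616@163.com
  • About author: HU Xizhi (b. 1963), male, Ph.D., associate professor, mainly engaged in research on vehicle dynamics and intelligent driving. E-mail: huxizhi@scut.edu.cn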
  • Supported by:
    the National Natural Science Foundation of China (51975219)

Abstract:

As autonomous driving technology develops, visual simultaneous localization and mapping (SLAM) has attracted growing attention. Memory parking requires a prior map of the parking lot, so that when the vehicle enters the same parking lot again, visual SLAM can be used to map the scene and localize the vehicle within it. To improve the robustness, accuracy and efficiency of SLAM mapping, a lightweight deep learning algorithm is first used to overcome the poor robustness of traditional feature extraction algorithms across different scenes, with depthwise separable convolution replacing the standard convolution structure, which greatly improves the time efficiency of feature extraction. Next, the Patch-NetVLAD algorithm is improved using a ResNet network, and both the improved residual network and the original VGG network are retrained on the MSLS dataset. Image retrieval then provides coarse localization by selecting candidate image frames, and fine localization solves the camera pose to complete global initialization and relocalization. On this basis, an improved bag-of-words algorithm is retrained on images from different parking lot scenes, and all the algorithms are ported into the OpenVSLAM framework to perform mapping and localization in real scenes. The experimental results show that the proposed visual SLAM system can map a variety of scenes, including above-ground, underground and semi-enclosed parking lots, with an average longitudinal positioning error of 8.42 cm and an average lateral positioning error of 8.30 cm, both of which meet engineering requirements.
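
To illustrate why swapping standard convolutions for depthwise separable ones can cut feature-extraction cost, the following minimal sketch counts the weights of a single layer under each design. The kernel size and channel widths are assumptions chosen for illustration and are not taken from the network described in the paper; bias terms are ignored.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


def reduction_ratio(k, c_in, c_out):
    """Separable-to-standard weight ratio; algebraically equal to 1/c_out + 1/k**2."""
    return depthwise_separable_params(k, c_in, c_out) / standard_conv_params(k, c_in, c_out)


if __name__ == "__main__":
    # Hypothetical layer: 3 x 3 kernel, 64 input channels, 128 output channels.
    k, c_in, c_out = 3, 64, 128
    print(standard_conv_params(k, c_in, c_out))        # 73728
    print(depthwise_separable_params(k, c_in, c_out))  # 8768
    print(round(reduction_ratio(k, c_in, c_out), 4))   # 0.1189
```

For this hypothetical layer the separable form keeps roughly 12% of the weights. The multiply-accumulate count scales by the same ratio for a fixed output size, which is consistent with the efficiency gain the abstract reports, although the actual saving depends on the real layer dimensions.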

Key words: simultaneous localization and mapping, memory parking, deep learning, feature extraction, image retrieval

CLC Number: