Journal of South China University of Technology (Natural Science Edition) ›› 2024, Vol. 52 ›› Issue (4): 42-50. doi: 10.12141/j.issn.1000-565X.230327

• Mechanical Engineering •

Multi-Object Recognition and 6-DoF Pose Estimation Based on Synthetic Datasets

HU Guanghua, OU Meitong, LI Zhendong

  1. School of Mechanical and Automotive Engineering, South China University of Technology, Guangzhou 510640, Guangdong, China
  • Received: 2023-05-18 Online: 2024-04-25 Published: 2023-10-06
  • About author: HU Guanghua (born 1980), male, Ph.D., associate professor, mainly engaged in research on machine vision and image processing. E-mail: ghhu@scut.edu.cn
  • Supported by:
    the Natural Science Foundation of Guangdong Province(2022A1515010806)

Abstract:

Multi-object recognition and 6-DoF (degree-of-freedom) pose estimation are key to automatic robotic sorting of randomly stacked materials. In recent years, methods based on deep neural networks have received much attention in both fields. Such methods rely on a large number of training samples; however, collecting and labeling samples is time-consuming and laborious, which limits their application. In addition, when imaging conditions are poor and the targets occlude one another, existing pose estimation methods cannot guarantee reliable results, which leads to grasping failures. To this end, this paper presents a method for target recognition, segmentation and pose estimation based on synthetic data samples. First, multi-view RGB-D synthetic images of virtual scenes are generated with 3D graphics programming tools from the 3D geometric models of the target objects; style transfer is then applied to the generated RGB images, and noise enhancement to the depth images, to improve their realism so that they are suitable for detection in real scenes. Next, the YOLOv7-mask instance segmentation model is trained on the synthetic dataset and tested on real data, and the results demonstrate the effectiveness of the proposed method. Then, the ES6D model is used to estimate target poses from the segmentation results, and an online pose evaluation method is proposed to automatically filter out severely distorted estimates. Finally, a pose estimation correction strategy based on active vision is proposed to guide the robot arm to a new viewpoint for re-detection, which can effectively solve the problem of pose estimation deviation caused by occlusion. The above methods have been verified on a self-built 6-DoF industrial robot vision sorting system. The experimental results show that the proposed algorithm meets the requirements for recognition and 6-DoF pose estimation of common workpieces in complex environments.
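
As an illustration of the depth-image noise enhancement step mentioned in the abstract, the following is a minimal Python sketch. The abstract does not specify the noise model, so everything here is an assumption made for illustration: the function name `add_depth_noise`, the zero-mean Gaussian perturbation, the random pixel dropout, and the default parameter values are hypothetical and are not taken from the paper.

```python
import random


def add_depth_noise(depth, sigma=0.002, dropout=0.05, seed=None):
    """Return a noisy copy of a depth map given as a list of rows (metres).

    Assumed noise model (not from the paper): zero-mean Gaussian noise with
    standard deviation `sigma`, plus random pixel dropout with probability
    `dropout`. A value of 0.0 marks a missing measurement.
    """
    rng = random.Random(seed)
    noisy = []
    for row in depth:
        new_row = []
        for d in row:
            if d <= 0.0 or rng.random() < dropout:
                # Keep already-invalid pixels invalid, or drop this pixel.
                new_row.append(0.0)
            else:
                # Perturb the valid depth; clamp so it never goes negative.
                new_row.append(max(0.0, d + rng.gauss(0.0, sigma)))
        noisy.append(new_row)
    return noisy


# Usage: a synthetic 3x4 depth map of a flat surface 1 m from the camera.
clean = [[1.0] * 4 for _ in range(3)]
augmented = add_depth_noise(clean, sigma=0.002, dropout=0.1, seed=42)
```

A rendered depth map is perfectly clean, whereas a real RGB-D sensor returns jittered values and holes. Perturbing the synthetic depth in this way is one simple means of narrowing that gap; the paper's actual procedure may differ.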

Key words: object recognition, position measurement, 6-DoF pose estimation, robot automatic sorting, RGB-D image

CLC Number: