Journal of South China University of Technology (Natural Science Edition) ›› 2022, Vol. 50 ›› Issue (12): 30-40. doi: 10.12141/j.issn.1000-565X.210541

Special Issue: 2022 Computer Science and Technology

• Computer Science & Technology •

Object Tracking Algorithm Based on Multi-Stream Attention Siamese Network

YU Lubin1,2 TIAN Lianfang1,3,4 DU Qiliang1,4,5   

  1. School of Automation Science and Engineering, South China University of Technology, Guangzhou 510640, Guangdong, China
  2. The Fifth Electronics Research Institute of the Ministry of Industry and Information Technology, Guangzhou 511370, Guangdong, China
  3. Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), Zhuhai 519000, Guangdong, China
  4. Key Laboratory of Autonomous Systems and Network Control of the Ministry of Education, Guangzhou 510640, Guangdong, China
  5. China-Singapore International Joint Research Institute, South China University of Technology, Guangzhou 510555, Guangdong, China
  • Received:2021-08-25 Online:2022-12-25 Published:2022-07-21
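  • Contact: DU Qiliang (born 1980), male, Ph.D., associate research fellow, mainly engaged in research on pattern recognition and machine vision. E-mail: qldu@scut.edu.cn
  • About author: YU Lubin (born 1994), male, Ph.D., mainly engaged in research on pattern recognition and machine vision. E-mail: yulubin94@qq.com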
  • Supported by:
    the Key-Area R&D Project of Guangdong Province (2018B010109001); the Guangdong Provincial Special Project for the Development of Ocean Economy (GDNRC[2020]018)

Abstract:

Object tracking is an important task in computer vision. With the development of deep learning, tracking algorithms based on Siamese networks have been widely applied because of their strong performance. However, the performance of existing Siamese network models degrades significantly in difficult situations such as large target deformation, low resolution, and complex backgrounds. To address these issues, this paper proposes a tracking algorithm based on a multi-stream attention Siamese network. The algorithm first constructs a super-resolution module and a data augmentation module, which perform super-resolution and data augmentation on the target template, respectively, to improve the feature representation of the target template. Three backbone networks then extract features from the original target template, the super-resolved target template, and the augmented target template, respectively, and these features are fused; channel attention and spatial attention modules are applied in the backbone networks to improve feature extraction. Finally, the fused template feature map and the feature map of the search region are fed into a region proposal network module to obtain the tracking result. Experimental results show that the algorithm achieves a precision of 0.919 and a success rate of 0.707 on the OTB100 dataset, and an accuracy of 0.642 and a robustness of 0.149 on the VOT2018 dataset, while running at more than 20 frames per second in real scenarios. These results indicate that the algorithm tracks well and remains robust across a variety of complex scenarios.
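
The data flow described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names are hypothetical, the "backbone" applies only parameter-free channel and spatial attention (no convolutions or learned weights), the fusion is assumed to be an element-wise mean, and a plain cross-correlation peak stands in for the region proposal network. The three template streams (original, super-resolved, augmented) are taken as ready-made inputs of equal shape.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat):
    """Reweight each channel of a (C, H, W) map by a gate from global average pooling."""
    gate = sigmoid(feat.mean(axis=(1, 2)))      # shape (C,)
    return feat * gate[:, None, None]

def spatial_attention(feat):
    """Reweight each spatial location by a gate from the channel-wise mean."""
    gate = sigmoid(feat.mean(axis=0))           # shape (H, W)
    return feat * gate[None, :, :]

def backbone(image):
    """Hypothetical stand-in for a CNN backbone: attention only, nothing learned."""
    return spatial_attention(channel_attention(image))

def fuse(features):
    """Assumed fusion operator: element-wise mean over the template streams."""
    return np.mean(np.stack(features), axis=0)

def cross_correlate(template_feat, search_feat):
    """Slide the template over the search features and sum the products."""
    _, th, tw = template_feat.shape
    _, sh, sw = search_feat.shape
    response = np.empty((sh - th + 1, sw - tw + 1))
    for i in range(response.shape[0]):
        for j in range(response.shape[1]):
            window = search_feat[:, i:i + th, j:j + tw]
            response[i, j] = np.sum(template_feat * window)
    return response

def track_step(original, super_resolved, augmented, search):
    """One tracking step: three template streams, fusion, correlation, peak."""
    streams = [backbone(t) for t in (original, super_resolved, augmented)]
    fused = fuse(streams)
    response = cross_correlate(fused, backbone(search))
    # The response peak is a simplification; the paper uses a region proposal network here.
    peak = np.unravel_index(np.argmax(response), response.shape)
    return response, peak
```

Calling `track_step` with three `(C, h, w)` templates and one `(C, H, W)` search array returns the response map and the top-left index of the best-matching window; a real tracker would regress a bounding box from the fused features instead.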

Key words: object tracking, Siamese network, super-resolution, data augmentation, attention module

CLC Number: