Journal of South China University of Technology (Natural Science Edition), 2022, Vol. 50, Issue 12: 30-40. doi: 10.12141/j.issn.1000-565X.210541

Topic collection: Computer Science and Technology, 2022

• Computer Science & Technology •

Object Tracking Algorithm Based on Multi-Stream Attention Siamese Network

YU Lubin1,2 TIAN Lianfang1,3,4 DU Qiliang1,4,5

  1. School of Automation Science and Engineering, South China University of Technology, Guangzhou 510640, Guangdong, China
  2. The Fifth Electronics Research Institute of the Ministry of Industry and Information Technology, Guangzhou 511370, Guangdong, China
  3. Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), Zhuhai 519000, Guangdong, China
  4. Key Laboratory of Autonomous Systems and Network Control of the Ministry of Education, South China University of Technology, Guangzhou 510640, Guangdong, China
  5. China-Singapore International Joint Research Institute, South China University of Technology, Guangzhou 510555, Guangdong, China
  • Received: 2021-08-25 Published: 2022-12-25 Online: 2022-07-21
  • Contact: DU Qiliang (born 1980), male, Ph.D., associate research fellow, mainly engaged in research on pattern recognition and machine vision. E-mail: qldu@scut.edu.cn
  • About author: YU Lubin (born 1994), male, Ph.D., mainly engaged in research on pattern recognition and machine vision. E-mail: yulubin94@qq.com
  • Supported by:
    the Key-Area R&D Project of Guangdong Province (2018B010109001); the Guangdong Provincial Special Project for the Development of Ocean Economy (GDNRC[2020]018)

Abstract:

Object tracking plays an important role in computer vision. With the development of deep learning in recent years, tracking algorithms based on Siamese networks have been widely adopted because of their excellent performance. However, the performance of existing Siamese-network trackers usually degrades significantly when the target undergoes large deformation, appears at low resolution, or is set against a complex background. To address these issues, this paper proposes a tracking algorithm based on a multi-stream attention Siamese network. The algorithm first constructs a super-resolution module and a data augmentation module, which apply super-resolution and data augmentation to the target template, respectively, to improve the feature representation ability of the template. Three backbone networks then extract features from the original, super-resolved, and augmented target templates, respectively, and these features are fused; channel attention and spatial attention modules are applied in the backbone networks to improve feature extraction. Finally, the fused feature map and the feature map of the search region are fed into a region proposal network module to obtain the target tracking information. Experimental results show that the algorithm achieves a precision of 0.919 and a success rate of 0.707 on the OTB100 dataset, and an accuracy of 0.642 and a robustness of 0.149 on the VOT2018 dataset, while running at least 20 times per second in real scenarios. These results indicate that the algorithm offers excellent tracking performance and good robustness across a variety of complex scenarios.

Key words: object tracking, Siamese network, super-resolution, data augmentation, attention module
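
As an illustration only, the template-side processing described in the abstract (attention applied in each branch, followed by feature fusion) might be sketched as follows. The gating functions, the element-wise mean used for fusion, all function names, and the toy inputs are assumptions made for this sketch; the abstract does not specify them. A real model would use learned convolutional layers in a deep-learning framework rather than plain Python lists.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(feat):
    """Scale each channel by a gate derived from its global average.

    feat is a list of channels, each a list of rows (C x H x W).
    Assumption: a real module would pass the pooled values through a
    small learned network before the sigmoid; that step is omitted here.
    """
    out = []
    for channel in feat:
        values = [v for row in channel for v in row]
        gate = sigmoid(sum(values) / len(values))
        out.append([[v * gate for v in row] for row in channel])
    return out


def spatial_attention(feat):
    """Scale each spatial position by a gate from the cross-channel mean."""
    n_ch, height, width = len(feat), len(feat[0]), len(feat[0][0])
    gate = [[sigmoid(sum(feat[c][i][j] for c in range(n_ch)) / n_ch)
             for j in range(width)] for i in range(height)]
    return [[[feat[c][i][j] * gate[i][j] for j in range(width)]
             for i in range(height)] for c in range(n_ch)]


def attend(feat):
    """Apply channel attention, then spatial attention (assumed order)."""
    return spatial_attention(channel_attention(feat))


def fuse(branches):
    """Fuse same-shaped feature maps by element-wise mean (assumed rule)."""
    n = len(branches)
    n_ch = len(branches[0])
    height, width = len(branches[0][0]), len(branches[0][0][0])
    return [[[sum(b[c][i][j] for b in branches) / n for j in range(width)]
             for i in range(height)] for c in range(n_ch)]


# Toy 2-channel, 2x2 feature maps standing in for the three backbone outputs
# (original, super-resolved, and augmented templates); values are made up.
original = [[[1.0, 0.0], [0.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]
super_resolved = [[[1.0, 0.5], [0.5, 1.0]], [[2.0, 1.0], [1.0, 2.0]]]
augmented = [[[0.0, 1.0], [1.0, 0.0]], [[2.0, 2.0], [2.0, 2.0]]]

fused = fuse([attend(original), attend(super_resolved), attend(augmented)])
print(len(fused), len(fused[0]), len(fused[0][0]))  # prints "2 2 2"
```

The fused map keeps the shape of each branch output, so under these assumptions it could be passed to a downstream matching or proposal stage in place of a single-branch template feature.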

CLC Number: