Journal of South China University of Technology (Natural Science Edition), 2018, Vol. 46, Issue (12): 26-33. doi: 10.3969/j.issn.1000-565X.2018.12.004

• Computer Science & Technology •

Video Object Detection Based on Correlation Feature and Convolutional Neural Network

LIU Yujie¹, CAO Xianzhi¹, LI Zongmin¹, LI Hua²,³

  1. College of Computer & Communication Engineering, China University of Petroleum, Qingdao 266580, Shandong, China; 2. Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; 3. University of Chinese Academy of Sciences, Beijing 100190, China
  • Received: 2018-07-15  Online: 2018-12-25  Published: 2018-11-01
  • Contact: LIU Yujie. E-mail: liuyujie@upc.edu.cn
  • About author: LIU Yujie (born 1971), male, Ph.D., associate professor. His research covers computer graphics and image processing, multimedia databases, and multimedia data compression.
  • Supported by:
    the National Natural Science Foundation of China (61379106) and the Natural Science Foundation of Shandong Province, China (ZR2015FM011, ZR2013FM036)

Abstract: Image detection algorithms applied to video object detection face a trade-off between speed and precision. To make full use of an object's motion information between frames, this paper proposes a video detection method that combines correlation features with a convolutional neural network. First, an image detection algorithm extracts features from the current video frame. Second, the correlation features between two frames are used to predict the feature maps of the current frame. Finally, the motion information in the correlation features is used to refine the final result. Experiments on the ImageNet dataset show that the proposed method improves precision over current methods while maintaining a relatively fast speed.

Key words: video object detection, convolutional neural network, correlation feature
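
The abstract does not say how the correlation features between two frames are computed. Purely as an illustration, the sketch below shows one common way to relate two feature maps: a local correlation (cost volume) that compares each position in one map with a small neighbourhood in the other. The function name `local_correlation`, the parameter `max_disp`, and the channel-averaged dot product are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def local_correlation(feat_a, feat_b, max_disp=1):
    """Hypothetical sketch of a local correlation between two feature maps.

    feat_a, feat_b: arrays of shape (C, H, W), e.g. CNN features of two frames.
    Returns an array of shape ((2 * max_disp + 1) ** 2, H, W). Channel k holds
    the similarity between feat_a at (y, x) and feat_b at (y + dy, x + dx),
    with (dy, dx) enumerated row by row from (-max_disp, -max_disp).
    """
    _, h, w = feat_a.shape
    # Zero-pad the spatial axes of feat_b so shifted windows stay in bounds.
    padded = np.pad(feat_b, ((0, 0), (max_disp, max_disp), (max_disp, max_disp)))
    maps = []
    for dy in range(-max_disp, max_disp + 1):
        for dx in range(-max_disp, max_disp + 1):
            y0, x0 = dy + max_disp, dx + max_disp
            shifted = padded[:, y0:y0 + h, x0:x0 + w]
            # Channel-averaged dot product as the similarity score.
            maps.append((feat_a * shifted).mean(axis=0))
    return np.stack(maps)


# Toy usage: a single active feature moves one pixel to the right.
frame_a = np.zeros((2, 5, 5))
frame_a[:, 2, 2] = 1.0
frame_b = np.zeros((2, 5, 5))
frame_b[:, 2, 3] = 1.0
corr = local_correlation(frame_a, frame_b, max_disp=1)
best = int(np.argmax(corr[:, 2, 2]))  # index of the best-matching displacement
```

In this toy case the strongest response at position (2, 2) falls in the channel for displacement (dy, dx) = (0, 1), which is the kind of motion cue the abstract describes using to refine detections.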