Two-Stage Multi-Hypothesis Network for Compressed Video Sensing Reconstruction Algorithms Based on Deep Learning

doi:10.12141/j.issn.1000-565X.200623

Journal of South China University of Technology (Natural Science Edition) ›› 2021, Vol. 49 ›› Issue (6): 88-99. doi: 10.12141/j.issn.1000-565X.200623

Special topic: Electronics, Communication and Automatic Control (2021)

Two-Stage Multi-Hypothesis Network for Compressed Video Sensing Reconstruction Algorithms Based on Deep Learning

YANG Chunling, LING Xi

School of Electronics and Information, South China University of Technology, Guangzhou 510640, Guangdong, China

Received: 2020-10-19 Revised: 2021-02-05 Online: 2021-06-25 Published: 2021-06-01
Contact: YANG Chunling (b. 1970), female, professor; her research focuses on image/video compression coding and image quality assessment. E-mail: eeclyang@scut.edu.cn
Supported by: the Key Program of the Natural Science Foundation of Guangdong Province (2017A030311028) and the Natural Science Foundation of Guangdong Province (2019A1515011949)

Abstract

Abstract: Traditional compressed video sensing (CVS) reconstruction algorithms suffer from long reconstruction delays. Recently developed neural-network-based CVS methods solve the speed problem but fail to make full use of the spatiotemporal correlation of video, which leads to poor reconstruction quality. To address these problems, this paper proposes a two-stage multi-hypothesis CVS reconstruction network based on deep learning (2sMHNet). First, a Temporal Deformable Alignment Network (TDAN) is used to realize pixel-wise, learned multi-hypothesis prediction. This avoids blocking artifacts, and adaptive parameter learning improves both the matching accuracy of the hypothesis set and the precision of the hypothesis weights, so that temporal correlation is fully exploited and high-quality predicted frames are obtained. Then, a residual reconstruction module is constructed to reconstruct the prediction residual in the measurement domain, which further improves reconstruction quality. Finally, to make full use of the inter-frame correlation within a group of pictures, a two-stage serial reconstruction mode is proposed. In the first stage, the reconstructed key frames, which are rich in detail, serve as reference frames to improve the quality of the non-key frames. In the second stage, the more strongly correlated adjacent frames are used for a second round of motion-compensated reconstruction, which suits sequences with fast and complex motion. Simulation results show that 2sMHNet achieves better reconstruction performance than existing high-performing CVS reconstruction algorithms.

Key words: compressed video sensing reconstruction algorithm, deep learning, temporal deformable alignment network, reconstruction performance
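
The two core steps named in the abstract, pixel-wise multi-hypothesis prediction and measurement-domain residual reconstruction, can be illustrated with a small NumPy sketch. It is only an illustration under stated assumptions, not the 2sMHNet implementation: in the paper the per-pixel offsets and weights are learned by TDAN and the residual is reconstructed by a network module, whereas here the offsets and scores are hand-set and a pseudo-inverse stands in for the learned residual reconstruction. The function names `multi_hypothesis_predict` and `residual_correct`, the frame size, and the Gaussian measurement matrix are all hypothetical.

```python
import numpy as np


def multi_hypothesis_predict(ref, offsets, logits):
    """Pixel-wise multi-hypothesis prediction from one reference frame.

    ref:     (H, W) reference frame
    offsets: (K, H, W, 2) integer (dy, dx) displacement of each hypothesis
             (hand-set here; learned in the paper)
    logits:  (K, H, W) unnormalised per-pixel hypothesis scores
    """
    H, W = ref.shape
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    # Sampling positions of every hypothesis, clipped to the frame border.
    sy = np.clip(ys[None] + offsets[..., 0], 0, H - 1)
    sx = np.clip(xs[None] + offsets[..., 1], 0, W - 1)
    hypotheses = ref[sy, sx]  # shape (K, H, W)
    # Softmax over the K hypotheses gives per-pixel weights that sum to 1.
    w = np.exp(logits - logits.max(axis=0, keepdims=True))
    w /= w.sum(axis=0, keepdims=True)
    return (w * hypotheses).sum(axis=0)


def residual_correct(pred, y, phi):
    """Add a measurement-domain residual correction to a predicted frame.

    pred: (H, W) predicted frame, y: (M,) measurements, phi: (M, H*W).
    Assumption: a pseudo-inverse replaces the learned residual module.
    """
    r_y = y - phi @ pred.ravel()        # residual in the measurement domain
    r_hat = np.linalg.pinv(phi) @ r_y   # minimum-norm residual estimate
    return pred + r_hat.reshape(pred.shape)


rng = np.random.default_rng(0)
H, W, M = 8, 8, 16
ref = rng.random((H, W))
cur = np.roll(ref, 1, axis=1)  # current frame: reference shifted right by 1 px

# Two hypotheses per pixel: "one pixel to the left" and "same position".
offsets = np.zeros((2, H, W, 2), dtype=int)
offsets[0, ..., 1] = -1
logits = np.zeros((2, H, W))
logits[0] = 2.0  # favour the shifted hypothesis
pred = multi_hypothesis_predict(ref, offsets, logits)

# Random Gaussian measurement matrix and measurements of the current frame.
phi = rng.standard_normal((M, H * W)) / np.sqrt(M)
y = phi @ cur.ravel()
rec = residual_correct(pred, y, phi)

err_pred = np.linalg.norm(pred - cur)
err_rec = np.linalg.norm(rec - cur)
print(f"prediction error {err_pred:.4f}, after residual correction {err_rec:.4f}")
```

With a full-row-rank measurement matrix, the corrected frame reproduces the measurements exactly, and its error is an orthogonal projection of the prediction error, so the correction cannot increase the error. A learned residual module, as used in the paper, aims to recover more of the residual than this minimum-norm estimate does.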

CLC Number: TP391.41

YANG Chunling, LING Xi. Two-Stage Multi-Hypothesis Network for Compressed Video Sensing Reconstruction Algorithms Based on Deep Learning [J]. Journal of South China University of Technology (Natural Science Edition), 2021, 49(6): 88-99.

