Journal of South China University of Technology (Natural Science)
Feature-Space Optimization-Inspired and Multi-Hypothesis Cross-Attention Reconstruction Neural Network for Video Compressive Sensing
Received date: 2023-09-14
Online published: 2024-03-11
Supported by the Natural Science Foundation of Guangdong Province (2019A1515011949)
Existing video compressive sensing reconstruction networks usually rely on an optical flow network to perform motion estimation and motion compensation in the pixel domain. During reconstruction, however, the input to the optical flow network is a poor-quality estimated frame, so the resulting optical flow is inaccurate. Flow-based alignment and fusion in the pixel domain then accumulate noise, which produces obvious artifacts in the reconstructed frames and degrades reconstruction quality. Because multi-channel information in the feature space is highly robust to interference noise, this paper applies the idea of feature-space optimization to the design of a video compressive sensing reconstruction network and proposes a feature-space optimization-inspired and flow-guided multi-hypothesis cross-attention network (FOFMCNet). To avoid destroying image structure when noisy optical flow is used to warp an image, a flow-guided multi-hypothesis motion estimation module and a cross-attention-based motion compensation module are designed to perform inter-frame motion estimation and motion compensation in the feature space, so that inter-frame correlation is fully exploited to assist the reconstruction of non-key frames. To strengthen the reuse of useful information during feature optimization, improve the learning ability of the network, and alleviate the gradient explosion problem, a feature-space optimization-inspired U-shaped network (FOUNet) is designed as a sub-network of FOFMCNet. By cascading multiple FOUNets, FOFMCNet optimizes and reconstructs non-key frames in the feature space.
Experimental results show that the proposed algorithm clearly outperforms existing video compressive sensing algorithms on the classical low-resolution datasets (UCF-101 and QCIF) and on the new high-resolution dataset (REDS4).
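The idea of fusing several motion hypotheses with cross-attention can be illustrated with a minimal sketch. This is not the FOFMCNet module: the function name `cross_attention_fuse`, the toy feature vectors, and the use of raw hypothesis features as both keys and values (no learned projections) are assumptions made only for illustration. A feature vector from the frame being reconstructed acts as the query, candidate feature vectors taken from a reference frame act as the hypotheses, and a softmax over scaled dot-product similarities weights each hypothesis in the fused result.

```python
import math


def cross_attention_fuse(query, hypotheses):
    """Fuse candidate feature vectors with scaled dot-product attention.

    query      -- feature vector of the current (non-key) frame position
    hypotheses -- candidate feature vectors from a reference frame;
                  here they serve as both keys and values (an assumption,
                  a real network would apply learned projections first)
    """
    d = len(query)
    # Similarity between the query and each hypothesis, scaled by sqrt(d).
    scores = [
        sum(q * h for q, h in zip(query, hyp)) / math.sqrt(d)
        for hyp in hypotheses
    ]
    # Numerically stable softmax turns similarities into fusion weights.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the hypotheses, channel by channel.
    fused = [
        sum(w * hyp[c] for w, hyp in zip(weights, hypotheses))
        for c in range(d)
    ]
    return fused, weights


# Toy example: three hypothetical candidates for one 2-channel feature.
query = [1.0, 0.0]
hypotheses = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
fused, weights = cross_attention_fuse(query, hypotheses)
```

In this toy setting the hypothesis most similar to the query receives the largest weight, so a poorly matched candidate contributes little to the fused feature instead of being copied in wholesale, as it would be with a single hard warp.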
YANG Chunling, CHEN Wenjun, LIU Jiahui. Feature-Space Optimization-Inspired and Multi-Hypothesis Cross-Attention Reconstruction Neural Network for Video Compressive Sensing[J]. Journal of South China University of Technology (Natural Science), 2024, 52(10): 9-21. DOI: 10.12141/j.issn.1000-565X.230578