Research on Video Description Based on Adaptive Frame Sampling Algorithm and Bidirectional Long Short-Term Memory

Journal of South China University of Technology (Natural Science Edition), 2018, Vol. 46, Issue (1): 103-111. doi: 10.3969/j.issn.1000-565X.2018.01.014

• Computer Science & Technology •

ZHANG Rongfeng, NING Peiyang, XIAO Huanhou, SHI Jinglun, QIU Wei

School of Electronic and Information Engineering, South China University of Technology

Received: 2017-05-16 Revised: 2017-06-18 Online: 2018-01-25 Published: 2017-12-01
Contact and about author: ZHANG Rongfeng (born 1980), male, Ph.D. candidate, whose research focuses on machine learning and video processing. E-mail: rongfzhang@qq.com
Supported by:
The National Natural Science Foundation of China (61671213)

Abstract: Video-to-text is a new and challenging task in computer vision. To address it, this paper proposes an adaptive frame sampling algorithm and applies a Bidirectional Long Short-Term Memory (BLSTM) model and a deep BLSTM model to video features extracted by deep Convolutional Neural Networks. Because this doubly deep network structure learns both the spatial and the temporal correlations in a video, it can capture global dependency information across the space and time domains. Experimental results on the M-VAD and MPII-MD datasets show that the proposed framework achieves average METEOR scores of 7.8 and 9.1, respectively. Compared with the original S2VT model, the proposed method improves the average score by 15.7% and 28.2%, respectively, and also produces better video descriptions.
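
The reported percentages can be related to the METEOR scores with a little arithmetic. The sketch below assumes the 15.7% and 28.2% figures are relative gains over S2VT (the abstract does not say so explicitly) and back-computes the baseline scores they would imply. The function names are illustrative only, and the implied baselines are derived values, not figures quoted from the paper.

```python
def implied_baseline(score, gain_pct):
    """Baseline score implied by a reported score and a relative gain in percent.

    Assumes gain_pct = 100 * (score - baseline) / baseline.
    """
    return score / (1.0 + gain_pct / 100.0)


def relative_gain_pct(score, baseline):
    """Relative improvement of score over baseline, in percent."""
    return 100.0 * (score - baseline) / baseline


# (average METEOR score, reported gain in percent), as stated in the abstract
reported = {"M-VAD": (7.8, 15.7), "MPII-MD": (9.1, 28.2)}

# Back-compute the S2VT baseline each pair would imply under the assumption above
baselines = {
    name: implied_baseline(score, gain) for name, (score, gain) in reported.items()
}

for name, baseline in baselines.items():
    print(f"{name}: implied S2VT METEOR of about {baseline:.2f}")
```

Under this reading, the implied baselines are roughly 6.7 on M-VAD and 7.1 on MPII-MD.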

Key words: video to text, adaptive frame sampling, bidirectional LSTM, deep convolutional neural networks, fusion information of frames.

CLC Number:

TP391

ZHANG Rongfeng, NING Peiyang, XIAO Huanhou, SHI Jinglun, QIU Wei. Research on Video Description Based on Adaptive Frame Sampling Algorithm and Bidirectional Long Short-Term Memory [J]. Journal of South China University of Technology (Natural Science Edition), 2018, 46(1): 103-111.
