Computer Science & Technology

Method of Network Compression and Hardware Acceleration Based on Tiny-yolo

  • School of Microelectronics and Communication Engineering, Chongqing University, Chongqing 400044, China
HUANG Zhiyong (b. 1978), male, Ph.D., associate professor; his research focuses on wireless sensor network modeling and energy-efficient embedded computing.

Received date: 2018-07-08

  Revised date: 2018-11-13

  Online published: 2019-05-05

Supported by

 the National Natural Science Foundation of China (61501069)

Abstract

Existing works based on Tiny-yolo often need a large-scale network model, occupy considerable memory and rely on massive computation, so they are not easy to deploy on embedded devices. To solve these problems, an efficient optimization method combining network compression and hardware acceleration was proposed. Firstly, after analyzing the network connection relationships, connections that contribute little to the network were pruned, and sparse storage was adopted for the pruned weight matrix to reduce memory consumption. Secondly, the memory footprint and computational complexity were further reduced, within a guaranteed accuracy error, by quantizing the weight data and changing the bit width. Finally, according to the characteristics of the Tiny-yolo network structure, a deeply parallel, streaming FPGA acceleration scheme was proposed, achieving hardware acceleration of the Tiny-yolo network computation. Experiments demonstrate that the proposed method based on network pruning and quantization can achieve about 36X compression of the network model, and the hardware acceleration achieves approximately a 7X speedup over a CPU.
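The two compression steps summarized above (magnitude pruning with sparse storage, then low-bit quantization) can be sketched in a few lines of numpy. This is a minimal illustration, not the paper's exact scheme: the 90% sparsity level, the quantile threshold as the "low contribution" criterion, the int8 format, and the CSR-style byte accounting are all illustrative assumptions.

```python
import numpy as np

def prune_weights(w, sparsity=0.9):
    """Magnitude pruning: zero out the fraction `sparsity` of
    smallest-magnitude weights (a simple proxy for 'low contribution')."""
    thresh = np.quantile(np.abs(w), sparsity)
    return np.where(np.abs(w) >= thresh, w, 0.0).astype(np.float32)

def quantize_int8(w):
    """Symmetric linear quantization of float32 weights to int8."""
    scale = max(float(np.max(np.abs(w))), 1e-8) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256)).astype(np.float32)  # a stand-in layer

w_pruned = prune_weights(w, sparsity=0.9)
q, scale = quantize_int8(w_pruned)
w_restored = q.astype(np.float32) * scale  # dequantize to check the error

# CSR-style sparse storage cost: 1 byte per int8 value, a 4-byte column
# index per nonzero, and a (rows + 1)-entry row-pointer array.
nnz = int(np.count_nonzero(w_pruned))
dense_bytes = w.nbytes                           # 256 * 256 * 4 for fp32
sparse_bytes = nnz * (1 + 4) + (w.shape[0] + 1) * 4
print(f"kept {nnz}/{w.size} weights, "
      f"{dense_bytes / sparse_bytes:.1f}x smaller than dense fp32")
```

Even this toy example shows why the two steps compose well: pruning shrinks the number of stored values, while quantization shrinks the cost of each value, so the compression ratios multiply.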

Cite this article

HUANG Zhiyong, WU Haihua, YU Zhi, ZHONG Yuanhong. Method of Network Compression and Hardware Acceleration Based on Tiny-yolo[J]. Journal of South China University of Technology (Natural Science), 2019, 47(6): 51-56. DOI: 10.12141/j.issn.1000-565X.180360
