Journal of South China University of Technology (Natural Science Edition) ›› 2019, Vol. 47 ›› Issue (6): 51-56. doi: 10.12141/j.issn.1000-565X.180360

• Computer Science & Technology •

Method of Network Compression and Hardware Acceleration Based on Tiny-yolo

HUANG Zhiyong, WU Haihua, YU Zhi, ZHONG Yuanhong

  1. School of Microelectronics and Communication Engineering,Chongqing University,Chongqing 400044,China 
  • Received:2018-07-08 Revised:2018-11-13 Online:2019-06-25 Published:2019-05-05
  • Contact: HUANG Zhiyong (b. 1978), male, Ph.D., associate professor; his research interests include wireless sensor network modeling and energy-efficient embedded computing. E-mail: zyhuang@cqu.edu.cn
  • About author: HUANG Zhiyong (b. 1978), male, Ph.D., associate professor; his research interests include wireless sensor network modeling and energy-efficient embedded computing.
  • Supported by: the National Natural Science Foundation of China (61501069)

Abstract: Existing works based on Tiny-yolo often require large-scale network models that occupy considerable memory, rely on massive computation and are difficult to deploy on embedded devices. To solve these problems, an efficient optimization method for network compression and hardware acceleration was proposed. Firstly, after analyzing the network connection relationships, connections that contribute little to the network were pruned, and sparse storage was adopted for the clipped weight matrix to reduce memory consumption. Secondly, the memory footprint and computational complexity were further reduced, within a guaranteed accuracy error, by quantizing the weight data and changing the number of bits. Finally, according to the characteristics of the Tiny-yolo network structure, a deeply pipelined parallel FPGA acceleration scheme was proposed, and hardware acceleration of the Tiny-yolo network computation was achieved. Experiments demonstrate that the proposed method based on network pruning and quantization can achieve about 36× compression of the network model, and the hardware acceleration achieves approximately a 7× speedup over a CPU.
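The two software-side steps summarized in the abstract, magnitude-based pruning with sparse storage and fixed-point weight quantization, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the sparsity ratio, the CSR-style storage layout, and the 8-bit symmetric quantization scheme are assumptions chosen for clarity, and the actual thresholds and bit widths used in the paper may differ.

```python
import numpy as np

def prune(weights, sparsity=0.5):
    """Zero out the smallest-magnitude weights (least contribution)."""
    threshold = np.quantile(np.abs(weights), sparsity)
    return np.where(np.abs(weights) >= threshold, weights, 0.0)

def to_csr(weights):
    """Store only non-zero entries: (values, column indices, row pointers)."""
    values, cols, rowptr = [], [], [0]
    for row in weights:
        nz = np.nonzero(row)[0]
        values.extend(row[nz])
        cols.extend(nz)
        rowptr.append(len(values))
    return np.array(values), np.array(cols), np.array(rowptr)

def quantize(weights, bits=8):
    """Uniform symmetric quantization to `bits`-bit integers."""
    scale = np.max(np.abs(weights)) / (2 ** (bits - 1) - 1)
    q = np.round(weights / scale).astype(np.int8)
    return q, scale

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 8)).astype(np.float32)

w_pruned = prune(w, sparsity=0.5)          # half the connections removed
values, cols, rowptr = to_csr(w_pruned)    # sparse storage of the clipped matrix
q, scale = quantize(w_pruned, bits=8)      # 8-bit fixed-point weights

# Reconstruction error is bounded by half the quantization step.
error = np.max(np.abs(q * scale - w_pruned))
```

The storage saving comes from two sources: the CSR arrays hold only the surviving weights, and each surviving weight shrinks from 32-bit float to an 8-bit integer plus one shared scale per matrix.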

Key words: neural network, Tiny-yolo, compression, hardware acceleration, FPGA
