Journal of South China University of Technology (Natural Science Edition) ›› 2023, Vol. 51 ›› Issue (5): 104-113. doi: 10.12141/j.issn.1000-565X.220623

Special topic: Electronics, Communication and Automatic Control (2023)

• Electronics, Communication and Automatic Control •

Design and Implementation of Hardware Structure for Online Learning of Spiking Neural Networks Based on FPGA Parallel Acceleration

LIU Yijun1 CAO Yu2 YE Wujian1 LIN Ziqi2

  1. School of Integrated Circuits, Guangdong University of Technology, Guangzhou 510006, Guangdong, China
  2. School of Information Engineering, Guangdong University of Technology, Guangzhou 510006, Guangdong, China
  • Received: 2022-09-27 Online: 2023-05-25 Published: 2022-12-05
  • Contact: YE Wujian (b. 1987), male, Ph.D., lecturer; his research focuses on brain-inspired computing and deep learning applications. E-mail: yewjian@gdut.edu.cn
  • About author: LIU Yijun (b. 1977), male, Ph.D., professor and doctoral supervisor; his research focuses on integrated circuit design, brain-inspired computing, and deep learning. E-mail: yjliu@gdut.edu.cn
  • Supported by:
    the Key-Area Research and Development Program of Guangdong Province (2018B030338001); the Basic and Applied Basic Research Project of the Guangzhou Basic Research Program (202201010595); the Innovative Talents Project of the Department of Education of Guangdong Province; and the "Hundred Young Talents Program" of Guangdong University of Technology (220413548)


Abstract:

At present, digital-circuit hardware designs for spiking neural networks (SNNs) offer little synaptic parallelism in their learning function, which leads to large overall hardware latency and, to some extent, limits the speed of online learning in SNN models. To address this problem, this paper proposes an efficient hardware architecture for SNN online learning based on FPGA parallel acceleration, which speeds up both training and inference through a dual parallel design of neurons and synapses. First, a synapse structure with parallel spike delivery and parallel spike-timing-dependent plasticity (STDP) learning is designed; then an input encoding layer and a winner-take-all (WTA) learning layer are built, and the implementation of lateral inhibition in the WTA network is optimized, yielding an SNN model with a 784-400 topology. Experimental results on the MNIST dataset show that, with this hardware structure, the SNN model trains one image in 1.61 ms at an energy cost of about 3.18 mJ and infers one image in 1.19 ms at about 2.37 mJ, reaching an accuracy of 87.51% on the MNIST test set. Under the proposed hardware framework, the parallel synapse structure improves training speed by more than 38% and reduces hardware energy consumption by about 24.1%, which helps promote the development of edge intelligent computing devices and technologies.
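To make the training loop described above concrete, the following is a minimal software sketch (not the paper's FPGA implementation) of one training pass through a 784-input, 400-neuron spiking layer: Poisson rate encoding of the input image, leaky integrate-and-fire neurons updated in parallel, hard winner-take-all lateral inhibition, and a simplified trace-based STDP rule. All constants (time constants, threshold, learning rates) and the exact STDP form are illustrative assumptions, not values from the paper.

```python
# Illustrative reference model of the abstract's training flow; all
# hyperparameters below are assumed values, not the paper's design.
import numpy as np

rng = np.random.default_rng(0)

N_IN, N_OUT, T_STEPS = 784, 400, 100   # 784-400 topology, 100 time steps
TAU_V, V_TH = 20.0, 0.5                # membrane decay constant and threshold
TAU_TRACE = 20.0                       # presynaptic STDP trace time constant
A_PLUS, A_MINUS = 0.01, 0.012          # potentiation / depression step sizes

W = rng.uniform(0.0, 0.1, size=(N_OUT, N_IN))  # synaptic weight matrix

def train_image(image, W):
    """One online-learning pass over a single image (pixels in [0, 1])."""
    v = np.zeros(N_OUT)        # membrane potentials of the learning layer
    x_pre = np.zeros(N_IN)     # presynaptic spike traces for STDP
    for _ in range(T_STEPS):
        # Poisson rate encoding: brighter pixels spike more often.
        pre_spikes = rng.random(N_IN) < image * 0.5
        x_pre = x_pre * np.exp(-1.0 / TAU_TRACE) + pre_spikes

        # LIF update; the matrix product models all synapses being
        # evaluated in parallel, as the hardware's parallel datapaths do.
        v = v * np.exp(-1.0 / TAU_V) + W @ pre_spikes
        winner = int(np.argmax(v))
        if v[winner] >= V_TH:
            # Hard WTA lateral inhibition: only the winner fires and
            # every neuron's potential is reset.
            v[:] = 0.0
            # Simplified STDP on the winner's synapses: potentiate
            # recently active inputs, depress inputs with no trace.
            W[winner] += A_PLUS * x_pre - A_MINUS * (x_pre == 0)
            np.clip(W[winner], 0.0, 1.0, out=W[winner])
    return W
```

With repeated calls over a labeled image stream, each output neuron's weight row drifts toward a frequently winning input pattern; the hard reset of all potentials is one common way to realize WTA lateral inhibition cheaply, which is the component the paper reports optimizing.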

Key words: neural network, learning algorithm, acceleration, parallel architecture

CLC number: