Journal of South China University of Technology (Natural Science Edition) ›› 2023, Vol. 51 ›› Issue (7): 21-33. doi: 10.12141/j.issn.1000-565X.220626

Special Issue: 2023 Mechanical Engineering

• Mechanical Engineering •

Small-Sample Fault Diagnosis Method Based on Multi-Head Convolution and Differential Self-Attention

CHEN Xindu1 FU Zhisen1,2,3 WU Zhiheng2,3 CHEN Qiyu2,3 GUO Weike2,3   

  1. School of Mechanical and Electrical Engineering, Guangdong University of Technology, Guangzhou 510006, Guangdong, China
    2. Intelligent Manufacturing Research Institute, Guangdong Academy of Sciences, Guangzhou 510030, Guangdong, China
    3. Guangdong Provincial Key Laboratory of Modern Control Technology, Guangzhou 510030, Guangdong, China
  • Received: 2022-09-26; Online: 2023-07-25; Published: 2023-02-20
  • Contact: CHEN Xindu (b. 1967), male, Ph.D., professor and doctoral supervisor; his research focuses on intelligent equipment, manufacturing-system modeling, and simulation optimization. E-mail: chenxindu@gdut.edu.cn
  • About author: CHEN Xindu (b. 1967), male, Ph.D., professor and doctoral supervisor; his research focuses on intelligent equipment, manufacturing-system modeling, and simulation optimization.
  • Supported by:
    the Guangdong Province Key Field R&D Program (2019B090917004)

Abstract:

Bearings are among the most widely used rotating components in industrial equipment. A bearing that runs in a faulty condition for a long time can cause heavy economic losses and endanger personal safety, so research on bearing fault diagnosis is of great significance. Fault diagnosis technology based on deep learning has matured considerably, but under small-sample conditions it still suffers from problems such as overfitting, unstable performance, and low accuracy. To address these problems, this paper proposes a Transformer variant model, MDT (Multi-Head Convolution and Differential Self-Attention Transformer), to realize end-to-end few-shot fault diagnosis. The model combines a new data-embedding algorithm, MC (Multi-Head Convolution), with a DSA (Differential Self-Attention) mechanism. The MC algorithm applies multi-path one-dimensional convolution to each sample, extends the sample from one dimension to two dimensions through its multi-channel output, and uses multiple convolution kernel sizes to extract rich fault information from the different frequency bands of the original sample. Compared with the original dot-product self-attention in the Transformer, the DSA mechanism derives the attention weight vector for each feature from differences, thereby extracting deeper fault features from the sample. MDT inherits the Transformer's powerful ability to process sequence data, allowing it to extract richer fault information from time-domain signals while avoiding the overfitting problem common to small-sample models. Experimental results show that the proposed method stably achieves test accuracy above 99% in the bearing fault diagnosis task with only 100 training samples per fault type, and exhibits strong resistance to overfitting as well as strong robustness.
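The two building blocks described above can be illustrated with a minimal NumPy sketch. This is a hypothetical simplification, not the authors' implementation: the kernel weights are random placeholders, and the "difference" in DSA is taken here as the Euclidean distance between feature vectors, which is one plausible reading of a difference-based attention score.

```python
import numpy as np

def multi_head_conv(x, kernel_sizes=(3, 5, 7)):
    """Sketch of the MC embedding: several parallel 1-D convolutions
    with different kernel sizes over the raw signal, stacked as
    channels so a 1-D sample becomes a 2-D feature map."""
    rng = np.random.default_rng(0)
    channels = []
    for k in kernel_sizes:
        w = rng.standard_normal(k) / k            # illustrative kernel for this path
        channels.append(np.convolve(x, w, mode="same"))
    return np.stack(channels)                     # shape: (num_paths, signal_length)

def differential_self_attention(X):
    """Sketch of DSA: attention weights derived from pairwise feature
    differences instead of query-key dot products."""
    # magnitude of the difference between every pair of feature vectors
    diff = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    scores = -diff                                # smaller difference -> larger score
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    A = np.exp(scores)
    A /= A.sum(axis=-1, keepdims=True)            # row-wise softmax
    return A @ X                                  # attention-weighted features

signal = np.sin(np.linspace(0, 20, 128))          # toy 1-D vibration-like signal
F = multi_head_conv(signal)                       # 2-D embedding, shape (3, 128)
out = differential_self_attention(F)
print(F.shape, out.shape)                         # (3, 128) (3, 128)
```

In the actual model the convolution kernels would be learned parameters and the DSA output would feed the remaining Transformer layers; the sketch only shows how a 1-D signal becomes a multi-channel 2-D embedding and how difference-based weights replace dot-product attention.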

Key words: multi-head convolution, differential self-attention, Transformer variant, small sample, fault diagnosis
