Small-Sample Fault Diagnosis Method Based on Multi-Head Convolution and Differential Self-Attention

CHEN Xindu, FU Zhisen, WU Zhiheng, et al.

Journal of South China University of Technology (Natural Science Edition)

2023, Vol. 51, Issue 7: 21-33

DOI: https://doi.org/10.12141/j.issn.1000-565X.220626

Mechanical Engineering

^1. School of Mechanical and Electrical Engineering, Guangdong University of Technology, Guangzhou 510006, Guangdong, China
^2. Intelligent Manufacturing Research Institute, Guangdong Academy of Sciences, Guangzhou 510030, Guangdong, China
^3. Guangdong Provincial Key Laboratory of Modern Control Technology, Guangzhou 510030, Guangdong, China

CHEN Xindu (born 1967), male, Ph.D., professor and doctoral supervisor. His main research interests include intelligent equipment, manufacturing system modeling, and simulation optimization.

Received date: 2022-09-26

Online published: 2023-02-20

Supported by

the Guangdong Province Key Field R&D Program (2019B090917004); the Guangzhou Key R&D Program (202206030006); the International Science and Technology Cooperation Project of Huangpu District, Guangzhou (2021GH13)

Cite this article

CHEN Xindu, FU Zhisen, WU Zhiheng, et al. Small-sample fault diagnosis method based on multi-head convolution and differential self-attention[J]. Journal of South China University of Technology (Natural Science Edition), 2023, 51(7): 21-33. DOI: 10.12141/j.issn.1000-565X.220626

Abstract

Bearings are among the most widely used rotating components in industrial equipment. A bearing that runs in a faulty state for a long time can cause heavy economic losses and endanger personal safety, so research on bearing fault diagnosis is of great significance. Deep-learning-based fault diagnosis is maturing, but with small samples it suffers from overfitting, unstable performance, and low accuracy. To address these problems, this paper proposes MDT (Multi-Head Convolution and Differential Self-Attention Transformer), a Transformer variant for end-to-end small-sample fault diagnosis that combines a new data embedding algorithm, multi-head convolution (MC), with a differential self-attention (DSA) mechanism. The MC algorithm applies multi-path one-dimensional convolution to each sample, expands the sample from one dimension to two through its multi-channel output, and uses multiple convolution kernel sizes to extract rich fault information from each frequency range of the original sample. Compared with the dot-product self-attention of the original Transformer, the DSA mechanism uses differences to compute an attention weight vector for each feature, which allows deeper fault features to be extracted from the sample. MDT inherits the Transformer's strength in processing sequence data, so it can extract richer fault information from time-domain signals while avoiding the overfitting that is common in small-sample models. Experimental results show that, on a bearing fault diagnosis task with only 100 training samples per fault type, the proposed method stably achieves a test accuracy above 99% and exhibits strong resistance to overfitting and strong robustness.

Key words: multi-head convolution; differential self-attention; Transformer variant; small sample; fault diagnosis
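The abstract describes the two building blocks only at a high level, so the following is a minimal sketch of one possible reading, not the authors' implementation. It assumes that MC runs several 1D convolutions with different kernel sizes over one sample and stacks the outputs into a 2D array, and that DSA can be approximated by per-channel attention weights derived from pairwise differences. The function names, the kernel sizes, the random (untrained) kernels, and the negative squared difference used as the score are all illustrative assumptions that do not come from the paper.

```python
import math
import random


def conv1d_same(signal, kernel):
    """Cross-correlate a 1D signal with a kernel, zero-padded so the length is kept."""
    k = len(kernel)
    left = (k - 1) // 2
    padded = [0.0] * left + list(signal) + [0.0] * (k - 1 - left)
    return [
        sum(padded[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal))
    ]


def multi_head_conv_embed(signal, kernel_sizes=(3, 7, 15), channels_per_head=2, seed=0):
    """Expand a 1D sample into a 2D array: one row per (kernel size, channel) pair.

    Assumption: random weights stand in for learned convolution kernels.
    """
    rng = random.Random(seed)
    rows = []
    for k in kernel_sizes:
        for _ in range(channels_per_head):
            kernel = [rng.uniform(-1.0, 1.0) / k for _ in range(k)]
            rows.append(conv1d_same(signal, kernel))
    return rows


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def difference_attention(tokens):
    """Per-channel attention built from pairwise differences.

    Assumption: queries, keys, and values are the tokens themselves, and the
    score is the negative squared difference. Unlike dot-product attention,
    each (token, channel) pair gets its own weight vector over all tokens.
    """
    n, d = len(tokens), len(tokens[0])
    out = []
    for i in range(n):
        row = []
        for c in range(d):
            scores = [-(tokens[i][c] - tokens[j][c]) ** 2 for j in range(n)]
            weights = softmax(scores)
            row.append(sum(w * tokens[j][c] for j, w in enumerate(weights)))
        out.append(row)
    return out


# Demo on a synthetic signal (a stand-in for a vibration sample).
signal = [math.sin(0.3 * t) + 0.5 * math.sin(1.7 * t) for t in range(32)]
embedding = multi_head_conv_embed(signal)         # 6 rows x 32 columns
tokens = [list(col) for col in zip(*embedding)]   # 32 tokens x 6 features
attended = difference_attention(tokens)
print(len(embedding), len(embedding[0]), len(attended), len(attended[0]))
```

In this sketch, small kernels respond to fast local changes and large kernels to slower trends, which is one way to read the abstract's claim that multiple kernel sizes capture information from different frequency ranges. A trained model would learn the kernels and would most likely add learned projections in the attention step.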
