Small-Sample Fault Diagnosis Method Based on Multi-Head Convolution and Differential Self-Attention

doi:10.12141/j.issn.1000-565X.220626

Journal of South China University of Technology (Natural Science Edition), 2023, Vol. 51, Issue 7: 21-33.

Special topic: 2023 Mechanical Engineering

CHEN Xindu¹, FU Zhisen¹,²,³, WU Zhiheng²,³, CHEN Qiyu²,³, GUO Weike²,³

1. School of Mechanical and Electrical Engineering, Guangdong University of Technology, Guangzhou 510006, Guangdong, China
2. Intelligent Manufacturing Research Institute, Guangdong Academy of Sciences, Guangzhou 510030, Guangdong, China
3. Guangdong Provincial Key Laboratory of Modern Control Technology, Guangzhou 510030, Guangdong, China

Received: 2022-09-26; Published: 2023-07-25; Released online: 2023-02-20
Corresponding author: CHEN Xindu (b. 1967), male, Ph.D., professor and doctoral supervisor, working mainly on intelligent equipment, manufacturing system modeling, and simulation optimization. E-mail: chenxindu@gdut.edu.cn
Supported by:
the Guangdong Province Key Field R&D Program (2019B090917004); the Guangzhou Key R&D Program (202206030006); the International Science and Technology Cooperation Project of Huangpu District, Guangzhou (2021GH13)

Abstract:

Bearings are among the most widely used rotating parts in industrial equipment. A bearing that runs in a faulty condition for a long time can cause heavy economic losses and threaten personal safety, so research on bearing fault diagnosis is of great significance. Fault diagnosis based on deep learning is maturing, but with small samples it suffers from overfitting, unstable performance, and low accuracy. To address these problems, this paper proposes MDT (Multi-Head Convolution and Differential Self-Attention Transformer), a Transformer variant for end-to-end small-sample fault diagnosis. The model combines a new data embedding algorithm, MC (Multi-Head Convolution), with the DSA (Differential Self-Attention) mechanism. The MC algorithm applies multi-path one-dimensional convolution to each sample, expands the sample from one dimension to two through its multi-channel output, and uses several convolution kernel sizes to extract rich fault information from the different frequency bands of the original sample. Compared with the original dot-product self-attention in Transformer, the DSA mechanism uses differences to obtain an attention weight vector for each feature, and can therefore extract deeper fault features from the sample. MDT inherits the Transformer's strength in processing sequence data, so it can extract richer fault information from time-domain signals while avoiding the overfitting common in small-sample models. Experimental results show that, in a bearing fault diagnosis task with only 100 training samples per fault type, the proposed method stably achieves a test accuracy above 99% and shows strong resistance to overfitting and strong robustness.

Keywords: multi-head convolution, differential self-attention, Transformer variant, small sample, fault diagnosis

CLC number: TH133.3

Cite this article:

CHEN Xindu, FU Zhisen, WU Zhiheng, et al. Small-Sample Fault Diagnosis Method Based on Multi-Head Convolution and Differential Self-Attention[J]. Journal of South China University of Technology (Natural Science Edition), 2023, 51(7): 21-33.

Figures and tables: 23 (Figures 1-15, Tables 1-8)


No.	Convolution heads	Kernel sizes	Convolution layers	Strides	Output channels	Training accuracy/%	Test accuracy/%
1	2	21, 15	3	1, 1, 1	80, 70, 48	96.00	94.45
2	2	21, 11	4	2, 2, 2, 2	80, 70, 48, 48	96.80	96.02
3	2	21, 7	4	4, 2, 2, 2	80, 70, 48, 48	97.10	95.73
4	3	21, 15, 11	3	1, 1, 1	80, 70, 32	99.14	98.85
5	3	21, 15, 7	4	2, 2, 2, 2	80, 70, 32, 32	99.83	99.60
6	3	21, 15, 11	4	4, 2, 2, 2	80, 70, 32, 32	99.70	99.78
7	4	21, 15, 11, 7	3	1, 1, 1	80, 70, 24	98.90	98.54
8	4	21, 15, 11, 7	4	2, 2, 2, 2	80, 70, 24, 24	99.37	99.21
9	4	21, 15, 11, 7	4	4, 2, 2, 2	80, 70, 24, 24	99.28	99.46
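
The multi-head convolution embedding described in the abstract, with the settings tabulated above, can be sketched roughly as follows. This is only an illustration under stated assumptions: zero padding of `kernel // 2`, ReLU activations, random weights, and concatenation of the heads along the channel axis are all guesses, since this page does not give those details. The function names `conv1d`, `multi_head_embed`, and `embedding_shape` are hypothetical.

```python
import math
import random


def conv_out_len(length, kernel, stride):
    """Output length of a 1D convolution with zero padding kernel // 2 (assumed)."""
    pad = kernel // 2
    return (length + 2 * pad - kernel) // stride + 1


def conv1d(x, weights, stride):
    """x: list of input channels; weights: [out_ch][in_ch][kernel] filter taps."""
    kernel = len(weights[0][0])
    pad = kernel // 2
    padded = [[0.0] * pad + list(ch) + [0.0] * pad for ch in x]
    n_out = conv_out_len(len(x[0]), kernel, stride)
    out = []
    for w_out in weights:
        row = []
        for t in range(n_out):
            start = t * stride
            acc = 0.0
            for ch, taps in zip(padded, w_out):
                acc += sum(a * b for a, b in zip(ch[start:start + kernel], taps))
            row.append(max(acc, 0.0))  # ReLU is an assumed activation
        out.append(row)
    return out


def multi_head_embed(signal, kernels, strides, out_channels, seed=0):
    """Run one convolution stack per kernel size and join the heads channel-wise.

    Returns a 2D list of shape (time steps, heads * out_channels[-1]).
    Weights are random placeholders; a real model would learn them.
    """
    rng = random.Random(seed)
    channels = []
    for k in kernels:  # one head per kernel size
        x = [list(signal)]
        for stride, c_out in zip(strides, out_channels):
            c_in = len(x)
            weights = [[[rng.uniform(-0.1, 0.1) for _tap in range(k)]
                        for _ci in range(c_in)] for _co in range(c_out)]
            x = conv1d(x, weights, stride)
        channels.extend(x)
    return [list(step) for step in zip(*channels)]  # transpose to (time, features)


def embedding_shape(sample_len, kernels, strides, out_channels):
    """Shape of the embedding without running the convolutions."""
    lengths = set()
    for k in kernels:
        n = sample_len
        for s in strides:
            n = conv_out_len(n, k, s)
        lengths.add(n)
    assert len(lengths) == 1, "all heads must produce the same number of steps"
    return (lengths.pop(), len(kernels) * out_channels[-1])


# Toy run: a 64-point signal becomes a 16 x 6 matrix.
toy_signal = [math.sin(0.3 * t) for t in range(64)]
toy_embedding = multi_head_embed(toy_signal, (5, 3), (2, 2), (4, 3))

# Configuration No. 6 from the table, applied to a 4960-point sample.
shape_no6 = embedding_shape(4960, (21, 15, 11), (4, 2, 2, 2), (80, 70, 32, 32))
```

Under these padding assumptions, configuration No. 6 yields a 155 x 96 matrix, close to the [156, 96] training data size listed in the hyperparameter table; the page does not say where the additional step comes from.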

Embedding method	Training accuracy/%	Test accuracy/%
SSWT	69.95	65.32
EMD	63.88	67.67
Multi-head convolution embedding	99.70	99.78

Experiment No.	Heads	Blocks	Batch size	Sample length	Embedding dimension	Training accuracy/%	Test accuracy/%
1	3	2	150	1024	64	95.97	94.51
2	3	4	100	2480	96	98.86	96.72
3	3	3	150	4960	96	97.98	96.90
4	4	2	150	1024	64	99.20	95.84
5	4	4	100	2480	96	99.43	98.40
6	4	3	150	4960	96	100.00	99.78
7	5	2	150	1024	64	97.50	89.37
8	5	4	100	2480	96	97.99	90.61
9	5	3	150	4960	96	98.91	92.46

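Hyperparameter	Value	Hyperparameter	Value
Batch size	150	Number of stacked encoders	3
Initial learning rate	0.001	Dropout	0.1
Weight decay	0.0001	Number of convolution heads	3
Sample length	4960	Convolution kernel sizes	21, 15, 11
Training data size	[156, 96]	Classifier neurons	60, 10
Number of attention heads	4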
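
For anyone reproducing the setup, the hyperparameters above can be collected in one place. The sketch below is only a convenience container: the class name `MDTConfig` and its field names are hypothetical, and reading the second entry of the training data size as the embedding width is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MDTConfig:
    """Values copied from the hyperparameter table; field names are illustrative."""
    batch_size: int = 150
    initial_learning_rate: float = 0.001
    weight_decay: float = 0.0001
    sample_length: int = 4960
    training_data_size: tuple = (156, 96)
    attention_heads: int = 4
    encoder_blocks: int = 3
    dropout: float = 0.1
    conv_heads: int = 3
    conv_kernel_sizes: tuple = (21, 15, 11)
    classifier_neurons: tuple = (60, 10)

    def per_head_widths(self):
        """Return (features per attention head, features per convolution head).

        Assumes training_data_size[1] is the embedding width.
        """
        assert len(self.conv_kernel_sizes) == self.conv_heads
        embed_dim = self.training_data_size[1]
        assert embed_dim % self.attention_heads == 0
        assert embed_dim % self.conv_heads == 0
        return embed_dim // self.attention_heads, embed_dim // self.conv_heads


config = MDTConfig()
widths = config.per_head_widths()
```

With these values each convolution head would contribute 32 features, which matches the final output-channel count of the three-head rows in the multi-head convolution table.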

Model	Mean accuracy/%	Maximum accuracy/%	Variance	Time/s
MDT	99.79	99.92	1.55×10^-6	881.4
DNN	84.17	96.21	9.13×10^-3	430.8
ResNet-CNN	89.49	97.06	1.32×10^-4	246.7
GRU	90.86	93.94	5.25×10^-3	369.3

SNR/dB	MDT accuracy/%	ResNet-CNN accuracy/%	DNN accuracy/%	GRU accuracy/%
-50	55.04	48.93	39.67	41.32
-20	59.63	49.09	41.82	44.81
-10	81.49	49.49	45.77	50.74
-8	87.20	49.14	48.60	52.58
-6	90.81	51.34	44.85	69.95
-4	94.95	52.73	52.97	69.09
-2	94.09	59.81	55.74	74.26
0	97.82	65.85	59.70	81.04
2	98.45	69.13	60.78	84.25
6	99.19	77.41	73.83	85.36
8	99.00	83.60	81.63	86.35
10	99.54	86.24	82.16	87.07
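
This page does not state how the noisy test signals were generated. A common approach, assumed here, is to add white Gaussian noise whose power is set from the target signal-to-noise ratio, SNR_dB = 10 log10(P_signal / P_noise). The helper names `add_noise` and `measured_snr_db` below are hypothetical, and the sine wave merely stands in for a vibration signal.

```python
import math
import random


def signal_power(x):
    """Mean squared amplitude of a sequence."""
    return sum(v * v for v in x) / len(x)


def add_noise(signal, snr_db, seed=0):
    """Add zero-mean Gaussian noise so the result has roughly the requested SNR."""
    rng = random.Random(seed)
    noise_power = signal_power(signal) / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [v + rng.gauss(0.0, sigma) for v in signal]


def measured_snr_db(clean, noisy):
    """SNR in dB recovered from a clean signal and its noisy version."""
    noise = [n - c for c, n in zip(clean, noisy)]
    return 10 * math.log10(signal_power(clean) / signal_power(noise))


# Stand-in signal: 20000 samples of a sine wave.
clean = [math.sin(2 * math.pi * 0.01 * t) for t in range(20000)]
noisy_minus6 = add_noise(clean, -6)
noisy_plus10 = add_noise(clean, 10)
```

Under this definition, at -10 dB the noise carries ten times the power of the signal, which gives a sense of how demanding the lower rows of the table are.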

Group	Dot-product training accuracy/%	Dot-product test accuracy/%	Differential training accuracy/%	Differential test accuracy/%
1	24.98	26.66	80.10	83.43
2	92.81	93.98	92.74	93.10
3	95.20	79.38	88.60	88.23
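
The table compares dot-product attention with differential attention, but this page does not give the DSA formula. The sketch below only illustrates the structural distinction drawn in the abstract: dot-product attention produces one scalar weight per query-key pair, while a difference-based scheme can produce a separate weight for each feature. The scoring rule in `difference_attention` (negative squared difference) is an assumption chosen for illustration and should not be read as the authors' definition.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot_product_attention(q, k, v):
    """Scaled dot-product attention: one scalar weight per (query, key) pair."""
    d = len(q[0])
    out = []
    for qi in q:
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        w = softmax(scores)
        out.append([sum(w[j] * v[j][c] for j in range(len(v))) for c in range(d)])
    return out


def difference_attention(q, k, v):
    """Per-feature attention: one weight per (query, key, feature) triple.

    Scores are -(q_i[c] - k_j[c])**2, an assumed rule, so keys whose
    feature c is close to the query's feature c receive more weight.
    """
    d = len(q[0])
    out = []
    for qi in q:
        row = []
        for c in range(d):
            w = softmax([-(qi[c] - kj[c]) ** 2 for kj in k])
            row.append(sum(w[j] * v[j][c] for j in range(len(v))))
        out.append(row)
    return out


# Three tokens with two features each, used as queries, keys and values.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out_dot = dot_product_attention(tokens, tokens, tokens)
out_diff = difference_attention(tokens, tokens, tokens)
```

Both functions return a matrix with the same shape as the input; the difference lies in how many independent weights are available to mix the values.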
