Journal of South China University of Technology (Natural Science Edition) ›› 2018, Vol. 46 ›› Issue (8): 107-115. doi: 10.3969/j.issn.1000-565X.2018.08.015

• Computer Science and Technology •

Bagging Ensemble Fault Diagnosis for Wastewater Treatment Based on Imbalanced Classification

XU Yuge1, LAI Chunling1, LUO Fei2

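  1. School of Automation Science and Engineering, South China University of Technology
    2. School of Automation, South China University of Technology
  • Received: 2017-12-26 Revised: 2018-05-02 Online: 2018-08-25 Published: 2018-07-01
  • Corresponding author: XU Yuge (1978-), female, Ph.D., associate professor, mainly engaged in research on machine learning and data mining. E-mail: 202738@qq.com
  • About the author: XU Yuge (1978-), female, Ph.D., associate professor, mainly engaged in research on machine learning and data mining
  • Funding:
    Science and Technology Planning Project of Guangdong Province; National Natural Science Foundation of China; Science and Technology Planning Project of Guangdong Province; Science and Technology Planning Project of Guangdong Province; Science and Technology Planning Project of Guangzhou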

Bagging Ensemble Fault Diagnosis Modeling with Imbalanced Classification in Wastewater Treatment Plants

XU Yuge, LAI Chunling, LUO Fei

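  1. School of Automation Science and Engineering, South China University of Technology, Guangzhou 510640, Guangdong, China
  • Received: 2017-12-26 Revised: 2018-05-02 Online: 2018-08-25 Published: 2018-07-01
  • Contact: XU Yuge (1978-), female, Ph.D., associate professor, mainly engaged in research on machine learning and data mining. E-mail: 202738@qq.com
  • About author: XU Yuge (1978-), female, Ph.D., associate professor, mainly engaged in research on machine learning and data mining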
  • Supported by:
    Supported by the National Natural Science Foundation of China(61473121) and the Science and Technology Planning Project of Guangdong Province(2016A020221008, 2017B010117007, 2017B090910011

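Abstract: Faults in the wastewater treatment process degrade effluent quality, raise operating costs, and can even cause secondary environmental pollution. Fault diagnosis data from wastewater treatment are typically imbalanced, which severely impairs diagnostic performance and, in particular, lowers the accuracy on the fault classes. To address this problem, an improved Bagging ensemble fault diagnosis modeling method based on the weighted extreme learning machine is proposed. The weighted extreme learning machine serves as the base classifier, and the ensemble classifier is built within the Bagging framework. An adjustable over-sampling rate formula is defined, and the minority-class samples are over-sampled with the SMOTE method to ensure diversity among the base classifiers. Based on the imbalanced classification performance index G-mean, a new update formula for the output weights of the base classifiers is defined to improve the recognition rate of the fault classes. Simulation experiments show that the proposed fault diagnosis model outperforms the other algorithms compared: it effectively improves the G-mean value and the overall classification accuracy, and in particular the recognition accuracy of the fault classes.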

Keywords: imbalanced classification, weighted extreme learning machine, Bagging ensemble algorithm, wastewater treatment, fault diagnosis
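
The abstract names two standard building blocks without giving their formulas: the G-mean index and SMOTE over-sampling. As a rough illustration only, the Python sketch below computes G-mean as the geometric mean of per-class recall and creates synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours. The function names `g_mean` and `smote`, the parameter `n_new`, and the default `k=5` are assumptions made for this sketch; the paper's adjustable over-sampling rate formula is not stated in the abstract and is not reproduced here.

```python
import numpy as np


def g_mean(y_true, y_pred):
    """Geometric mean of the per-class recalls (standard G-mean definition)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    # Recall of class c: fraction of true-c samples that were predicted as c.
    recalls = [np.mean(y_pred[y_true == c] == c) for c in np.unique(y_true)]
    return float(np.prod(recalls) ** (1.0 / len(recalls)))


def smote(X_min, n_new, k=5, seed=0):
    """SMOTE-style over-sampling of minority samples (needs at least 2 rows).

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    k = min(k, n - 1)  # cannot have more neighbours than other samples
    # Pairwise Euclidean distances; a sample is never its own neighbour.
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)
    nn = np.argsort(d, axis=1)[:, :k]
    base = rng.integers(0, n, size=n_new)             # seed samples
    neigh = nn[base, rng.integers(0, k, size=n_new)]  # one neighbour each
    lam = rng.random((n_new, 1))                      # interpolation factor in [0, 1)
    return X_min[base] + lam * (X_min[neigh] - X_min[base])
```

A classifier that labels every sample as the majority (normal) class scores a G-mean of zero even when its plain accuracy is high, which is why the index is a natural fit for imbalanced fault data.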

Abstract: Operation faults in wastewater treatment plants may reduce effluent quality, raise operating costs, and cause secondary environmental pollution. Fault diagnosis data from the wastewater treatment process are typically imbalanced, which seriously degrades fault diagnosis performance and, in particular, lowers the accuracy on the faulty classes. To address this problem, this paper proposes an improved Bagging ensemble fault diagnosis method based on the weighted extreme learning machine for the wastewater treatment process. The method builds the ensemble classifier in the Bagging framework, with the weighted extreme learning machine as the base classifier. An adjustable over-sampling rate formula is defined, and the diversity of the base classifiers is ensured by over-sampling the minority data with the SMOTE method. Based on the imbalanced classification performance index G-mean, an updating formula for the output weights of the base classifiers is defined to improve the recognition accuracy of the faulty classes. Simulation experiments show that the proposed fault diagnosis model outperforms the other algorithms. The proposed method effectively improves the G-mean value and the overall classification accuracy of fault diagnosis in the wastewater treatment process, and in particular raises the recognition accuracy of the faulty classes.

Key words: imbalanced classification, weighted extreme learning machine, bagging ensemble learning, wastewater treatment, fault diagnosis
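
For readers unfamiliar with the weighted extreme learning machine used as the base classifier, the following is a minimal sketch of its commonly used formulation: the hidden-layer weights are random and fixed, and the output weights solve a regularized, sample-weighted least-squares problem, beta = (I/C + H^T W H)^(-1) H^T W T. The sketch assumes a sigmoid hidden layer and the common inverse-class-frequency weighting (each sample weighted by 1 / size of its class). The class name `WeightedELM`, its parameters, and their defaults are hypothetical; the G-mean-based weight update and the Bagging wrapper proposed in the paper are not shown, because the abstract does not specify them.

```python
import numpy as np


class WeightedELM:
    """Illustrative weighted extreme learning machine (not the paper's code)."""

    def __init__(self, n_hidden=50, C=100.0, seed=0):
        self.n_hidden = n_hidden  # number of random hidden units
        self.C = C                # regularization trade-off (larger = less shrinkage)
        self.seed = seed

    def _hidden(self, X):
        # Sigmoid activations of the fixed random hidden layer.
        return 1.0 / (1.0 + np.exp(-(X @ self.W_in + self.b)))

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        rng = np.random.default_rng(self.seed)
        self.classes_ = np.unique(y)
        # Random input weights and biases are drawn once and never trained.
        self.W_in = rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        # One target column per class: +1 for the sample's class, -1 otherwise.
        T = np.where(y[:, None] == self.classes_[None, :], 1.0, -1.0)
        # Assumed weighting scheme: w_i = 1 / (number of samples in class of i),
        # so minority-class errors count more in the least-squares fit.
        counts = {c: np.sum(y == c) for c in self.classes_}
        w = np.array([1.0 / counts[c] for c in y])
        HtW = H.T * w  # equals H^T W with W = diag(w)
        A = np.eye(self.n_hidden) / self.C + HtW @ H
        self.beta = np.linalg.solve(A, HtW @ T)
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        scores = self._hidden(X) @ self.beta
        return self.classes_[np.argmax(scores, axis=1)]
```

In a Bagging setup, several such models would each be trained on a differently resampled training set and their outputs combined; how the paper weights the combination is defined in the full text, not in the abstract.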
