Journal of South China University of Technology (Natural Science Edition) ›› 2008, Vol. 36 ›› Issue (5): 123-127.

• Electronics, Communication and Automatic Control •

Classification Algorithm Based on Rough Set and Support Vector Machine

Deng Jiu-ying, Du Qi-liang¹, Mao Zong-yuan¹, Yao Chen²

  1. School of Automation Science and Engineering, South China University of Technology, Guangzhou 510640, Guangdong, China; 2. Department of Computer Science, Guangdong Institute of Education, Guangzhou 510303, Guangdong, China
  • Received:2007-10-22 Revised:2008-01-29 Online:2008-05-25 Published:2008-05-25
  • Contact: Deng Jiu-ying. E-mail: djy1111@126.com
  • About author: Deng Jiu-ying (born 1962), female, visiting scholar, associate professor at Guangdong Institute of Education; her research covers intelligent computing and data mining.
  • Supported by:

    National Natural Science Foundation of China (Grant No. 30570458)

Abstract:

When trained on high-dimensional, large-sample data, a support vector machine (SVM) may suffer from the curse of dimensionality and from long training times. To address these problems, this paper combines the sequential minimal optimization (SMO) algorithm with the data-processing capability of rough sets (RS) and proposes a new classification algorithm, RS-SMO. The algorithm first reduces the attributes of the data set according to attribute significance, then uses the rough boundary set method to generate class boundary sets that serve as the training subset for SMO. The resulting training set is smaller than the original in both dimension and size, which improves the time and space performance of the algorithm. Experimental results indicate that RS-SMO achieves structural risk minimization and outperforms the SMO algorithm.

Key words: rough set, support vector machine, decomposition algorithm, attribute reduction, boundary set, time-space performance
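
The two preprocessing steps named in the abstract, attribute reduction by significance and selection of a boundary set as the training subset, can be illustrated with a small sketch. This is not the authors' RS-SMO implementation, which the abstract does not spell out. It assumes the textbook rough-set definitions: the dependency degree is the fraction of samples in the positive region, the significance of an attribute is the gain in dependency when it is added, and the boundary set is the set of samples whose equivalence class contains more than one class label. The function names, the toy decision table, and the choice to take the boundary relative to the single most significant attribute are all hypothetical. The SMO training step itself is not shown.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row indices into equivalence classes: rows that agree on every attribute in attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def positive_region(rows, labels, attrs):
    """Indices of rows whose equivalence class carries exactly one class label."""
    pos = set()
    for block in partition(rows, attrs):
        if len({labels[i] for i in block}) == 1:
            pos.update(block)
    return pos


def dependency(rows, labels, attrs):
    """Dependency degree: share of rows that attrs classify without ambiguity."""
    return len(positive_region(rows, labels, attrs)) / len(rows)


def greedy_reduct(rows, labels, attrs):
    """Forward selection (an assumed heuristic): repeatedly add the attribute
    with the largest dependency gain until the full-set dependency is reached."""
    target = dependency(rows, labels, attrs)
    chosen = []
    while dependency(rows, labels, chosen) < target:
        candidates = [a for a in attrs if a not in chosen]
        best = max(candidates, key=lambda a: dependency(rows, labels, chosen + [a]))
        chosen.append(best)
    return chosen


def boundary_set(rows, labels, attrs):
    """Rows outside the positive region, i.e. rows that attrs alone cannot classify."""
    pos = positive_region(rows, labels, attrs)
    return [i for i in range(len(rows)) if i not in pos]


# Hypothetical toy decision table: three discrete condition attributes per row.
rows = [(0, 0, 0), (0, 1, 1), (1, 0, 0), (1, 1, 0), (2, 0, 1), (2, 1, 0)]
labels = [0, 0, 0, 1, 1, 1]

reduct = greedy_reduct(rows, labels, [0, 1, 2])   # [0, 1]: attribute 2 is dropped
boundary = boundary_set(rows, labels, reduct[:1])  # [2, 3]: ambiguous under attribute 0 alone

# Reduced training subset that would be handed to an SMO trainer (not shown here).
train_X = [tuple(rows[i][a] for a in reduct) for i in boundary]
train_y = [labels[i] for i in boundary]
```

On this toy table the training subset shrinks from six samples with three attributes to two samples with two attributes, which is the kind of reduction in dimension and size that the abstract describes. Real-valued data would need discretization before the equivalence classes are meaningful, a step this sketch leaves out.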