Journal of South China University of Technology (Natural Science Edition), 2013, Vol. 41, Issue (7): 137-144. doi: 10.3969/j.issn.1000-565X.2013.07.023

• Computer Science & Technology •

Random Subspace-Based Semi-Supervised Dimensionality Reduction for Cancer Classification

Wen Gui-hua1 Cai Xian-fa1,2,3† Wei Jia1

  1. School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006, Guangdong, China; 2. School of Medical Information Engineering, Guangdong Pharmaceutical University, Guangzhou 510006, Guangdong, China; 3. Shenzhen Key Laboratory of High Performance Data Mining, Shenzhen 518055, Guangdong, China
  • Received: 2013-03-20 Online: 2013-07-25 Published: 2013-06-01
  • Contact: Cai Xian-fa (born in 1979), male, on-the-job Ph.D. candidate and lecturer at Guangdong Pharmaceutical University; his research focuses on pattern recognition and bioinformatics. E-mail: cxianfa@126.com
  • About author: Wen Gui-hua (born in 1968), male, professor and Ph.D. supervisor; his research focuses on machine learning, knowledge discovery and cognitive geometry. E-mail: crghwen@scut.edu.cn
  • Supported by: National Natural Science Foundation of China (61273363, 61070090, 61003174, 60973083

Abstract:

Precise cancer classification is essential to the successful diagnosis and treatment of cancers. Although semi-supervised dimensionality reduction approaches perform very well on clean data sets, the topology of the neighborhood graph constructed by most existing approaches is unstable in the presence of noise. To solve this problem, a novel random subspace-based semi-supervised dimensionality reduction algorithm, denoted RSSSDR, is proposed; it combines the random subspace method with semi-supervised dimensionality reduction. In this algorithm, multiple diverse graphs are first constructed in different random subspaces of the data set and are then combined into a mixture graph on which dimensionality reduction is performed. Subsequently, the edge weights of the neighborhood graph are determined by minimizing the local reconstruction error, so that the global geometric structure of the data can be preserved without changing the local geometric structure. Experimental results on public cancer data sets demonstrate that the proposed RSSSDR algorithm achieves high classification accuracy and strong robustness.

Key words: semi-supervised learning, random subspace, cancer classification, dimensionality reduction
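
The two graph-related steps named in the abstract (building neighborhood graphs in random subspaces and merging them, then setting edge weights by minimizing the local reconstruction error) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say how the graphs are combined, so the majority vote used here is an assumption, as are the function names, the parameters `k`, `n_subspaces`, `subspace_dim`, `vote` and `reg`, and the regularized least-squares solve borrowed from locally linear embedding. The use of class labels and the final embedding step are omitted.

```python
import numpy as np

def knn_adjacency(X, k):
    """Symmetric 0/1 k-nearest-neighbor adjacency matrix for the rows of X."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared distances
    np.fill_diagonal(d2, np.inf)                       # no self-neighbors
    idx = np.argsort(d2, axis=1)[:, :k]                # k closest per row
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    return np.maximum(A, A.T)                          # symmetrize

def mixture_graph(X, k=5, n_subspaces=20, subspace_dim=None, vote=0.5, seed=0):
    """Combine kNN graphs built in random feature subspaces.

    Assumption: an edge is kept when at least a fraction `vote` of the
    subspace graphs contain it. The paper may combine graphs differently.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    m = subspace_dim or max(1, d // 2)
    votes = np.zeros((n, n))
    for _ in range(n_subspaces):
        feats = rng.choice(d, size=m, replace=False)   # one random subspace
        votes += knn_adjacency(X[:, feats], k)
    return votes / n_subspaces >= vote                 # boolean adjacency

def reconstruction_weights(X, A, reg=1e-3):
    """For each point, find weights over its graph neighbors that minimize
    ||x_i - sum_j w_ij x_j||^2 subject to sum_j w_ij = 1 (LLE-style)."""
    n = X.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.flatnonzero(A[i])
        if nbrs.size == 0:
            continue                                   # isolated point
        Z = X[nbrs] - X[i]                             # center on x_i
        G = Z @ Z.T                                    # local Gram matrix
        G += reg * (np.trace(G) + 1e-12) * np.eye(nbrs.size)  # stabilize
        w = np.linalg.solve(G, np.ones(nbrs.size))
        W[i, nbrs] = w / w.sum()                       # enforce sum to one
    return W

if __name__ == "__main__":
    # Synthetic stand-in for a gene-expression matrix (samples x features).
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(0, 1, (15, 30)), rng.normal(6, 1, (15, 30))])
    A = mixture_graph(X, k=4, n_subspaces=10)
    W = reconstruction_weights(X, A)
    print(A.sum(), W.shape)
```

Voting across subspaces is one plausible way to make the neighborhood topology less sensitive to noisy features: an edge that appears only because of a few corrupted dimensions is unlikely to recur in most random subspaces.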