Journal of South China University of Technology (Natural Science Edition) ›› 2013, Vol. 41 ›› Issue (12): 101-106. doi: 10.3969/j.issn.1000-565X.2013.12.017

• Biological Engineering •

Modularity-Based Graph Clustering for Analysis of Gene Microarray Data

Li Li, Cao Yi-cheng, Mao Xiao-fan

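  1. School of Biological Science and Engineering, South China University of Technology, Guangzhou 510006, Guangdong, China
  • Received: 2013-04-22 Revised: 2013-09-27 Online: 2013-12-25 Published: 2013-11-19
  • Contact: Cao Yi-cheng (born 1949), male, professor and doctoral supervisor, mainly engaged in research on micro-molecular biology and bioinformatics. E-mail: yccao@scut.edu.cn
  • About author: Li Li (born 1981), male, Ph.D. candidate, mainly engaged in research on gene microarrays and bioinformatics. E-mail: ottolear@gmail.com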
  • Supported by:

    Bioinformatics Grid Platform subproject of the China Grid Program of the Ministry of Education (B12137040130)

Abstract:

As an important class of clustering algorithms, graph clustering can be effectively applied to protein interaction networks and microarray data. In this paper, to overcome the shortcomings of existing graph clustering methods for gene microarray data, a global graph clustering method based on modularity and subgraph smoothness is proposed. In this algorithm, subgraph smoothness is introduced to avoid locally optimal solutions: subgraphs with low smoothness values in the clustering results are split into singletons, and the newly generated singletons are used in the next clustering step. After several iterations, the globally optimal clustering result can be obtained. The proposed method is then compared with four commonly used clustering methods (classic graph clustering, the k-means algorithm, the SOM algorithm, and the Fuzzy algorithm) on a set of genome expression data. The results show that (1) the proposed method is superior to the other four methods in terms of average non-overlap proportion and FOM' value; (2) when the dataset is divided into 39 clusters, the FOM' value of the proposed method is 28.41%, 19.21%, 9.84% and 24.67% lower than those of the other four methods, respectively; and (3) the classification accuracy of the proposed method is higher than that of the Fuzzy algorithm and the SOM algorithm, with an execution efficiency similar to that of the SOM algorithm but 5.94% higher than that of the Fuzzy method.
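
To make the two ingredients named in the abstract more concrete, the following is a minimal Python sketch, not the authors' implementation. It computes the standard Newman modularity Q of a partition and performs one "split low-smoothness subgraphs into singletons" step. The abstract does not define subgraph smoothness, so `edge_density` below is a purely hypothetical stand-in, and the function names, the `threshold` parameter, and the adjacency-matrix representation are all assumptions made for illustration.

```python
from itertools import combinations


def modularity(adj, clusters):
    """Newman modularity Q of a partition of an undirected weighted graph.

    adj: symmetric list-of-lists of non-negative edge weights, zero diagonal.
    clusters: list of lists of node indices, each node in exactly one cluster.
    """
    two_m = sum(sum(row) for row in adj)  # every edge is counted twice
    if two_m == 0:
        return 0.0
    q = 0.0
    for members in clusters:
        # Sum over ordered pairs, i.e. twice the within-cluster edge weight.
        internal = sum(adj[i][j] for i in members for j in members)
        # Total degree of the nodes in this cluster.
        degree = sum(sum(adj[i]) for i in members)
        q += internal / two_m - (degree / two_m) ** 2
    return q


def edge_density(adj, members):
    """ASSUMPTION: placeholder for the paper's smoothness measure.

    Returns the fraction of possible within-cluster node pairs that are
    connected. The paper's actual definition may differ entirely.
    """
    if len(members) < 2:
        return 1.0
    pairs = list(combinations(members, 2))
    present = sum(1 for i, j in pairs if adj[i][j] > 0)
    return present / len(pairs)


def split_low_smoothness(adj, clusters, threshold, smoothness=edge_density):
    """Break every cluster whose smoothness is below `threshold` into singletons.

    Clusters at or above the threshold are kept unchanged. The resulting
    singletons would be the input to the next clustering pass.
    """
    result = []
    for members in clusters:
        if smoothness(adj, members) < threshold:
            result.extend([node] for node in members)
        else:
            result.append(list(members))
    return result
```

For example, on a graph made of two triangles joined by a single edge, placing each triangle in its own cluster gives Q = 5/14, while putting all six nodes in one cluster gives Q = 0. A full method along the lines of the abstract would alternate a modularity-maximizing merge pass with this split step until the partition stops changing; that merge pass is omitted here because the abstract does not specify it.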

Key words: gene microarray, graph clustering, modularity, smoothness, algorithm

CLC Number: