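Received date: 2014-04-15
Revised date: 2014-07-15
Online published: 2014-07-01
Supported by
National Natural Science Foundation of China (61174184); Guangdong Provincial Industrial Science and Technology Research Program (2008B010200010); Guangzhou Science and Technology Support Program (2011J4300045)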
Data Clustering of Road Transportation Information System Based on Attribute Dimension Partition and MapReduce
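Zheng Xiaofeng, Xu Jianmin, Lu Kai. Data Clustering of Road Transportation Information System Based on Attribute Dimension Partition and MapReduce[J]. Journal of South China University of Technology (Natural Science Edition), 2014, 42(8): 122-128, 135. DOI: 10.3969/j.issn.1000-565X.2014.08.019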
To address the shortcomings of DBSCAN (Density-Based Spatial Clustering of Applications with Noise), this paper introduces the concept of attribute dimension partition by combining domain knowledge with the idea of partitioning. The principles of cluster merging and pruning computation are then demonstrated. Finally, an optimization method for DBSCAN is proposed based on the cloud computing programming model MapReduce, and the method is verified through data clustering on a real road transport information system. The results show that partitioning the dataset facilitates concurrent computation and that the proposed optimization method is superior to common statistical methods.
Key words: road transportation; DBSCAN; attribute dimension; partition; MapReduce; clustering