MapReduce Job Performance Tuning by Optimizing Memory Configurations
Received date: 2015-11-25
Revised date: 2016-09-13
Online published: 2016-12-01
Supported by the National Science and Technology Support Planning Program of China (2012BAH18B05) and the National Natural Science Foundation of China (61272447)
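LUO Yonggang, CHEN Xingshu, YANG Lu. MapReduce Job Performance Tuning by Optimizing Memory Configurations[J]. Journal of South China University of Technology (Natural Science Edition), 2017, 45(1): 102-111. DOI: 10.3969/j.issn.1000-565X.2017.01.015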
MapReduce job performance depends heavily on memory configuration. To overcome the difficulty of predicting the memory requirements of MapReduce jobs, a generational memory prediction method is proposed, based on the fact that the Java Virtual Machine (JVM) divides the heap managed by its garbage collector into young and old generations. First, a regression model is constructed to estimate the average garbage collection time for a given young generation size. Then, the problem of finding a reasonable young generation size is converted into a constrained nonlinear optimization problem, and a fixed-size search algorithm is designed to solve it. Next, memory models of the Map and Reduce tasks are constructed to determine the memory required for optimal performance, which yields a reasonable old generation size for each task type. Finally, a k-means clustering algorithm is used to predict the value of the JVM parameter PretenureSizeThreshold, and the JVM configuration is tuned to reduce garbage collection pause time. Experimental results show that the proposed method accurately predicts the memory requirements of the Map and Reduce tasks and significantly improves job performance.
Key words: big data; MapReduce; garbage collection; memory allocation; performance tuning
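The abstract describes three predicted quantities per task type: a young generation size, an old generation size, and a PretenureSizeThreshold value. As a minimal sketch of how such values might be passed to Hadoop, the Python snippet below assembles JVM option strings for the standard `mapreduce.map.java.opts` and `mapreduce.reduce.java.opts` properties. The helper names `jvm_opts` and `task_properties` are hypothetical, the numbers are placeholders rather than results from the paper, and setting the maximum heap to the sum of the two generations is an assumption of this sketch, not the authors' stated procedure. Note also that `-XX:PretenureSizeThreshold` is honored only by some HotSpot collectors.

```python
def jvm_opts(young_mb: int, old_mb: int, pretenure_bytes: int) -> str:
    """Build a JVM option string from predicted generation sizes.

    Assumption of this sketch: the maximum heap is simply young + old.
    """
    if young_mb <= 0 or old_mb <= 0 or pretenure_bytes < 0:
        raise ValueError("sizes must be positive and threshold non-negative")
    heap_mb = young_mb + old_mb
    return (
        f"-Xmx{heap_mb}m "          # maximum heap size
        f"-Xmn{young_mb}m "         # young generation size
        f"-XX:PretenureSizeThreshold={pretenure_bytes}"  # threshold in bytes
    )


def task_properties(map_sizes: tuple, reduce_sizes: tuple) -> dict:
    """Map (young_mb, old_mb, pretenure_bytes) tuples to Hadoop properties."""
    return {
        "mapreduce.map.java.opts": jvm_opts(*map_sizes),
        "mapreduce.reduce.java.opts": jvm_opts(*reduce_sizes),
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    props = task_properties((256, 512, 1048576), (512, 1024, 2097152))
    for name, value in props.items():
        print(f"{name}={value}")
```

The resulting key/value pairs could be placed in `mapred-site.xml` or passed per job with `-D` on the command line.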