A Machine Learning-based Methodology to Detect I/O Performance Bottlenecks for Hadoop Systems
- Authors
- Advisor
- 염헌영
- Major
- College of Engineering, Department of Electrical and Computer Engineering
- Issue Date
- 2014-02
- Publisher
- Seoul National University Graduate School
- Keywords
- MapReduce ; Hadoop ; I/O Performance Bottleneck Detection ; Monitoring ; Machine Learning
- Description
- Thesis (Master's) -- Seoul National University Graduate School: Department of Electrical and Computer Engineering, February 2014. Advisor: 염헌영.
- Abstract
- As distributed systems for big data processing, such as clusters and clouds, grow in scale, detecting I/O performance bottlenecks has become one of the biggest challenges in achieving high performance. A set of extremely slow straggler tasks can directly cause such bottlenecks and degrade the overall performance of a Hadoop system. Different kinds of bottlenecks can also reduce the efficiency of resource usage and energy consumption. In most cases, users have little idea which tasks are degraded or why. To address this problem, we have developed an I/O performance bottleneck detection methodology for Hadoop systems. The methodology has two key aspects. First, I/O profiling is performed per Hadoop task to extract feature values that may be related to performance degradation. The feature value sets of all Hadoop tasks are then analyzed with a machine learning technique, which selects the most relevant features and identifies low-performance Hadoop tasks as performance bottlenecks. Second, the results can be used to provide performance improvement guidelines, such as alternative resource scheduling. We found that identifying performance bottlenecks with our methodology can improve performance by up to about 37% in a scalable environment.
- Language
- English
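
The two analysis steps named in the abstract (flagging unusually slow tasks and picking the features most related to the slowdown) can be illustrated with a minimal sketch. This is not the thesis's method: the abstract does not say which machine learning technique was used, so the sketch substitutes a robust z-score on task duration and a Pearson correlation ranking. The feature names (`duration_s`, `io_wait_ms`, `bytes_read_mb`), the task IDs, the sample values, and the 3.5 threshold are all hypothetical.

```python
from statistics import median


def robust_z_scores(values):
    """Robust z-scores based on the median and the median absolute deviation."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return [0.0 for _ in values]
    # 1.4826 makes the MAD comparable to a standard deviation for normal data.
    return [(v - med) / (1.4826 * mad) for v in values]


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def find_stragglers(tasks, threshold=3.5):
    """Return IDs of tasks whose duration is far above the median (assumed rule)."""
    scores = robust_z_scores([t["duration_s"] for t in tasks])
    return [t["task_id"] for t, z in zip(tasks, scores) if z > threshold]


def rank_features(tasks, feature_names):
    """Rank features by absolute correlation with task duration, strongest first."""
    durations = [t["duration_s"] for t in tasks]
    ranked = [(name, pearson([t[name] for t in tasks], durations))
              for name in feature_names]
    return sorted(ranked, key=lambda pair: abs(pair[1]), reverse=True)


# Hypothetical per-task profiles; real values would come from I/O profiling.
TASKS = [
    {"task_id": "map_0", "duration_s": 20.0, "io_wait_ms": 100.0, "bytes_read_mb": 64.0},
    {"task_id": "map_1", "duration_s": 21.0, "io_wait_ms": 110.0, "bytes_read_mb": 64.0},
    {"task_id": "map_2", "duration_s": 19.0, "io_wait_ms": 90.0, "bytes_read_mb": 64.0},
    {"task_id": "map_3", "duration_s": 22.0, "io_wait_ms": 120.0, "bytes_read_mb": 64.0},
    {"task_id": "map_4", "duration_s": 95.0, "io_wait_ms": 850.0, "bytes_read_mb": 64.0},
]

STRAGGLERS = find_stragglers(TASKS)
RANKING = rank_features(TASKS, ["io_wait_ms", "bytes_read_mb"])
```

With this sample data, `STRAGGLERS` contains only `map_4`, and `RANKING` places `io_wait_ms` ahead of `bytes_read_mb`, because every task reads the same number of bytes and that feature therefore carries no signal. A real study would likely use a learned model and many more features; the sketch only shows the shape of the per-task analysis.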