A Machine Learning-based Methodology to Detect I/O Performance Bottlenecks for Hadoop Systems

Authors

성민영

Advisor
염헌영
Major
College of Engineering, Department of Electrical and Computer Engineering
Issue Date
2014-02
Publisher
Seoul National University Graduate School
Keywords
MapReduce; Hadoop; I/O Performance Bottleneck Detection; Monitoring; Machine Learning
Description
Thesis (Master's) -- Seoul National University Graduate School : Department of Electrical and Computer Engineering, 2014. 2. 염헌영.
Abstract
As distributed systems for big data processing, such as clusters and clouds, grow in scale, detecting I/O performance bottlenecks has become one of the biggest challenges in achieving high performance. A set of extremely slow straggler tasks can directly cause such bottlenecks and degrade the overall performance of a Hadoop system. Furthermore, different kinds of bottlenecks may reduce the efficiency of resource usage and energy consumption. In most cases, users have little idea which task's performance is degraded, or why. To address this problem, we have developed an I/O performance bottleneck detection methodology for Hadoop systems. The methodology has two key aspects. First, I/O profiling is performed per Hadoop task to extract feature values that may be related to performance degradation, and the feature value sets of all Hadoop tasks are then analyzed using a machine learning technique. As a result, the most relevant features can be selected and low-performance Hadoop tasks identified as performance bottlenecks. Second, the results of the methodology can be used to provide performance improvement guidelines, such as the use of alternative resource scheduling. We found that applying the methodology, based on the identified performance bottlenecks, may improve performance by up to about 37% in a scalable environment.
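The abstract does not name the features or the machine learning technique, so the following is only a minimal sketch of the general idea (per-task feature values, feature ranking, slow-task identification) under stated assumptions. The field names (`duration_s`, `read_wait_ms`, `bytes_read_mb`), the median-based cutoff, and the correlation-based ranking are all hypothetical stand-ins, not the thesis's method.

```python
from statistics import mean, median, pstdev

# Hypothetical per-task profiles; the thesis's real features are not listed.
tasks = [
    {"task_id": "t1", "duration_s": 10.0, "read_wait_ms": 100, "bytes_read_mb": 64},
    {"task_id": "t2", "duration_s": 11.0, "read_wait_ms": 120, "bytes_read_mb": 64},
    {"task_id": "t3", "duration_s": 12.0, "read_wait_ms": 110, "bytes_read_mb": 64},
    {"task_id": "t4", "duration_s": 40.0, "read_wait_ms": 900, "bytes_read_mb": 64},
]


def flag_stragglers(tasks, factor=1.5):
    """Return ids of tasks slower than `factor` times the median duration.

    The 1.5x median cutoff is an assumed heuristic for illustration.
    """
    cutoff = factor * median(t["duration_s"] for t in tasks)
    return [t["task_id"] for t in tasks if t["duration_s"] > cutoff]


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0
    mx, my = mean(xs), mean(ys)
    cov = mean((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / (sx * sy)


def rank_features(tasks, feature_names):
    """Order features by absolute correlation with task duration.

    This simple ranking stands in for the unspecified feature selection step.
    """
    durations = [t["duration_s"] for t in tasks]
    scores = {
        name: abs(pearson([t[name] for t in tasks], durations))
        for name in feature_names
    }
    return sorted(feature_names, key=lambda name: scores[name], reverse=True)


stragglers = flag_stragglers(tasks)
ranking = rank_features(tasks, ["bytes_read_mb", "read_wait_ms"])
```

In this toy data, the slow task stands out on read wait time while the amount of data read is identical across tasks, which is the kind of signal a feature selection step would be expected to surface.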
Language
English
URI
https://hdl.handle.net/10371/123033
Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.
