Design and Implementation of a Flexible and Extensible Data Processing Runtime
- College of Engineering, Dept. of Computer Science and Engineering
- Issue Date
- Seoul National University Graduate School
- Data Processing; Data Processing Framework; Data Analytics; Data Analytics Framework; Data Processing Engine; Data Analytics Engine
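- Thesis (Master's), Seoul National University Graduate School: College of Engineering, Dept. of Computer Science and Engineering, February 2018. Byung-Gon Chun (전병곤).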
- Today's data analytics applications exhibit a wide variety of characteristics and run in diverse resource environments, each with distinct requirements. To meet these requirements, many systems have been developed, each with optimization techniques suited to its own needs. However, the field of data processing keeps growing, and with it the diversity of job characteristics and resource environments. Because current systems are designed around pre-defined runtime behaviors, applying new optimization techniques to them is extremely difficult. Onyx addresses this problem with a flexible and extensible execution runtime. The runtime is designed and implemented around the execution properties that must be flexibly controllable and extensible for jobs to run with the desired runtime behaviors. It uses a user-configurable job representation, Onyx IR, annotated with execution properties that control the underlying runtime behavior for each job, so that jobs execute according to users' requirements. Examples and evaluations show that new optimization techniques are easy to apply in Onyx, whereas applying them in current data processing systems requires a significant amount of engineering effort.
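
The idea of a job representation annotated with execution properties can be illustrated with a minimal sketch. This is not Onyx's actual API: the names `Vertex`, `IRDag`, `default_pass`, `spill_to_disk_pass`, and the property keys `placement` and `data_store` are all hypothetical, chosen only to show how an optimization might be expressed as a function that rewrites annotations on a DAG instead of as a change to the runtime itself.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Vertex:
    """One operator in the job DAG, annotated with execution properties."""
    name: str
    properties: Dict[str, str] = field(default_factory=dict)


@dataclass
class IRDag:
    """A job as a list of vertices plus directed edges (source name, destination name)."""
    vertices: List[Vertex] = field(default_factory=list)
    edges: List[Tuple[str, str]] = field(default_factory=list)


# A pass is any function that rewrites the annotations on a DAG (hypothetical type).
Pass = Callable[[IRDag], IRDag]


def default_pass(dag: IRDag) -> IRDag:
    """Fill in baseline properties for every vertex that does not set them."""
    for v in dag.vertices:
        v.properties.setdefault("placement", "any")
        v.properties.setdefault("data_store", "memory")
    return dag


def spill_to_disk_pass(dag: IRDag) -> IRDag:
    """Hypothetical optimization: vertices that receive data over an edge store it on disk."""
    receivers = {dst for _, dst in dag.edges}
    for v in dag.vertices:
        if v.name in receivers:
            v.properties["data_store"] = "disk"
    return dag


def annotate(dag: IRDag, passes: List[Pass]) -> IRDag:
    """Apply the passes in order; later passes may override earlier annotations."""
    for p in passes:
        dag = p(dag)
    return dag


def describe(dag: IRDag) -> Dict[str, Dict[str, str]]:
    """Return the per-vertex properties a runtime would read when scheduling the job."""
    return {v.name: dict(v.properties) for v in dag.vertices}


job = IRDag(
    vertices=[Vertex("read"), Vertex("map"), Vertex("reduce")],
    edges=[("read", "map"), ("map", "reduce")],
)
plan = describe(annotate(job, [default_pass, spill_to_disk_pass]))
```

Under these assumptions, adding a new optimization means writing another pass and appending it to the list, while the code that consumes `plan` stays unchanged.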