Techniques for Ease of OpenCL Programming : OpenCL의 프로그래밍 용이성 향상 기법

DC Field: Value
dc.contributor.advisor: 이재진
dc.contributor.author: 김정현
dc.date.accessioned: 2017-07-13T07:14:37Z
dc.date.available: 2017-07-13T07:14:37Z
dc.date.issued: 2016-02
dc.identifier.other: 000000133064
dc.identifier.uri: https://hdl.handle.net/10371/119178
dc.description: Thesis (Ph.D.) -- Seoul National University Graduate School: Department of Electrical and Computer Engineering, 2016. 2. 이재진.
dc.description.abstract: OpenCL is one of the major programming models for heterogeneous systems. This thesis identifies two limitations of OpenCL, the complexity of programming in OpenCL and the lack of support for heterogeneous clusters, and proposes a solution to each in order to make programming easier.

The first limitation is that writing a program in OpenCL is complicated. To lower this programming complexity, this thesis proposes a framework that translates a program written in a high-level language (OpenMP) to OpenCL at the source level. The framework achieves both ease of programming and high performance by employing two techniques: data transfer minimization (DTM) and performance portability enhancement (PPE). This thesis shows the effectiveness of the proposed translation framework by evaluating benchmark applications, and its practicality by comparing it with the commercial PGI compiler.

The second limitation of OpenCL is the lack of support for a heterogeneous cluster. To extend OpenCL to a heterogeneous cluster, this thesis proposes a framework called SnuCL-D that can execute a program written only in OpenCL on a heterogeneous cluster. Unlike previous approaches, which are centralized, the proposed framework takes a decentralized approach, which makes it possible to reduce three kinds of overhead that occur in the execution path of commands. By analyzing and reducing these three kinds of overhead, the proposed framework shows good scalability on a large-scale cluster system. Its effectiveness and practicality are demonstrated by comparing it with the representative centralized approach (SnuCL) and with MPI on benchmark applications.

This thesis proposes solutions to the two limitations of OpenCL for ease of programming on heterogeneous clusters. Application developers are expected to be able to easily execute not only an OpenMP program on various accelerators but also a program written only in OpenCL on a heterogeneous cluster.
dc.description.tableofcontents: Chapter I. Introduction 1
I.1 Motivation and Objectives 5
I.1.1 Programming Complexity 5
I.1.2 Lack of Support for a Heterogeneous Cluster 8
I.2 Contributions 12

Chapter II. Background and Related Work 15
II.1 Background 15
II.1.1 OpenCL 16
II.1.2 OpenMP 23
II.2 Related Work 26
II.2.1 Programming Complexity 26
II.2.2 Support for a Heterogeneous Cluster 29

Chapter III. Lowering the Programming Complexity 34
III.1 Motivating Example 35
III.1.1 Device Constructs 35
III.1.2 Needs for Data Transfer Optimization 41
III.2 Mapping OpenMP to OpenCL 44
III.2.1 Architecture Model 44
III.2.2 Execution Model 45
III.3 Code Translation 46
III.3.1 Translation Process 46
III.3.2 Translating OpenMP to OpenCL 48
III.3.3 Example of Code Translation 50
III.3.4 Data Transfer Minimization (DTM) 62
III.3.5 Performance Portability Enhancement (PPE) 66
III.4 Performance Evaluation 69
III.4.1 Evaluation Methodology 70
III.4.2 Effectiveness of Optimization Techniques 74
III.4.3 Comparison with Other Implementations 79

Chapter IV. Support for a Heterogeneous Cluster 90
IV.1 Problems of Previous Approaches 90
IV.2 The Approach of SnuCL-D 91
IV.2.1 Overhead Analysis 93
IV.2.2 Remote Device Virtualization 94
IV.2.3 Redundant Computation and Data Replication 95
IV.2.4 Memory-read Commands 97
IV.3 Consistency Management 98
IV.4 Deterministic Command Scheduling 100
IV.5 New API Function: clAttachBufferToDevice() 103
IV.6 Queueing Optimization 104
IV.7 Performance Evaluation 105
IV.7.1 Evaluation Methodology 105
IV.7.2 Evaluation with a Microbenchmark 109
IV.7.3 Evaluation on the Large-scale CPU Cluster 111
IV.7.4 Evaluation on the Medium-scale GPU Cluster 123

Chapter V. Conclusion and Future Work 125

Bibliography 129

Korean Abstract 140
dc.format: application/pdf
dc.format.extent: 2415353 bytes
dc.format.medium: application/pdf
dc.language.iso: en
dc.publisher: Seoul National University Graduate School
dc.subject: OpenMP
dc.subject: OpenCL
dc.subject: ease of programming
dc.subject: high performance
dc.subject: clusters
dc.subject: heterogeneous systems
dc.subject: programming model
dc.subject: accelerators
dc.subject: benchmarks
dc.subject.ddc: 621
dc.title: Techniques for Ease of OpenCL Programming
dc.title.alternative: OpenCL의 프로그래밍 용이성 향상 기법
dc.type: Thesis
dc.description.degree: Doctor
dc.citation.pages: 141
dc.contributor.affiliation: College of Engineering, Department of Electrical and Computer Engineering
dc.date.awarded: 2016-02

Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.
