Techniques for Ease of OpenCL Programming : OpenCL의 프로그래밍 용이성 향상 기법
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 이재진 | - |
dc.contributor.author | 김정현 | - |
dc.date.accessioned | 2017-07-13T07:14:37Z | - |
dc.date.available | 2017-07-13T07:14:37Z | - |
dc.date.issued | 2016-02 | - |
dc.identifier.other | 000000133064 | - |
dc.identifier.uri | https://hdl.handle.net/10371/119178 | - |
dc.description | Thesis (Ph.D.) -- Seoul National University Graduate School : Department of Electrical and Computer Engineering, 2016. 2. 이재진. | - |
dc.description.abstract | OpenCL is one of the major programming models for heterogeneous systems. This thesis identifies two limitations of OpenCL, the complexity of programming in it and its lack of support for heterogeneous clusters, and proposes a solution to each in order to make programming easier.
The first limitation is that writing a program in OpenCL is complicated. To lower this programming complexity, this thesis proposes a framework that translates a program written in a high-level language (OpenMP) to OpenCL at the source level. The framework achieves both ease of programming and high performance by employing two techniques: data transfer minimization (DTM) and performance portability enhancement (PPE). The thesis demonstrates the effectiveness of the proposed translation framework by evaluating benchmark applications, and its practicality by comparing it with the commercial PGI compiler.
The second limitation of OpenCL is the lack of support for a heterogeneous cluster. To extend OpenCL to a heterogeneous cluster, this thesis proposes a framework called SnuCL-D that can execute a program written only in OpenCL on a heterogeneous cluster. Unlike previous approaches, which are centralized, the proposed framework takes a decentralized approach, which makes it possible to reduce three kinds of overhead that occur in the execution path of commands. Because it can analyze and reduce these three kinds of overhead, the proposed framework scales well on a large-scale cluster system. Its effectiveness and practicality are demonstrated by comparing it, on benchmark applications, with the representative centralized approach (SnuCL) and with MPI. This thesis thus proposes solutions to the two limitations of OpenCL to ease programming on heterogeneous clusters. It is expected that application developers will be able to easily execute not only an OpenMP program on various accelerators but also a program written only in OpenCL on a heterogeneous cluster. | - |
dc.description.tableofcontents | Chapter I. Introduction 1
I.1 Motivation and Objectives 5
I.1.1 Programming Complexity 5
I.1.2 Lack of Support for a Heterogeneous Cluster 8
I.2 Contributions 12
Chapter II. Background and Related Work 15
II.1 Background 15
II.1.1 OpenCL 16
II.1.2 OpenMP 23
II.2 Related Work 26
II.2.1 Programming Complexity 26
II.2.2 Support for a Heterogeneous Cluster 29
Chapter III. Lowering the Programming Complexity 34
III.1 Motivating Example 35
III.1.1 Device Constructs 35
III.1.2 Needs for Data Transfer Optimization 41
III.2 Mapping OpenMP to OpenCL 44
III.2.1 Architecture Model 44
III.2.2 Execution Model 45
III.3 Code Translation 46
III.3.1 Translation Process 46
III.3.2 Translating OpenMP to OpenCL 48
III.3.3 Example of Code Translation 50
III.3.4 Data Transfer Minimization (DTM) 62
III.3.5 Performance Portability Enhancement (PPE) 66
III.4 Performance Evaluation 69
III.4.1 Evaluation Methodology 70
III.4.2 Effectiveness of Optimization Techniques 74
III.4.3 Comparison with Other Implementations 79
Chapter IV. Support for a Heterogeneous Cluster 90
IV.1 Problems of Previous Approaches 90
IV.2 The Approach of SnuCL-D 91
IV.2.1 Overhead Analysis 93
IV.2.2 Remote Device Virtualization 94
IV.2.3 Redundant Computation and Data Replication 95
IV.2.4 Memory-read Commands 97
IV.3 Consistency Management 98
IV.4 Deterministic Command Scheduling 100
IV.5 New API Function: clAttachBufferToDevice() 103
IV.6 Queueing Optimization 104
IV.7 Performance Evaluation 105
IV.7.1 Evaluation Methodology 105
IV.7.2 Evaluation with a Microbenchmark 109
IV.7.3 Evaluation on the Large-scale CPU Cluster 111
IV.7.4 Evaluation on the Medium-scale GPU Cluster 123
Chapter V. Conclusion and Future Work 125
Bibliography 129
Korean Abstract 140 | - |
dc.format | application/pdf | - |
dc.format.extent | 2415353 bytes | - |
dc.format.medium | application/pdf | - |
dc.language.iso | en | - |
dc.publisher | Seoul National University Graduate School | - |
dc.subject | OpenMP | - |
dc.subject | OpenCL | - |
dc.subject | ease of programming | - |
dc.subject | high performance | - |
dc.subject | clusters | - |
dc.subject | heterogeneous systems | - |
dc.subject | programming model | - |
dc.subject | accelerators | - |
dc.subject | benchmarks | - |
dc.subject.ddc | 621 | - |
dc.title | Techniques for Ease of OpenCL Programming | - |
dc.title.alternative | OpenCL의 프로그래밍 용이성 향상 기법 | - |
dc.type | Thesis | - |
dc.description.degree | Doctor | - |
dc.citation.pages | 141 | - |
dc.contributor.affiliation | College of Engineering, Department of Electrical and Computer Engineering | - |
dc.date.awarded | 2016-02 | - |
Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.