
Implementing general matrix-matrix multiplication algorithm on the Intel Xeon Phi Knights Landing Processor
Intel Xeon Phi Knights Landing 프로세서에서의 일반 행렬 곱셈 알고리즘 구현

DC Field: Value
dc.contributor.advisor: 신동우
dc.contributor.author: 김래현
dc.date.accessioned: 2018-05-29T05:07:52Z
dc.date.available: 2018-05-29T05:07:52Z
dc.date.issued: 2018-02
dc.identifier.other: 000000149726
dc.identifier.uri: https://hdl.handle.net/10371/142452
dc.description: Thesis (Master's) -- Seoul National University Graduate School: College of Natural Sciences, Department of Mathematical Sciences, 2018. 2. 신동우.
dc.description.abstract: This paper presents the design and implementation of a general matrix-matrix multiplication (GEMM) algorithm for the second-generation Intel Xeon Phi processor, codenamed Knights Landing (KNL). We illustrate several development guidelines for achieving optimal performance with the C programming language and the AVX-512 (Advanced Vector Extensions 512) instruction set. Further, we present several environment variable issues associated with parallelization on the KNL. On a single core of the KNL, our double-precision GEMM (DGEMM) implementation achieves up to 99 percent of the DGEMM performance of the Intel MKL, the current state-of-the-art library. Our parallel implementation on 68 cores of the KNL also scales well, achieving up to 93 percent of the DGEMM performance of the Intel MKL.
dc.description.tableofcontents:
1 Introduction 1
2 Hardware and software descriptions 4
2.1 Hardware description 4
2.2 Software description 6
3 Implementing algorithms 8
3.1 Blocked matrix multiplication 10
3.2 Inner kernel 15
3.3 Packing algorithm 19
3.4 Parallelization 21
3.4.1 Parallelization of blocked matrix multiplication algorithm 21
3.4.2 Parallelization of packing kernel 22
3.4.3 Environment setting 23
4 Experiments 26
4.1 Sequential double-precision GEMM 28
4.1.1 Register blocking 28
4.1.2 Cache blocking 31
4.2 Parallel double-precision GEMM 38
4.2.1 Bandwidth requirement 38
4.2.2 Degree of parallelization 40
5 Conclusion and future works 44
Abstract (in Korean) 49
Acknowledgement (in Korean) 50
dc.format: application/pdf
dc.format.extent: 2756219 bytes
dc.format.medium: application/pdf
dc.language.iso: en
dc.publisher: Seoul National University Graduate School
dc.subject: Knights Landing
dc.subject: general matrix-matrix multiplication
dc.subject: vectorization
dc.subject: optimization
dc.subject.ddc: 510
dc.title: Implementing general matrix-matrix multiplication algorithm on the Intel Xeon Phi Knights Landing Processor
dc.title.alternative: Intel Xeon Phi Knights Landing 프로세서에서의 일반 행렬 곱셈 알고리즘 구현
dc.type: Thesis
dc.contributor.AlternativeAuthor: Raehyun Kim
dc.description.degree: Master
dc.contributor.affiliation: College of Natural Sciences, Department of Mathematical Sciences
dc.date.awarded: 2018-02
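
The abstract and table of contents name blocked matrix multiplication as the core of the GEMM design. As a rough illustration only, the idea can be sketched in plain C as below. This is not the thesis's implementation: the block sizes MC, KC, and NC, the function name dgemm_blocked, and the row-major layout are assumptions made for this sketch, and the innermost loops are a scalar stand-in for the AVX-512 inner kernel and packing steps that the thesis describes.

```c
#include <stddef.h>

/* Hypothetical block sizes for illustration; not the tuned values from the thesis. */
enum { MC = 64, KC = 64, NC = 64 };

static size_t min_size(size_t a, size_t b) { return a < b ? a : b; }

/*
 * Computes C += A * B for row-major matrices:
 *   A is m x k with leading dimension lda,
 *   B is k x n with leading dimension ldb,
 *   C is m x n with leading dimension ldc.
 * The three outer loops walk over cache-sized blocks; the three inner
 * loops multiply one block pair and stand in for a vectorized kernel.
 */
void dgemm_blocked(size_t m, size_t n, size_t k,
                   const double *A, size_t lda,
                   const double *B, size_t ldb,
                   double *C, size_t ldc)
{
    for (size_t jc = 0; jc < n; jc += NC) {
        size_t nb = min_size(NC, n - jc);        /* columns in this block of B and C */
        for (size_t pc = 0; pc < k; pc += KC) {
            size_t kb = min_size(KC, k - pc);    /* depth of this block */
            for (size_t ic = 0; ic < m; ic += MC) {
                size_t mb = min_size(MC, m - ic); /* rows in this block of A and C */
                for (size_t i = 0; i < mb; ++i) {
                    for (size_t p = 0; p < kb; ++p) {
                        double a = A[(ic + i) * lda + (pc + p)];
                        for (size_t j = 0; j < nb; ++j) {
                            C[(ic + i) * ldc + (jc + j)] +=
                                a * B[(pc + p) * ldb + (jc + j)];
                        }
                    }
                }
            }
        }
    }
}
```

Blocking does not change the arithmetic; it only reorders it so that each block of A and B is reused while it is still in cache. Calling dgemm_blocked(m, n, k, A, k, B, n, C, n) on contiguous row-major arrays accumulates the product into C, so C should be zeroed first if a plain product is wanted.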
Appears in Collections:
College of Natural Sciences (자연과학대학) > Dept. of Mathematical Sciences (수리과학부) > Theses (Master's Degree_수리과학부)

Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.
