Automatic Generation of Efficient Execution Plan for Convolutional Neural Networks

김민수

Automatic Generation of Efficient Execution Plan for Convolutional Neural Networks : 합성곱 신경망의 효율적인 실행을 위한 실행 계획 자동 생성

DC Field	Value	Language
dc.contributor.advisor	Bernhard Egger	-
dc.contributor.author	김민수	-
dc.date.accessioned	2020-10-13T02:58:17Z	-
dc.date.available	2020-10-13T02:58:17Z	-
dc.date.issued	2020	-
dc.identifier.other	000000161375	-
dc.identifier.uri	https://hdl.handle.net/10371/169354	-
dc.identifier.uri	http://dcollection.snu.ac.kr/common/orgView/000000161375	ko_KR
dc.description	Thesis (Master's) -- Seoul National University Graduate School: College of Engineering, Department of Computer Science and Engineering, 2020. 8. Bernhard Egger.	-
dc.description.abstract	Over the past years, a large number of architectures and accelerators for Deep Neural Networks (DNNs) have been proposed. While they exhibit common features, the number and arrangement of processing elements, the sizes and types of on-chip memory, and the possibilities for parallel execution vary significantly, especially in the embedded system domain. The number of off-chip memory accesses and the performance of a DNN on a given accelerator depend not only on the supported computational patterns and the available on-chip memory but also on the size and shape of each layer. Finding a computational pattern that minimizes off-chip memory accesses while maximizing performance is thus a tedious and error-prone task. This thesis presents e-PlaNNer, a compiler framework that generates an optimized execution plan for a given embedded accelerator and Convolutional Neural Network (CNN). For each layer, e-PlaNNer determines the performance-optimal configuration by considering the data movement, tiling, and work distribution. The generated execution plan is transformed to code, allowing for a fast development cycle with different CNNs and hardware accelerators. Evaluated with five neural networks under varying memory configurations and compared to previous works on the Nvidia Jetson TX2, e-PlaNNer achieves a 6x speedup and a 21.14% reduction of off-chip memory access volume on average. In addition, e-PlaNNer shows meaningful performance compared to well-known deep learning frameworks in terms of end-to-end execution.	-
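dc.description.abstract	Over the past several years, numerous architectures and accelerators for deep neural networks have been proposed. Common ways of executing deep neural networks were proposed along with them, but the specific arrangement of computations, the size and type of on-chip memory, and the method of parallel execution can vary widely, particularly in embedded systems. Moreover, the volume of off-chip memory accesses and the performance of a neural network can depend not only on the computational pattern and the on-chip memory size but also on the size and shape of each layer of the network. Manually finding a computational pattern that minimizes off-chip memory accesses while achieving maximum performance is therefore a considerably tedious task that can introduce many errors. e-PlaNNer, introduced in this thesis, is a compiler framework that generates an optimized execution plan for a given embedded hardware accelerator and convolutional neural network. For each layer of a deep neural network, e-PlaNNer determines a performance-optimized execution plan that takes data movement, tiling, and work distribution into account. In addition, by transforming the generated execution plan into code that can actually be compiled, it provides a fast development cycle across a variety of convolutional neural networks and hardware accelerators. Five convolutional neural network applications were evaluated on Nvidia's Jetson TX2 under various memory configurations and compared with previous work; e-PlaNNer showed, on average, a 6x performance improvement and a 21.14% reduction in the volume of off-chip memory data accesses. Furthermore, in terms of end-to-end execution of entire deep neural networks, e-PlaNNer also showed meaningful results in comparison with well-known deep learning frameworks.	-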
dc.description.tableofcontents	Chapter 1 Introduction; Chapter 2 Related Work; Chapter 3 Background (3.1 Convolutional Neural Networks, 3.2 DNN Accelerator, 3.3 Roofline Model); Chapter 4 Graph Level Processing (4.1 Graph Construction, 4.2 Schedule Caching); Chapter 5 Convolutional Layer Analysis (5.1 Loop Structure, 5.2 Loop Tiling, 5.3 Dataflow); Chapter 6 Execution Planning (6.1 Architecture Configurations, 6.2 Modeling Off-Chip Memory Accesses, 6.3 Modeling Performance, 6.4 Search Space Exploration); Chapter 7 Code Generation (7.1 Intermediate Representation, 7.2 Target Code Generation); Chapter 8 Evaluation (8.1 Experimental Setup, 8.2 Performance Results, 8.3 Comparison of Off-chip Memory Access, 8.4 Framework Results); Chapter 9 Discussion; Chapter 10 Conclusion; Bibliography; Abstract (in Korean)	-
dc.language.iso	eng	-
dc.publisher	Seoul National University Graduate School	-
dc.subject	Convolutional Neural Network	-
dc.subject	Compiler	-
dc.subject	Execution Plan	-
dc.subject	합성곱 신경망 (Convolutional Neural Network)	-
dc.subject	컴파일러 (Compiler)	-
dc.subject	실행 계획 (Execution Plan)	-
dc.subject.ddc	621.39	-
dc.title	Automatic Generation of Efficient Execution Plan for Convolutional Neural Networks	-
dc.title.alternative	합성곱 신경망의 효율적인 실행을 위한 실행 계획 자동 생성	-
dc.type	Thesis	-
dc.type	Dissertation	-
dc.contributor.AlternativeAuthor	Minsu Kim	-
dc.contributor.department	College of Engineering, Department of Computer Science and Engineering	-
dc.description.degree	Master	-
dc.date.awarded	2020-08	-
dc.identifier.uci	I804:11032-000000161375	-
dc.identifier.holdings	000000000043▲000000000048▲000000161375▲	-
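
The abstract above describes choosing, for each layer, a tiling that keeps off-chip memory traffic low while maximizing performance. The following is only a minimal sketch of that general idea, not e-PlaNNer's actual cost model or search strategy: it assumes a stride-1 convolution, 32-bit values, an output-stationary loop order (partial sums stay on chip while input-channel tiles stream in), a single on-chip buffer of `capacity_bytes`, and a roofline-style time estimate. All function and parameter names are hypothetical.

```python
from itertools import product

ELEM_BYTES = 4  # assumption: 32-bit values


def divisors(n):
    """All positive divisors of n, used as candidate tile sizes."""
    return [d for d in range(1, n + 1) if n % d == 0]


def tile_footprint(layer, tile):
    """On-chip bytes needed to hold one input, weight, and output tile."""
    *_, K = layer
    Tm, Tc, Th, Tw = tile
    in_tile = Tc * (Th + K - 1) * (Tw + K - 1)  # stride 1, halo included
    w_tile = Tm * Tc * K * K
    out_tile = Tm * Th * Tw
    return (in_tile + w_tile + out_tile) * ELEM_BYTES


def offchip_traffic(layer, tile):
    """Estimated off-chip bytes moved for one layer (assumed model).

    Input and weight tiles are reloaded for every tile combination;
    each output tile is accumulated on chip and written back once.
    """
    M, C, H, W, K = layer  # out channels, in channels, out height, out width, kernel
    Tm, Tc, Th, Tw = tile
    n_m, n_c, n_h, n_w = M // Tm, C // Tc, H // Th, W // Tw
    in_tile = Tc * (Th + K - 1) * (Tw + K - 1)
    w_tile = Tm * Tc * K * K
    out_tile = Tm * Th * Tw
    loads = n_m * n_c * n_h * n_w * (in_tile + w_tile)
    stores = n_m * n_h * n_w * out_tile
    return (loads + stores) * ELEM_BYTES


def plan_layer(layer, capacity_bytes, peak_flops, bandwidth):
    """Exhaustively search tile sizes that fit on chip.

    Returns the candidate with the lowest estimated time, breaking ties
    by lower off-chip traffic, or None if no tile fits.
    """
    M, C, H, W, K = layer
    flops = 2 * M * C * H * W * K * K  # one multiply and one add per MAC
    best = None
    for tile in product(divisors(M), divisors(C), divisors(H), divisors(W)):
        if tile_footprint(layer, tile) > capacity_bytes:
            continue
        traffic = offchip_traffic(layer, tile)
        # Roofline-style bound: limited by compute or by memory bandwidth.
        time_s = max(flops / peak_flops, traffic / bandwidth)
        if best is None or (time_s, traffic) < best[:2]:
            best = (time_s, traffic, tile)
    if best is None:
        return None
    return {"tile": best[2], "traffic_bytes": best[1], "time_s": best[0]}


if __name__ == "__main__":
    # Hypothetical toy layer: 4 output channels, 2 input channels, 4x4 output, 3x3 kernel.
    print(plan_layer((4, 2, 4, 4, 3), capacity_bytes=400, peak_flops=1e9, bandwidth=1e8))
```

Exhaustive enumeration is only practical for small layers like this toy one; the thesis devotes a separate section to search space exploration, whose details are not given in this record.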

Appears in Collections:

College of Engineering/Engineering Practice School (공과대학/대학원)
- Dept. of Computer Science and Engineering (컴퓨터공학부)
  - Theses (Master's Degree_컴퓨터공학부)

Files in This Item:

000000161375.pdf 2.79 MB
