Data reuse optimization for an FPGA implementation of Inception V4 Network (Inception V4 Network의 FPGA 구현을 위한 데이터 재사용 최적화)

Byeongki Song (송병기)

Inception V4 Network의 FPGA 구현을 위한 데이터 재사용 최적화 : Data reuse optimization for an FPGA implementation of Inception V4 Network

DC Field	Value	Language
dc.contributor.advisor	이혁재	-
dc.contributor.author	송병기	-
dc.date.accessioned	2021-11-30T02:21:47Z	-
dc.date.available	2021-11-30T02:21:47Z	-
dc.date.issued	2021-02	-
dc.identifier.other	000000164300	-
dc.identifier.uri	https://hdl.handle.net/10371/175285	-
dc.identifier.uri	https://dcollection.snu.ac.kr/common/orgView/000000164300	ko_KR
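dc.description	Thesis (Master's) -- Seoul National University Graduate School : College of Engineering, Department of Electrical and Computer Engineering, 2021. 2. 이혁재.	-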
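dc.description.abstract	Deep convolutional neural networks (CNNs) have recently shown high performance in computer vision, and research on accelerating image inference with FPGAs is active. Because of their depth, deep CNNs produce large amounts of weight parameters and intermediate feature map data. As a result, inference on an FPGA requires many off-chip memory accesses, which become a bottleneck for the accelerator's inference speed and energy efficiency. To address this problem, methods have been introduced that reuse data on-chip as much as possible once it has been fetched from off-chip memory. However, existing data reuse methods do not give optimal results for the Inception V4 network, which performs well on image classification. This thesis proposes Mixed convolution, a method that takes the branch structure of the Inception V4 network into account in order to reuse as much data on-chip as possible. Mixed convolution uses both Grouped convolution, which reuses the input feature map data of an Inception module, and Fused convolution, which reuses the intermediate feature map data generated within a branch, and so combines the advantages of the two methods. As a result, for the feature map data generated in the Inception modules, off-chip memory data transfer was reduced from 37 MB to 12 MB, 66.4% below the baseline, using 421 KB of additional on-chip buffer memory. In addition, to optimize on-chip buffer memory, applying full weight reuse to the Inception-C module further reduced off-chip memory data transfer to 11 MB, 68.6% below the baseline, using 218 KB of additional on-chip buffer memory.	-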
dc.description.abstract	Deep convolutional neural networks (DCNNs) have been widely used in computer vision and have achieved large performance gains. In addition, many FPGA-based accelerator designs have been proposed for CNN inference. DCNNs generate huge amounts of weight parameters and intermediate feature map data, which require many off-chip memory accesses during inference on an FPGA accelerator. This leads to performance degradation and poor energy efficiency. To reduce off-chip memory accesses, various data reuse methods have been proposed. However, previous data reuse methods show low reusability on the Inception V4 network, which achieves high performance on image classification. Considering the branch topology of the Inception module, the proposed data reuse method, named Mixed convolution, reuses feature map data in on-chip memory. Mixed convolution takes advantage of both Grouped convolution and Fused convolution, which reuse the input feature map data of an Inception module and the intermediate feature map data of a branch, respectively. As a result, Mixed convolution reduces the off-chip feature map data transfer of the Inception modules by 66.4%, from 37 MB to 12 MB, using an extra 421 KB of on-chip buffer memory. In addition, to optimize the on-chip buffer memory size required to minimize off-chip data transfer, the Full weight reuse dataflow is applied to the Inception-C module, which reduces the off-chip feature map data transfer of the Inception modules by 68.6%, from 37 MB to 11 MB, using an extra 218 KB of on-chip buffer memory.	-
dc.description.tableofcontents	Chapter 1 Introduction; 1.1 Research background and content; 1.2 Thesis organization; Chapter 2 Related work; 2.1 Inception Network; 2.2 FPGA accelerator research; 2.2.1 Research on off-chip memory access; Chapter 3 Data reuse methods for the Inception v4 Network; 3.1 Grouped Convolution; 3.1.1 Operation of Grouped Convolution; 3.1.2 On-chip buffer of Grouped Convolution; 3.2 Fused Convolution; 3.2.1 Operation of Fused Convolution; 3.2.2 Fused buffer size of Fused Convolution; 3.3 Mixed Convolution; 3.3.1 Operation of Mixed Convolution; 3.3.2 On-chip buffer size of Mixed Convolution; 3.4 Mixed Convolution with full weight reuse; Chapter 4 Experimental results and analysis; 4.1 FPGA hardware accelerator system; 4.2 Internal structure of the accelerator module; 4.3 Implementation of a Data Controller supporting Mixed convolution; Chapter 5 Experimental results and analysis; 5.1 Comparison of off-chip memory data transfer size; 5.2 Comparison of on-chip buffer size; 5.3 Analysis of full weight reuse and on-chip buffer size; 5.4 Comparative analysis of FPGA resource usage; Chapter 6 Conclusion; References; Abstract	-
dc.format.extent	v. 52	-
dc.language.iso	kor	-
dc.publisher	Seoul National University Graduate School	-
dc.subject	Data reuse	-
dc.subject	Off-chip memory access data size	-
dc.subject	Off-chip memory data transfer size	-
dc.subject	CNN accelerator	-
dc.subject	Inception V4 network	-
dc.subject.ddc	621.3	-
dc.title	Inception V4 Network의 FPGA 구현을 위한 데이터 재사용 최적화	-
dc.title.alternative	Data reuse optimization for an FPGA implementation of Inception V4 Network	-
dc.type	Thesis	-
dc.type	Dissertation	-
dc.contributor.AlternativeAuthor	BYEONGKI SONG	-
dc.contributor.department	College of Engineering, Department of Electrical and Computer Engineering	-
dc.description.degree	Master	-
dc.date.awarded	2021-02	-
dc.identifier.uci	I804:11032-000000164300	-
dc.identifier.holdings	000000000044▲000000000050▲000000164300▲	-
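
Illustrative note on the abstract: the three reuse schemes it names can be compared with a toy traffic count. The sketch below is not the thesis's model. It assumes a hypothetical Inception-style module in which every branch starts from the same module input, counts only feature map bytes (no weights, tiling, or halo overhead), and uses made-up sizes. The names `Branch` and `feature_map_traffic_kb` are invented for this example. "Grouped" is modeled as reading the module input once for all branches, "Fused" as keeping intermediate feature maps of a branch on-chip, and "Mixed" as both together.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Branch:
    # Output feature map size (KB) of each convolution layer in the branch, in order.
    layer_out_kb: List[int]


def feature_map_traffic_kb(input_kb: int, branches: List[Branch],
                           share_input: bool = False,
                           fuse_branch: bool = False) -> int:
    """Off-chip feature map traffic (KB) for one module under a toy model."""
    # Module input: fetched once if shared across branches, else once per branch.
    total = input_kb if share_input else input_kb * len(branches)
    for branch in branches:
        *intermediate, final = branch.layer_out_kb
        if not fuse_branch:
            # Each intermediate map is written off-chip and read back once.
            total += 2 * sum(intermediate)
        # The final output of every branch is always written off-chip.
        total += final
    return total


# Hypothetical sizes, chosen only to make the arithmetic easy to follow.
module_input_kb = 100
module = [Branch([50]), Branch([40, 60]), Branch([30, 40, 50])]

baseline = feature_map_traffic_kb(module_input_kb, module)
grouped = feature_map_traffic_kb(module_input_kb, module, share_input=True)
fused = feature_map_traffic_kb(module_input_kb, module, fuse_branch=True)
mixed = feature_map_traffic_kb(module_input_kb, module,
                               share_input=True, fuse_branch=True)

for name, kb in [("baseline", baseline), ("grouped", grouped),
                 ("fused", fused), ("mixed", mixed)]:
    print(f"{name:8s} {kb:4d} KB  ({100 * (1 - kb / baseline):.1f}% below baseline)")
```

In this toy setting the two savings are independent, so combining them removes both the repeated input reads and the intermediate round trips. The on-chip buffer cost that the abstract reports (421 KB and 218 KB) is not modeled here.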

Appears in Collections:

College of Engineering/Engineering Practice School (공과대학/대학원)
- Dept. of Electrical and Computer Engineering (전기·정보공학부)
  - Theses (Master's Degree_전기·정보공학부)

Files in This Item:

000000164300.pdf 1.24 MB
