Memory Management and Optimization for Deep Learning Model Inference on Mobile GPU

강수연

모바일 GPU에서의 딥 러닝 모델 추론을 위한 메모리 관리 및 최적화 : Memory Management and Optimization for Deep Learning Model Inference on Mobile GPU

Authors: 강수연

Advisor: 이재진

Issue Date: 2022

Publisher: Seoul National University Graduate School

Keywords: Mobile GPU ; Deep learning ; Memory management techniques ; Convolution optimization ; OpenCL

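Description: Thesis (Master's) -- Seoul National University Graduate School : College of Engineering, Department of Computer Science and Engineering, 2022. 8. 이재진.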

Abstract: Accelerating image processing on mobile devices is becoming increasingly important. In addition to a CPU and a GPU, mobile devices are equipped with ISP (image signal processing) and DSP (digital signal processing) units, and each processor handles the image processing algorithms assigned to it. However, as image processing pipelines on mobile devices grow more complex, building an ISP or DSP demands considerable labor and cost. Even so, image processing techniques that use the mobile GPU as a replacement for the ISP remain underdeveloped. PyNET, a deep learning model that replaces the ISP, exists, but a model developed with a deep learning framework cannot run inference on a mobile device because of insufficient memory.
In this work, we optimize PyNET so that it can run inference on a mobile GPU. To that end, we present several memory management techniques and OpenCL-based convolution optimization techniques that address the memory shortage and latency problems that arise during deep learning model inference. We validate the optimizations by running, on a real mobile device, a model that cannot be inferred on mobile with a deep learning framework.

Language: kor

URI: https://hdl.handle.net/10371/187793

https://dcollection.snu.ac.kr/common/orgView/000000172846

Files in This Item:

000000172846.pdf 1.58 MB

Appears in Collections:

College of Engineering/Engineering Practice School (공과대학/대학원)
- Dept. of Computer Science and Engineering (컴퓨터공학부)
  - Theses (Master's Degree_컴퓨터공학부)
