Auturi: An AUTomatic and Unified Framework for Searching Parallelization Configurations in Deep Reinforcement Learning

정채현

Auturi: An AUTomatic and Unified Framework for Searching Parallelization Configurations in Deep Reinforcement Learning : Development of a Framework for Optimizing Parallelization of Data Collection in Deep Reinforcement Learning

Authors: 정채현

Advisor: 전병곤

Issue Date: 2023

Publisher: Seoul National University Graduate School

Keywords: reinforcement learning ; parallelization system

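Description: Thesis (Master's) -- Seoul National University Graduate School : College of Engineering, Department of Computer Science and Engineering, 2023. 2. 전병곤.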

Abstract: Deep reinforcement learning (DRL) has been used effectively in a wide range of challenging tasks. Despite its growing popularity, RL practitioners frequently experience excessively long training times. One major cause of this inefficiency is that an RL agent must collect its own training data during training iterations. To address this bottleneck, many researchers have proposed strategies that parallelize individual components of DRL. However, the best parallelization strategy varies significantly with the task and the available hardware. Each strategy incurs different synchronization and data-copy overhead because of its distinctive structure, and the effect of that overhead differs by task. Choosing the best strategy is therefore a heavy burden for users.
In this paper, we propose Auturi, a system that automatically generates the optimal configuration, built on an efficient and unified code base for running hybrid parallelization. Auturi takes an online exploration approach, testing each strategy one by one. Our evaluation shows that Auturi chooses an optimal configuration that maximizes the speed of the experience collection loop.
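Deep reinforcement learning (DRL) has achieved great success in learning challenging tasks in fields such as robotics, games, and compilers. Despite its growing popularity, DRL training often takes an excessively long time, and one of the main bottlenecks is that RL must generate its own training dataset during the training process.

To address this bottleneck, many researchers have proposed various strategies, such as parallelizing the components of DRL, including the environment and the policy network. However, the optimal parallelization strategy depends on the given hardware environment and the task at hand. This is because the synchronization and data-copy overhead arising from each strategy's distinctive structure differs greatly across strategies, and the impact of that overhead differs for each task.

This thesis introduces Auturi, a system that automatically finds the optimal parallelization strategy. Built on code that runs hybrid parallelization strategies efficiently and flexibly, Auturi takes an online search approach that tests each strategy one by one.
Through experiments on widely used DRL benchmarks, the thesis shows that Auturi can effectively reduce DRL training time.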
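
The online, one-by-one search described in the abstract can be illustrated with a minimal sketch. This is not Auturi's actual code or API: the names `ParallelConfig`, `measure_throughput`, and `search_best_config`, and the two configuration fields, are all hypothetical and chosen only for illustration. The sketch assumes the user supplies a `collect_fn` that runs the experience collection loop under a given configuration; each candidate is timed once and the one with the highest throughput is kept.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable, Tuple


@dataclass(frozen=True)
class ParallelConfig:
    """Hypothetical parallelization configuration (not Auturi's real schema)."""
    num_env_workers: int      # assumed knob: parallel environment workers
    num_policy_workers: int   # assumed knob: parallel policy-inference workers


def measure_throughput(
    collect_fn: Callable[[ParallelConfig, int], None],
    config: ParallelConfig,
    num_steps: int,
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Run the collection loop once under `config`; return steps per second."""
    start = clock()
    collect_fn(config, num_steps)
    elapsed = clock() - start
    return num_steps / elapsed if elapsed > 0 else float("inf")


def search_best_config(
    collect_fn: Callable[[ParallelConfig, int], None],
    candidates: Iterable[ParallelConfig],
    num_steps: int = 1000,
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[ParallelConfig, float]:
    """Test each candidate one by one and keep the fastest."""
    best_config, best_tput = None, float("-inf")
    for config in candidates:
        tput = measure_throughput(collect_fn, config, num_steps, clock)
        if tput > best_tput:
            best_config, best_tput = config, tput
    if best_config is None:
        raise ValueError("no candidate configurations were given")
    return best_config, best_tput
```

In use, `collect_fn` would wrap whatever collection loop is being tuned, and `candidates` would enumerate the configurations worth trying on the available hardware. A single timed run per candidate is the simplest possible measurement. The abstract does not say how Auturi measures or compares candidates, so repeated runs or early stopping are left out here.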

Language: kor

URI: https://hdl.handle.net/10371/193326

https://dcollection.snu.ac.kr/common/orgView/000000176662

Files in This Item:

000000176662.pdf 2.29 MB

Appears in Collections:

College of Engineering/Engineering Practice School
- Dept. of Computer Science and Engineering
  - Theses (Master's Degree_Computer Science and Engineering)
