A Unified Approach on Bayesian Optimization of Deep Neural Network : 심층 신경망의 베이지안 최적화에 대한 통합적 접근법
- Authors
- Advisor
- 이원종
- Major
- Department of Transdisciplinary Studies, Graduate School of Convergence Science and Technology
- Issue Date
- 2017-08
- Publisher
- Graduate School of Convergence Science and Technology, Seoul National University
- Keywords
- Bayesian Optimization ; Hyper-parameter Optimization ; Deep Neural Network ; Multi-armed Bandit ; Acquisition Function
- Description
- Thesis (M.S.) -- Department of Transdisciplinary Studies, Graduate School of Convergence Science and Technology, Seoul National University, 2017. 8. Advisor: 이원종.
- Abstract
- Due to the success of deep neural networks (DNNs) in recent years, DNNs have begun to be applied in many research and application fields. At the same time, optimizing the hyper-parameters of DNNs has become an important issue, because the performance of a network is very sensitive to hyper-parameter settings such as the size of the convolutional filter or the number of neurons in the fully connected layer. Active research is therefore being conducted on hyper-parameter optimization (HPO), and the Bayesian optimization method has shown promising results. However, Bayesian optimization has some limitations. First, various options still need to be set within the optimization process itself, namely the models and the acquisition functions. Although the outcome of the optimization varies greatly depending on the model and the acquisition function, we do not know in advance which option is best for a given dataset; currently, practitioners simply choose an option that seems appropriate and hope for the best. Second, acquisition functions do not reflect time-cost, which leads to time-inefficient selections. If we could predict beforehand how much time a candidate will take, we could make a more cost-efficient choice.
To solve these two problems, we propose the following methods. First, we suggest a unifying approach that adapts to the problem and dataset within a single optimization process: it first explores which models and acquisition functions perform well, and then converges to the well-performing options to exploit them. Second, we devise an acquisition function that takes time-cost into account. To do so, we must be able to predict the training time of the DNN, so we also propose a fast and accurate method for predicting it. Based on these two methods, we tried to improve Bayesian optimization, and we conducted several experiments to confirm the effect of the unifying approach and the time-efficient acquisition functions. Through these experiments, we achieved improvements in both time consumption and performance.
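The time-cost-aware acquisition described above can be sketched in the spirit of the well-known "expected improvement per second" idea: divide the standard expected improvement (EI) by the predicted training time of each candidate, so that cheaper candidates with comparable expected gain are preferred. This is a minimal illustrative sketch assuming a Gaussian surrogate posterior (mean `mu`, standard deviation `sigma`) and a minimization objective; the function names, the `xi` exploration parameter, and the exact formulation in the thesis are assumptions, not the author's verbatim method.

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, best, xi=0.01):
    """Standard EI under a Gaussian posterior, minimization convention."""
    sigma = np.maximum(sigma, 1e-9)          # avoid division by zero
    z = (best - mu - xi) / sigma
    return (best - mu - xi) * norm.cdf(z) + sigma * norm.pdf(z)

def ei_per_second(mu, sigma, best, predicted_time):
    """Cost-aware score: EI divided by the predicted training time.

    A separate regressor (as proposed in the thesis) would supply
    `predicted_time` for each hyper-parameter candidate.
    """
    return expected_improvement(mu, sigma, best) / np.maximum(predicted_time, 1e-9)
```

With this score, two candidates with identical posterior mean and variance are ranked by cost: the one predicted to train faster gets the higher acquisition value.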
- Language
- English