A Unified Approach on Bayesian Optimization of Deep Neural Network : 심층 신경망의 베이지안 최적화에 대한 통합적 접근법
- Authors
- Advisor
- 이원종
- Major
- Department of Transdisciplinary Studies, Graduate School of Convergence Science and Technology
- Issue Date
- 2017-08
- Publisher
- Graduate School of Convergence Science and Technology, Seoul National University
- Keywords
- Bayesian Optimization ; Hyper-parameter Optimization ; Deep Neural Network ; Multi-armed Bandit ; Acquisition Function
- Description
- Thesis (M.S.) -- Department of Transdisciplinary Studies, Graduate School of Convergence Science and Technology, Seoul National University, 2017. 8. Advisor: 이원종.
- Abstract
- Due to the success of deep neural networks (DNNs) in recent years, DNNs have begun to be applied in many research and application fields. At the same time, optimizing the hyper-parameters of DNNs has become an important issue, because the performance of a network is very sensitive to hyper-parameter settings such as the size of the convolutional filter or the number of neurons in the fully connected layer. Active research is therefore being conducted on hyper-parameter optimization (HPO), and the Bayesian optimization method has shown promising results. However, Bayesian optimization has some limitations. First, various options still need to be set within the optimization process itself, namely the models and the acquisition functions. Although the outcome of the optimization varies greatly depending on the model and the acquisition function, we do not know in advance which option is best for a given dataset; currently, practitioners simply choose an option that seems appropriate and hope for the best. Second, acquisition functions do not reflect time-cost, which leads to time-inefficient selections. If we could predict beforehand how much time a candidate will take, we could make a more cost-efficient choice.
To solve these two problems, we propose the following methods. First, we suggest a unifying approach that adapts to the problem and dataset within a single optimization process: it first explores which models and acquisition functions perform well, and then converges to the well-performing options to exploit them. Second, we devise an acquisition function that takes time-cost into account. To do so, we must be able to predict the training time of the DNN, so we also propose a fast and accurate method for predicting it. Based on these two methods, we tried to improve Bayesian optimization, and we conducted several experiments to confirm the effect of the unifying approach and the time-efficient acquisition functions. Through these experiments, we achieved improvements in both time consumption and performance.
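The time-cost-aware acquisition described above can be sketched in the spirit of the well-known "expected improvement per second" idea: divide the standard expected improvement (EI) by the predicted training time of each candidate, so that cheaper candidates with comparable expected gain are preferred. This is a minimal illustrative sketch assuming a Gaussian surrogate posterior (mean `mu`, standard deviation `sigma`) and a minimization objective; the function names, the `xi` exploration parameter, and the exact formulation in the thesis are assumptions, not the author's verbatim method.

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, best, xi=0.01):
    """Standard EI under a Gaussian posterior, minimization convention."""
    sigma = np.maximum(sigma, 1e-9)          # avoid division by zero
    z = (best - mu - xi) / sigma
    return (best - mu - xi) * norm.cdf(z) + sigma * norm.pdf(z)

def ei_per_second(mu, sigma, best, predicted_time):
    """Cost-aware score: EI divided by the predicted training time.

    A separate regressor (as proposed in the thesis) would supply
    `predicted_time` for each hyper-parameter candidate.
    """
    return expected_improvement(mu, sigma, best) / np.maximum(predicted_time, 1e-9)
```

With this score, two candidates with identical posterior mean and variance are ranked by cost: the one predicted to train faster gets the higher acquisition value.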
- Language
- English