Deep Reinforcement Learning with LSTM-based Exploration Bonus
- College of Engineering, Dept. of Computer Science and Engineering
- Seoul National University Graduate School
- Deep Learning; Deep Reinforcement Learning; Long Short-Term Memory Networks; Exploration and Exploitation Trade-Off
- Thesis (Master's) -- Seoul National University Graduate School: Dept. of Computer Science and Engineering, Feb. 2017. Advisor: 유석인.
- Deep learning is the dominant method in the recent machine learning community. It outperforms traditional methods on a variety of tasks, such as image classification, object recognition, speech recognition, and natural language processing. However, most deep learning algorithms focus on supervised learning. Supervised learning assumes a static environment and is therefore ill-suited to dynamic environments.
To solve this problem, a method combining reinforcement learning with deep learning, called deep reinforcement learning, has been proposed. Deep reinforcement learning is composed of two parts: the first extracts features using deep learning, and the second learns a proper action for the agent through reinforcement learning, by trial and error.
However, reinforcement learning suffers from the exploration-exploitation trade-off: it is hard to find the optimal ratio of exploration to exploitation. To address this, we propose a novel deep reinforcement learning algorithm with an LSTM-based exploration bonus. The method uses a Long Short-Term Memory (LSTM) network as a predictor and derives an exploration bonus from its prediction error. The bonus guides the agent toward more daring actions, so the agent can find an optimal solution in a shorter time.
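The mechanism described above can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the `LSTMPredictor` class, `augmented_reward` function, and `beta` coefficient are assumed names, and the randomly initialised weights stand in for a predictor that would be trained online to reduce its own prediction error.

```python
import numpy as np

rng = np.random.default_rng(0)

class LSTMPredictor:
    """Minimal single-cell LSTM that predicts the next state from the current one.

    Weights are random here for illustration; in the proposed method the
    predictor would be trained as the agent interacts with the environment.
    """
    def __init__(self, state_dim, hidden_dim):
        d, h = state_dim, hidden_dim
        # One weight matrix and bias per gate: input, forget, output, candidate.
        self.W = {g: rng.normal(0, 0.1, (h, d + h)) for g in "ifoc"}
        self.b = {g: np.zeros(h) for g in "ifoc"}
        self.W_out = rng.normal(0, 0.1, (d, h))  # hidden state -> predicted next state
        self.h = np.zeros(h)                     # hidden state (carried across steps)
        self.c = np.zeros(h)                     # cell state

    def predict(self, state):
        """Advance the LSTM one step and return the predicted next state."""
        sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
        z = np.concatenate([state, self.h])
        i = sigmoid(self.W["i"] @ z + self.b["i"])  # input gate
        f = sigmoid(self.W["f"] @ z + self.b["f"])  # forget gate
        o = sigmoid(self.W["o"] @ z + self.b["o"])  # output gate
        g = np.tanh(self.W["c"] @ z + self.b["c"])  # candidate cell value
        self.c = f * self.c + i * g
        self.h = o * np.tanh(self.c)
        return self.W_out @ self.h

def augmented_reward(reward, pred_next_state, next_state, beta=0.1):
    """Extrinsic reward plus a bonus proportional to squared prediction error.

    Poorly predicted (surprising) transitions earn a larger bonus, which
    pushes the agent to explore them further.
    """
    bonus = beta * float(np.sum((pred_next_state - next_state) ** 2))
    return reward + bonus

# Usage: one transition with an illustrative random "environment".
predictor = LSTMPredictor(state_dim=4, hidden_dim=8)
state = rng.normal(size=4)
pred = predictor.predict(state)
next_state = rng.normal(size=4)  # placeholder for the true next observation
print(augmented_reward(1.0, pred, next_state))
```

Because the bonus shrinks as the predictor improves, familiar states stop attracting the agent over time, leaving exploration effort concentrated on novel parts of the state space.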
We evaluate our method on various Atari games. Experimental results show that our method outperforms plain deep reinforcement learning, demonstrating its effectiveness.