
Detailed Information

Learning and Generalization of Dynamic Movement Primitives by Hierarchical Deep Reinforcement Learning : 계층적 심층 강화학습을 활용한 동적 단위 동작의 학습 및 일반화

DC Field: Value

dc.contributor.advisor: 김현진
dc.contributor.author: 김원철
dc.date.accessioned: 2018-12-03T01:46:34Z
dc.date.available: 2018-12-03T01:46:34Z
dc.date.issued: 2018-08
dc.identifier.other: 000000153294
dc.identifier.uri: https://hdl.handle.net/10371/143961
dc.description: Thesis (Master's) -- Graduate School, Seoul National University: Department of Mechanical and Aerospace Engineering, College of Engineering, August 2018. Advisor: 김현진.
dc.description.abstract: This paper presents an approach to learning and generalizing robotic skills from a demonstration using deep reinforcement learning (deep RL). Dynamic Movement Primitives (DMPs) formulate a nonlinear differential equation and reproduce the movement observed in a demonstration. However, it is hard to generate new behaviors using DMPs alone. Thus, we use the DMP framework within deep RL as an initial setting for learning robotic skills. First, we build a network to represent this differential equation, and we learn and generalize the movements by optimizing the shape of the DMPs with respect to the rewards accumulated up to the end of each sequence of movement primitives. To do this, we adopt a deterministic actor-critic algorithm for deep RL and also apply a hierarchical strategy. Decomposing the task drastically reduces the robot's search space, which makes it possible to handle the sparse-reward problem of a complex task. To integrate DMPs with hierarchical deep RL, the differential equation is treated as the temporal abstraction of an option. The overall structure consists mainly of two controllers: a meta-controller and a sub-controller. The meta-controller learns a policy over intrinsic goals, and the sub-controller learns a policy over actions to accomplish the given goals. We demonstrate our approach on a 6 degree-of-freedom (DOF) arm with a 1-DOF gripper and show, on a pick-and-place task, that DMPs can be learned and generalized using deep RL.
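For reference, the nonlinear differential equation the abstract alludes to is, in the commonly used DMP formulation (transformation system plus canonical system), along the following lines; the specific gains, basis functions, and forcing-term parameterization used in the thesis are described in its Chapter 2 and are not reproduced in this record:

% Generic DMP transformation and canonical systems (commonly used form);
% the thesis' concrete gains and learned forcing term are not shown here.
\begin{align}
  \tau \dot{z} &= \alpha_z \bigl( \beta_z (g - y) - z \bigr) + f(x), \\
  \tau \dot{y} &= z, \\
  \tau \dot{x} &= -\alpha_x x, \qquad
  f(x) = \frac{\sum_i \psi_i(x)\, w_i}{\sum_i \psi_i(x)}\, x\, (g - y_0).
\end{align}

Here y is the system state (for example, a joint position), g the goal, x the phase variable of the canonical system, and the weights w_i shape the movement; learning and generalization in this setting amount to adapting the DMP's shape parameters (and goal) with deep RL.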
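The meta-controller / sub-controller decomposition described in the abstract follows the usual hierarchical-RL pattern: the meta-controller picks an intrinsic goal, and the sub-controller executes a temporally extended option toward it. The Python sketch below shows only that control flow; MetaController, SubController, and DummyArmEnv are hypothetical stubs standing in for the deterministic actor-critic networks, DMP rollouts, and 6-DOF arm simulation described in Chapters 3-4, so this is a minimal illustration of the loop structure rather than the thesis implementation.

import numpy as np

class MetaController:
    """Picks an intrinsic goal (e.g. 'reach', 'grasp', 'place') from the current state."""
    def __init__(self, goals):
        self.goals = goals

    def select_goal(self, state):
        return np.random.choice(self.goals)           # stub: random goal choice

    def update(self, state, goal, extrinsic_return, next_state):
        pass                                          # stub: actor-critic update would go here

class SubController:
    """Outputs low-level actions (DMP shape parameters / joint commands) for a given goal."""
    def select_action(self, state, goal):
        return np.random.uniform(-1.0, 1.0, size=6)   # stub: 6-DOF joint command

    def update(self, state, goal, action, intrinsic_reward, next_state):
        pass                                          # stub

class DummyArmEnv:
    """Toy environment standing in for the 6-DOF arm with 1-DOF gripper."""
    def reset(self):
        return np.zeros(6)

    def step(self, action):
        next_state = np.clip(action, -1.0, 1.0)
        extrinsic_reward = 0.0                        # sparse task reward (pick-and-place success)
        done = np.random.rand() < 0.05                # stub: random episode termination
        return next_state, extrinsic_reward, done

    def goal_reached(self, state, goal):
        return np.random.rand() < 0.1                 # stub: intrinsic-goal check

def run_episode(env, meta, sub, max_option_steps=50):
    """Hierarchical loop: the meta-controller sets goals, the sub-controller pursues them."""
    state, done, total_return = env.reset(), False, 0.0
    while not done:
        goal = meta.select_goal(state)
        option_return, start_state = 0.0, state
        for _ in range(max_option_steps):             # one option = temporally extended rollout
            action = sub.select_action(state, goal)
            next_state, r_ext, done = env.step(action)
            r_int = 1.0 if env.goal_reached(next_state, goal) else 0.0
            sub.update(state, goal, action, r_int, next_state)
            option_return += r_ext
            state = next_state
            if r_int > 0.0 or done:
                break
        meta.update(start_state, goal, option_return, state)
        total_return += option_return
    return total_return

if __name__ == "__main__":
    env = DummyArmEnv()
    meta = MetaController(goals=["reach", "grasp", "place"])
    sub = SubController()
    print("episode return:", run_episode(env, meta, sub))

The key design point mirrored here is that the sub-controller is trained on intrinsic rewards for reaching the given goal, while the meta-controller is trained on the sparse extrinsic return of the whole option, which is what eases the sparse-reward problem of the full pick-and-place task.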
dc.description.tableofcontents:
1 Introduction
1.1 Literature review
1.2 Thesis contribution
1.3 Thesis outline
2 Background Knowledge
2.1 Dynamic Movement Primitives
2.2 Reinforcement Learning
3 Model-free Reinforcement Learning with Dynamic Movement Primitives
3.1 Object detection
3.2 Pre-training DMPs for networks of deep RL
3.3 Reinforcement Learning to learn DMPs
4 Hierarchical Deep Reinforcement Learning
5 Evaluation Results
5.1 Learning and improvement
5.2 Learning and Generalization
5.3 Pick and Place Manipulation Task
6 Conclusion
dc.format: application/pdf
dc.format.medium: application/pdf
dc.language.iso: en
dc.publisher: 서울대학교 대학원 (Graduate School, Seoul National University)
dc.subject.ddc: 621
dc.title: Learning and Generalization of Dynamic Movement Primitives by Hierarchical Deep Reinforcement Learning
dc.title.alternative: 계층적 심층 강화학습을 활용한 동적 단위 동작의 학습 및 일반화 (Learning and Generalization of Dynamic Movement Primitives by Hierarchical Deep Reinforcement Learning)
dc.type: Thesis
dc.contributor.AlternativeAuthor: Wonchul Kim
dc.description.degree: Master
dc.contributor.affiliation: 공과대학 기계항공공학부 (Department of Mechanical and Aerospace Engineering, College of Engineering)
dc.date.awarded: 2018-08