Data-Efficient Learning for Robot Manipulators using Residual Dynamics

Kim, Byungheon

Data-Efficient Learning for Robot Manipulators using Residual Dynamics : 동역학 모델을 이용한 데이터 효율적 로봇 머니퓰레이터 학습

DC Field	Value	Language
dc.contributor.advisor	Park, Frank Chongwoo (박종우)	-
dc.contributor.author	Kim, Byungheon (김병헌)	-
dc.date.accessioned	2019-05-07T03:07:06Z	-
dc.date.available	2019-05-07T03:07:06Z	-
dc.date.issued	2019-02	-
dc.identifier.other	000000154839	-
dc.identifier.uri	https://hdl.handle.net/10371/150631	-
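dc.description	Thesis (Master's). Seoul National University Graduate School, College of Engineering, Department of Mechanical and Aerospace Engineering, February 2019. Advisor: 박종우.	-
dc.description.abstract	Model-based reinforcement learning has drawn considerable attention for its data efficiency, and the PILCO algorithm has been applied successfully to real robots by modeling the probabilistic dynamics with Gaussian processes. We go one step beyond PILCO and propose a method that applies when part of the system dynamics is assumed to be known, for example when an unmodeled object is attached to the end of a robot manipulator whose dynamics are known. The residual dynamics, obtained by subtracting the robot dynamics from the actual dynamics, are modeled with a probabilistic Gaussian process. The robot dynamics are expressed using Lie groups so that the dynamics derivatives needed for control optimization can be obtained exactly and in analytic form. We applied the method to a pendulum swing-up task with a KUKA LWR iiwa 14 R820 robot manipulator, and the experimental results confirm that our method achieves better data efficiency than PILCO.	-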
dc.description.abstract	In this thesis, we leverage Lie group robot dynamics with the probabilistic inference for learning control (PILCO) algorithm to develop a more effective model-based reinforcement learning robot control algorithm. Our method is particularly effective for robot systems in which only a part of the dynamics of the system is known, e.g., an object with unknown mass and inertia grasped by a robot with known dynamics. Using Gaussian processes (GP) for the probabilistic dynamic model, our method learns the residual dynamics, i.e., the difference between the known robot dynamics and the actual dynamics. The known part of the robot dynamics is expressed using Lie group methods and provides exact, closed-form analytic derivatives of the dynamics. Our algorithm is validated through numerical experiments for a pendulum swing-up task with a KUKA LWR iiwa 14 R820 robot, with results benchmarked against standard implementations of PILCO.	-
dc.description.tableofcontents	Abstract (English); 1. Introduction 1; 2. Preliminaries 5; 2.1 Probabilistic Inference for Learning Control (PILCO) 5; 2.1.1 Gaussian Processes 6; 2.1.2 Dynamics Model Learning 7; 2.1.3 Policy Optimization 8; 2.2 Lie Group Dynamics 10; 2.2.1 The Rotation Group 10; 2.2.2 The Special Euclidean Group 11; 2.2.3 Spatial Velocities and Spatial Forces 12; 2.2.4 Adjoint Mapping 13; 2.2.5 Forward Kinematics 14; 2.2.6 Single Rigid Body Dynamics 16; 2.2.7 Recursive Inverse Dynamics and Its Derivatives 17; 3. PILCO with Robot Dynamics 19; 3.1 Robot Dynamics 19; 3.1.1 Forward Dynamics and Its Derivatives 19; 3.1.2 Known Dynamics 20; 3.2 Policy Evaluation and Improvement 21; 3.2.1 Long-Term Predictions for Policy Evaluation 22; 3.2.2 Analytic Gradient for Policy Improvement 25; 3.3 Policy and Cost Function 27; 3.3.1 Policy 27; 3.3.2 Cost Function 30; 3.4 Algorithm 30; 4. Numerical Experiments 32; 4.1 Problem Formulation 32; 4.2 Algorithm 34; 4.3 Results 37; 4.3.1 Data Efficiency 37; 4.3.2 Computational Efficiency 39; 5. Conclusion 45; A. Appendix 47; A.1 Recursive Inverse Dynamics and Its Derivatives 47; Bibliography 51; Abstract (Korean) 54	-
dc.language.iso	eng	-
dc.publisher	Seoul National University Graduate School	-
dc.subject.ddc	621	-
dc.title	Data-Efficient Learning for Robot Manipulators using Residual Dynamics	-
dc.title.alternative	동역학 모델을 이용한 데이터 효율적 로봇 머니퓰레이터 학습 [Data-Efficient Robot Manipulator Learning Using a Dynamics Model]	-
dc.type	Thesis	-
dc.type	Dissertation	-
dc.contributor.AlternativeAuthor	Kim, Byungheon	-
dc.description.degree	Master	-
dc.contributor.affiliation	College of Engineering, Department of Mechanical and Aerospace Engineering	-
dc.date.awarded	2019-02	-
dc.contributor.major	Robotics	-
dc.identifier.uci	I804:11032-000000154839	-
dc.identifier.holdings	000000000026▲000000000039▲000000154839▲	-
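
The residual-dynamics idea described in the abstract can be illustrated with a small, dependency-free sketch. It is not the thesis implementation: it uses a one-joint pendulum instead of Lie group manipulator dynamics, omits PILCO's policy optimization entirely, and every parameter value, the unmodeled damping and gravity terms, and all function names are assumptions chosen for illustration. A Gaussian process with a squared-exponential kernel is fitted to the difference between the "actual" acceleration and the known nominal model, and the prediction is the nominal model plus the GP mean.

```python
import math

G, M, L = 9.81, 1.0, 1.0  # hypothetical nominal pendulum parameters

def nominal_accel(theta, omega, u):
    """Known part: frictionless pendulum with nominal mass and length."""
    return (u - M * G * L * math.sin(theta)) / (M * L ** 2)

def true_accel(theta, omega, u):
    """Stand-in for the real system: nominal model plus unmodeled terms (assumed)."""
    return nominal_accel(theta, omega, u) - 0.3 * omega - 0.5 * math.sin(theta)

def rbf(a, b, length=1.0, var=1.0):
    """Squared-exponential kernel between two (theta, omega, u) tuples."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return var * math.exp(-0.5 * d2 / length ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    Lo = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(Lo[i][k] * Lo[j][k] for k in range(j))
            Lo[i][j] = math.sqrt(s) if i == j else s / Lo[j][j]
    return Lo

def chol_solve(Lo, b):
    """Solve (Lo Lo^T) x = b by forward then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(Lo[i][k] * y[k] for k in range(i))) / Lo[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(Lo[k][i] * x[k] for k in range(i + 1, n))) / Lo[i][i]
    return x

def fit_gp(X, y, noise=1e-4):
    """Return the GP posterior mean function for training pairs (X, y)."""
    n = len(X)
    K = [[rbf(X[i], X[j]) + (noise if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    alpha = chol_solve(cholesky(K), y)
    return lambda x: sum(a * rbf(x, xi) for a, xi in zip(alpha, X))

# Training data: the GP target is the residual, actual minus known dynamics.
thetas = [-math.pi + i * math.pi / 3 for i in range(7)]
X = [(t, w, u) for t in thetas for w in (-2.0, -1.0, 0.0, 1.0, 2.0)
     for u in (-1.0, 0.0, 1.0)]
y = [true_accel(*x) - nominal_accel(*x) for x in X]
residual = fit_gp(X, y)

def predicted_accel(theta, omega, u):
    """Known dynamics plus the learned residual."""
    return nominal_accel(theta, omega, u) + residual((theta, omega, u))

# Compare mean absolute error on held-out points inside the training range.
tests = [(t, w, u) for t in (-2.5, -0.5, 0.5, 2.0)
         for w in (-1.5, 0.5, 1.5) for u in (-0.5, 0.5)]
nominal_err = sum(abs(nominal_accel(*x) - true_accel(*x)) for x in tests) / len(tests)
residual_err = sum(abs(predicted_accel(*x) - true_accel(*x)) for x in tests) / len(tests)
print(f"nominal model error: {nominal_err:.3f}")
print(f"nominal + GP error:  {residual_err:.3f}")
```

Because the GP only has to explain the small, smooth residual rather than the full dynamics, a coarse grid of samples is enough in this toy setting. A full treatment would also propagate the GP's predictive variance, which this sketch leaves out.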

Appears in Collections:

College of Engineering/Engineering Practice School (공과대학/대학원)
- Dept. of Mechanical Aerospace Engineering (기계항공공학부)
  - Theses (Master's Degree_기계항공공학부)

Files in This Item:

000000154839.pdf 3.09 MB
