Multi-Robot Environmental Learning and Target Tracking with Distributed Gaussian Process

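Abstract: This dissertation addresses distributed environmental learning techniques for multi-robot systems and applies them to multi-target search and tracking problems. Multi-robot systems inherently gain reliability, efficiency, and scalability through the cooperative work of robots. Such systems are well suited to data-driven environmental learning, a technique that obtains comprehensive information about a region of interest by acquiring large amounts of sensor data from that region. However, for multiple robots to perform such tasks, distributed learning algorithms and collaboration algorithms are essential.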

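First, this dissertation focuses on distributed environmental learning algorithms. For an environmental process that is not known in advance, if multiple robots acquire noisy sensor data at their current locations, Gaussian process regression can build a spatial information map together with confidence intervals. However, because conventional Gaussian process algorithms operate in a centralized manner, it is difficult to process in real time the information coming from many sensors that are widely distributed in space. This dissertation proposes a multi-robot exploration algorithm that addresses the following challenges: i) distributed environmental map building using networked sensing platforms, ii) online learning using successive sensor measurements and fusion, suited to a multi-robot team, and iii) multi-robot active sensing and control for finding the highest peak of an unknown environmental process. The effectiveness of the algorithm is verified through simulations with multiple UAVs and a topographic survey experiment.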

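However, because this technique lacks path planning in the cooperative search process, it leads to myopic robot behavior. The next chapter therefore proposes an information-based path planning algorithm for multiple robots that operates in a fully distributed manner. The algorithm aims to address the following challenges: i) online distributed learning of an environmental map using multiple robots, ii) generation of safe and efficient exploration paths based on the learned map, and iii) maintaining scalability as the number of robots changes. To this end, the overall process is divided into two stages, environmental learning and path planning. A distributed algorithm is applied to each stage, and the two algorithms are coupled solely through communication between neighboring robots. The environmental learning algorithm uses a distributed Gaussian process, and the path planning algorithm uses a distributed Monte Carlo tree search. As a result, a scalable multi-robot system can be built without restriction on the number of robots. Simulations show the performance and scalability of the proposed system, and hardware experiments verify the usefulness of the algorithm in a more realistic scenario.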

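Finally, the results of the environmental learning described above can be applied to the problem of searching for and tracking multiple targets. Deploying multiple robots for target search and tracking has been addressed in many studies, but robot path planning when target locations are unknown or only partially known in advance remains a difficult problem. However, with recently emerging intelligent control techniques such as deep learning and reinforcement learning, agents can autonomously search for and track targets without prior knowledge, solely through interaction with the environment. Because these methods are data-driven, they can handle the exploration-exploitation tradeoff in path planning and can simplify the decision-making process through end-to-end learning, without the heuristic techniques typical of existing approaches. This dissertation proposes a technique that builds a target location map based on a distributed Gaussian process and applies it to multi-agent reinforcement learning. The distributed Gaussian process is used to generate a belief map over target locations, and a path planning method is devised that can efficiently search for and track targets whose locations are unknown. The performance of the trained policy and its transferability to new environments are evaluated in simulation and verified through hardware experiments with multiple UAVs.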

Keywords: multi-robot system, environmental learning, informative path planning, deep reinforcement learning

This dissertation investigates distributed environmental learning techniques for multi-robot systems and applies them to multi-target search and tracking problems. A multi-robot system offers reliability, efficiency, and scalability through the cooperative work of multiple robots. Such a system is well suited to data-driven environmental learning, a technique that builds comprehensive information about a region of interest from a large amount of sensor data acquired in that region. However, distributed learning algorithms and collaborative algorithms are required to enable multiple robots to perform such tasks.

The first part of this dissertation focuses on the distributed environmental learning algorithm. Given noisy sensor measurements obtained at the robots' locations and no prior knowledge of the environmental map, Gaussian process regression can efficiently construct a map that represents spatial information with confidence intervals. However, because the conventional Gaussian process algorithm operates in a centralized manner, it is difficult to process information from multiple distributed sensors in real time. This work proposes a multi-robot exploration algorithm that addresses the following challenges: i) construction of a distributed environmental map using networked sensing platforms, ii) online learning using successive measurements, suited to a multi-robot team, and iii) active sensing and control for multi-agent coordination to find the highest peak of an unknown environmental field. The effectiveness of the proposed algorithm is demonstrated through simulation and a topographic survey experiment with multiple unmanned aerial vehicles (UAVs).
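
The Gaussian process regression step described in this part can be illustrated with a minimal, centralized sketch. It is not the dissertation's distributed algorithm: the squared-exponential kernel, the hyperparameters, the one-dimensional `field` function, and the upper-confidence-bound rule for choosing the next sensing location are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between two sets of 1-D locations."""
    d = a[:, None] - b[None, :]
    return signal_var * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_query, noise_var=0.01):
    """Posterior mean and standard deviation of a zero-mean GP at x_query."""
    K = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_query)
    L = np.linalg.cholesky(K)  # K is positive definite thanks to the noise term
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = rbf_kernel(x_query, x_query).diagonal() - np.sum(v ** 2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))  # clamp tiny negative round-off

def field(x):
    """Hypothetical environmental field with its peak at x = 2."""
    return np.exp(-(x - 2.0) ** 2)

rng = np.random.default_rng(0)
x_train = np.array([0.0, 1.0, 3.0, 4.0])  # locations where robots took measurements
y_train = field(x_train) + 0.1 * rng.standard_normal(x_train.size)  # noisy readings
x_query = np.linspace(0.0, 4.0, 41)       # grid on which the map is built

mean, std = gp_posterior(x_train, y_train, x_query)
lower, upper = mean - 1.96 * std, mean + 1.96 * std  # approximate 95% interval

# Assumed active-sensing rule: visit the location with the highest
# upper confidence bound (high predicted value or high uncertainty).
next_x = x_query[np.argmax(mean + 2.0 * std)]
```

The interval is narrow near the measured locations and widest in between, which is what allows a confidence-aware rule to steer robots toward unexplored regions that may contain the peak.
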

However, this technique lacks path planning in the cooperative search process, which results in myopic behavior. The second part of this dissertation therefore proposes a multi-robot informative path planning algorithm that works in a fully distributed manner. This algorithm tackles the following challenges: i) online distributed learning of an environmental map using multiple robots, ii) generation of safe and efficient exploration paths based on the learned map, and iii) maintenance of scalability with respect to the number of robots. To this end, the entire process is divided into two stages: environmental learning and path planning. Distributed algorithms are applied to each stage and combined through communication between adjacent robots only. The learning algorithm uses a distributed Gaussian process, and the path planning algorithm uses a distributed Monte Carlo tree search. As a result, the system scales without a constraint on the number of robots. Simulation results demonstrate the performance and scalability of the proposed system, and a hardware experiment validates the utility of the proposed algorithm in a more realistic scenario.
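
To make the idea of combining results through communication between adjacent robots concrete, the sketch below shows average consensus, a standard building block for distributed estimation. The abstract does not state which fusion rule the dissertation uses, so the ring topology, the step size, and the names `consensus_step` and `ring_neighbors` are illustrative assumptions.

```python
def ring_neighbors(n):
    """Hypothetical communication graph: each robot talks to two adjacent robots."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}

def consensus_step(values, neighbors, step=0.3):
    """One synchronous update: each robot moves toward its neighbors' values.

    For this update to converge, step must be below 1 / (maximum degree),
    which is 0.5 on a ring.
    """
    return [
        v + step * sum(values[j] - v for j in neighbors[i])
        for i, v in enumerate(values)
    ]

# Each robot starts with its own local estimate (for example, a local GP prediction).
local_estimates = [1.0, 4.0, 2.0, 7.0, 6.0]
neighbors = ring_neighbors(len(local_estimates))

values = list(local_estimates)
for _ in range(100):
    values = consensus_step(values, neighbors)

# Every robot ends up holding the network-wide average, although no robot
# ever communicated with anyone other than its two neighbors.
network_average = sum(local_estimates) / len(local_estimates)
```

Because each robot only exchanges values with its neighbors, the per-robot communication cost does not grow with the team size, which is the scalability property the paragraph above refers to.
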

Finally, the results of environmental learning can be applied to search and tracking problems with multiple robots. Deploying multiple robots for target search and tracking has many practical applications; however, planning for unknown or partially known targets remains difficult. With recent advances in deep learning, intelligent control techniques such as reinforcement learning have enabled agents with little to no prior knowledge to learn autonomously from interaction with the environment. Such methods can address the exploration-exploitation tradeoff of planning for unknown targets in a data-driven manner, removing the reliance on heuristics that is typical of traditional approaches and streamlining the decision-making pipeline with end-to-end training. Accordingly, a multi-agent reinforcement learning technique with target map building based on a distributed Gaussian process is proposed. The distributed Gaussian process encodes a belief over the target locations, which is used to plan efficiently for unknown targets. The performance and transferability of the trained policy are evaluated through simulation, and the method is demonstrated through hardware experiments on a swarm of micro UAVs.

Language: eng

URI: https://hdl.handle.net/10371/187798

https://dcollection.snu.ac.kr/common/orgView/000000171868

Files in This Item:

000000171868.pdf 7.99 MB

Appears in Collections:

College of Engineering/Engineering Practice School (공과대학/대학원)
- Dept. of AeroSpace Engineering (항공우주공학과)
  - Theses (Ph.D. / Sc.D._항공우주공학과)
