Geometry-Aware Data Augmentation for Sequence-to-sequence Multi-Person 3D Pose Estimation

박성찬

DC Field	Value	Language
dc.contributor.advisor	이준석	-
dc.contributor.author	박성찬	-
dc.date.accessioned	2023-06-29T02:08:21Z	-
dc.date.available	2023-06-29T02:08:21Z	-
dc.date.issued	2023	-
dc.identifier.other	000000174713	-
dc.identifier.uri	https://hdl.handle.net/10371/193611	-
dc.identifier.uri	https://dcollection.snu.ac.kr/common/orgView/000000174713	ko_KR
dc.description	Thesis (Master's) -- Seoul National University Graduate School : Graduate School of Data Science, Department of Data Science, 2023. 2. 이준석.	-
dc.description.abstract	3D pose estimation is an invaluable task in computer vision with various practical applications. Recently, a Transformer-based sequence-to-sequence model, MixSTE [60], has been successfully applied to 3D single-person pose estimation by decoupling the 2D-to-3D modeling from pixel-level details. We propose a natural extension of this model from the single-person to the multi-person problem, adding a novel inter-personal attention for 2D-to-3D lifting. Because it naturally refers to neighboring frames, this design is highly robust in handling occlusions. However, 3D multi-person pose estimation is still challenging due to extreme data scarcity. Observing that our 2D-to-3D lifting approach is free from pixel-level details, we propose a novel geometry-aware data augmentation that allows us to generate a practically unlimited number of diverse training examples from existing single-person trajectories. Through extensive experiments on standard benchmarks, we verify that our model and data augmentation method achieve state-of-the-art results, not just in accuracy but also in smoothness. We also qualitatively demonstrate the effectiveness of our approach both on public benchmarks and on in-the-wild videos.	-
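dc.description.abstract	3D pose estimation based on computer vision is highly valuable because it can be applied in a wide range of fields. Recently, MixSTE [60], a Transformer-based sequence-to-sequence model, achieved successful results in single-person 3D pose estimation by estimating 3D poses from 2D poses (2D-to-3D lifting). This work extends that model to the multi-person 3D pose problem and adds a new inter-personal attention module that lets the people appearing in a scene reference one another's information. Because the architecture naturally refers to neighboring frames, the proposed model is robust to mutual occlusion. However, multi-person 3D pose estimation suffers from a chronic shortage of data. Since our method sets aside pixel-level details and deals with the relationship between 2D and 3D poses, it has the advantage that data can be augmented virtually without limit from the given data and camera parameters. Measured on the representative benchmark datasets used for evaluation and comparison in this field, the proposed model showed the best performance among existing models in both accuracy and the smoothness of its outputs. Furthermore, by performing well not only on test datasets but also on a variety of in-the-wild videos, the work also demonstrates its commercial value.	-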
dc.description.tableofcontents	Chapter 1. Introduction (p. 1); Chapter 2. Related Work (p. 5); Chapter 3. Problem Formulation and Notations (p. 8); Chapter 4. The POTR-3D Model (p. 9); Chapter 5. Geometry-Aware Data Augmentation (p. 16); Chapter 6. Experiments (p. 22); Chapter 7. Summary (p. 35); Bibliography (p. 37); Abstract in Korean (p. 44)	-
dc.format.extent	ii, 44	-
dc.language.iso	eng	-
dc.publisher	Seoul National University Graduate School	-
dc.subject	3D	-
dc.subject	Human Pose	-
dc.subject	Augmentation	-
dc.subject	Sequence	-
dc.subject	Transformer	-
dc.subject.ddc	005	-
dc.title	Geometry-Aware Data Augmentation for Sequence-to-sequence Multi-Person 3D Pose Estimation	-
dc.title.alternative	Geometric Data Augmentation Technique for Sequence-Based Multi-Person 3D Pose Estimation	-
dc.type	Thesis	-
dc.type	Dissertation	-
dc.contributor.AlternativeAuthor	Sungchan Park	-
dc.contributor.department	Graduate School of Data Science, Department of Data Science	-
dc.description.degree	Master's	-
dc.date.awarded	2023-02	-
dc.identifier.uci	I804:11032-000000174713	-
dc.identifier.holdings	000000000049▲000000000056▲000000174713▲	-
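
The abstract describes generating multi-person training examples from existing single-person trajectories using only geometry and camera parameters. The record itself does not spell out the procedure, so the following is only a minimal sketch of the general idea, not the thesis's actual implementation. The function names (`rotate_about_vertical`, `project`, `augment`), the pinhole intrinsics, the choice of y as the vertical axis, and the sampling ranges are all assumptions made for illustration.

```python
import numpy as np

def rotate_about_vertical(traj, angle):
    """Rotate a (T, J, 3) trajectory about the y axis (assumed vertical)."""
    c, s = np.cos(angle), np.sin(angle)
    R = np.array([[c, 0.0, s],
                  [0.0, 1.0, 0.0],
                  [-s, 0.0, c]])
    return traj @ R.T

def project(points, focal, center):
    """Pinhole projection of (..., 3) camera-space points to (..., 2) pixels."""
    xy = points[..., :2] / points[..., 2:3]
    return xy * focal + center

def augment(trajectories, rng, focal=1000.0, center=(500.0, 500.0)):
    """Place several single-person trajectories into one synthetic scene.

    Each trajectory is a (T, J, 3) array assumed to be centered near the
    origin. Every person gets a random rotation and a random ground-plane
    offset (hypothetical ranges), and the 2D input is obtained by
    re-projecting the transformed 3D joints.
    """
    people_3d = []
    for traj in trajectories:
        angle = rng.uniform(0.0, 2.0 * np.pi)
        # Hypothetical placement: up to 2 m sideways, 3 to 8 m from the camera.
        offset = np.array([rng.uniform(-2.0, 2.0), 0.0, rng.uniform(3.0, 8.0)])
        people_3d.append(rotate_about_vertical(traj, angle) + offset)
    poses_3d = np.stack(people_3d)                            # (N, T, J, 3)
    poses_2d = project(poses_3d, focal, np.asarray(center))   # (N, T, J, 2)
    return poses_2d, poses_3d

# Usage: three synthetic single-person clips of 4 frames and 17 joints each.
rng = np.random.default_rng(0)
clips = [rng.uniform(-0.5, 0.5, size=(4, 17, 3)) for _ in range(3)]
poses_2d, poses_3d = augment(clips, rng)
```

Because the 2D input is produced by projection rather than by rendering images, each new random draw yields a fresh (2D, 3D) training pair, which is the property the abstract attributes to working free from pixel-level details.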

Appears in Collections:

Graduate School of Data Science
- Theses (Master's Degree_Department of Data Science)

Files in This Item:

000000174713.pdf 1.04 MB
