Exploiting Synthetic Data for Human Shape Estimation on Multi-Frame Inputs

Abstract: This paper proposes human shape estimators, methods that recover accurate 3D human body shapes from 2D images. Reconstructing a 3D human mesh is challenging because images with 3D annotations, diverse actors, and varied backgrounds are scarce. Previous studies reconstruct the 3D pose and the 3D shape of a human mesh simultaneously, but they estimate shape less accurately than pose. In this study, we focus on 3D body shape, using the body representation of the Skinned Multi-Person Linear (SMPL) model and large-scale synthetic human data from the SURREAL dataset. We recover shape features by exploiting the prior knowledge in the parametric model and the variety of shapes in the synthetic dataset. We also propose a multi-frame shape estimator that extracts a common shape from multiple images with different poses, textures, and backgrounds. By employing a Transformer encoder, the multi-frame method efficiently represents aggregated features and extracts the common shape regardless of the number of input images. Our methods recover shape, an invariant characteristic of the human body, and lay the groundwork for 3D mesh reconstruction from multiple images.
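Abstract (translated from the Korean): This paper proposes a new method, the Human Shape Estimator, that precisely recovers 3D human body shape from 2D images. Reconstructing the 3D surface (mesh) of a human has been a difficult problem because of the lack of 3D labels and of diverse models and environments. Previous work recovered a person's 3D pose and shape from an image simultaneously, but could not estimate 3D shape as accurately as 3D pose. This study focuses on recovering 3D human shape using the body representation of the Skinned Multi-Person Linear model (SMPL) and the large-scale synthetic data of the SURREAL dataset; the proposed model thus extracts 3D shape features by leveraging SMPL's prior knowledge and the diverse shapes in the synthetic dataset. In addition, we design a Multi-Frame Shape Estimator that extracts a common shape not from a single image but from multiple images with different poses, textures, and backgrounds. The Multi-Frame Method adopts a Transformer Encoder structure to represent the common (aggregated) feature of the images effectively, regardless of the number of input images. The proposed method extracts the invariant characteristic of the 3D body represented by shape and is shown to be a cornerstone for 3D mesh reconstruction from multiple images.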
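
The multi-frame idea in the abstract, aggregating per-image features so that one shape is predicted for any number of input images, can be illustrated with a small sketch. This is not the thesis implementation: it assumes per-frame feature vectors already exist (in practice they would come from an image backbone), it substitutes a single self-attention layer with mean pooling for the full Transformer encoder, its weights are random and untrained, and the class name `MultiFrameShapeSketch` and the choice of 10 shape coefficients (a common SMPL setting) are assumptions made for illustration.

```python
import math
import random

SHAPE_DIM = 10  # assumed number of SMPL shape coefficients (betas)


def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m); both are lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Small random weights; a real model would learn these."""
    scale = 1.0 / math.sqrt(rows)
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class MultiFrameShapeSketch:
    """Hypothetical stand-in: one self-attention layer, mean pooling, linear head."""

    def __init__(self, feat_dim, seed=0):
        rng = random.Random(seed)
        self.feat_dim = feat_dim
        self.w_q = random_matrix(feat_dim, feat_dim, rng)
        self.w_k = random_matrix(feat_dim, feat_dim, rng)
        self.w_v = random_matrix(feat_dim, feat_dim, rng)
        self.w_out = random_matrix(feat_dim, SHAPE_DIM, rng)

    def attend(self, feats):
        """Let every frame feature attend to every other frame (with a residual)."""
        q = matmul(feats, self.w_q)
        k = matmul(feats, self.w_k)
        v = matmul(feats, self.w_v)
        k_t = [list(col) for col in zip(*k)]
        scale = math.sqrt(self.feat_dim)
        weights = [softmax([s / scale for s in row]) for row in matmul(q, k_t)]
        mixed = matmul(weights, v)
        return [[f + m for f, m in zip(f_row, m_row)]
                for f_row, m_row in zip(feats, mixed)]

    def predict_shape(self, feats):
        """Map N frame features (N >= 1) to one vector of shape coefficients."""
        attended = self.attend(feats)
        n_frames = len(attended)
        # Mean pooling over frames gives a fixed-size vector for any N.
        pooled = [sum(col) / n_frames for col in zip(*attended)]
        return matmul([pooled], self.w_out)[0]


if __name__ == "__main__":
    rng = random.Random(1)
    frames = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(5)]
    model = MultiFrameShapeSketch(feat_dim=8)
    print(len(model.predict_shape(frames)))
```

Because the sketch uses no positional encoding and pools by averaging, the predicted vector does not depend on the order of the frames, and the same weights accept one frame or many. Whether the thesis model makes the same choices is not stated in the abstract.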

Language: eng

URI: https://hdl.handle.net/10371/193428

https://dcollection.snu.ac.kr/common/orgView/000000177057

Files in This Item:

000000177057.pdf 3.50 MB

Appears in Collections:

College of Engineering/Engineering Practice School (공과대학/대학원)
- Program in Artificial Intelligence (협동과정-인공지능전공)
  - Theses (Master's Degree_협동과정-인공지능전공)
