Enriching Seed Information for Robust Interactive Image Segmentation

Abstract: Segmenting the region that corresponds to a desired object in an image is essential
to many computer vision problems, because most algorithms operate on semantic units
when interpreting or analyzing images. However, segmenting the desired object from a
given image is an ambiguous problem: the target object varies with the user and the
purpose. Interactive segmentation was proposed to resolve this ambiguity. In this
approach, segmentation is steered in the desired direction through interaction with
the user, so the seed information provided by the user plays an important role. If the
seed provided by the user contains abundant information, segmentation accuracy
increases. However, providing rich seed information places a heavy burden on the user.
Therefore, the main goal of the present study was to obtain satisfactory segmentation
results using simple seed information.
We primarily focused on converting the provided sparse seed information into a rich
state so that accurate segmentation results can be derived. To this end, we took
minimal user input and enriched it through various seed enrichment techniques.
We proposed three interactive segmentation techniques, based on (1) Seed Expansion,
(2) Seed Generation, and (3) Seed Attention. These enrich the seed by expanding the
area around a seed, generating a new seed at a new position, and attending to
semantic information, respectively.
First, in seed expansion, we expanded the scope of the seed. We integrated reliable
pixels around the initial seed into the seed set through a two-stage expansion step.
Because the expanded seed covers a wider area than the initial seed, the seed's
scarcity and imbalance problems were resolved. Next, in seed generation, we created a
seed at a new point rather than around the existing seed. We trained the system to
imitate user behavior, in which the user provides a new seed point in an erroneous
region. By learning the user's intention, our model could efficiently create a new
seed point. The generated seed helped segmentation and could be used as additional
information for weakly supervised learning. Finally, through seed attention, we
embedded semantic information in the seed. Unlike the previous models, we integrated
the segmentation process and the seed enrichment process. We reinforced the seed
information by adding semantic information to the seed instead of expanding it
spatially. The seed information was enriched through mutual attention with feature
maps generated during the segmentation process.
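The first two enrichment ideas can be illustrated with a minimal sketch. It is not the thesis's method: the function names, the intensity tolerance `tol`, the 4-connected neighborhood, and the use of a ground-truth mask in place of a real user are all assumptions made for illustration. Seed expansion is approximated here by simple region growing, and the imitated user click by choosing the mislabeled pixel that lies deepest inside the erroneous region.

```python
from collections import deque

NEIGHBORS = ((1, 0), (-1, 0), (0, 1), (0, -1))  # assumed 4-connectivity


def expand_seed(image, seeds, tol=10):
    """Grow the seed set over connected pixels whose intensity is within
    `tol` of the seed they were reached from (an assumed reliability test)."""
    h, w = len(image), len(image[0])
    expanded = set(seeds)
    # Each queue entry carries the intensity of its originating seed,
    # so the region cannot drift away from the seed's appearance.
    queue = deque((r, c, image[r][c]) for r, c in seeds)
    while queue:
        r, c, ref = queue.popleft()
        for dr, dc in NEIGHBORS:
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w and (nr, nc) not in expanded:
                if abs(image[nr][nc] - ref) <= tol:
                    expanded.add((nr, nc))
                    queue.append((nr, nc, ref))
    return expanded


def next_seed(pred, truth):
    """Imitate a corrective click: return (row, col, label) for the
    mislabeled pixel farthest from any correctly labeled pixel,
    or None when the prediction already matches the reference mask."""
    h, w = len(pred), len(pred[0])
    error = {(r, c) for r in range(h) for c in range(w)
             if pred[r][c] != truth[r][c]}
    if not error:
        return None
    # Multi-source BFS: error pixels touching a correct pixel (or the image
    # border) start at distance 1; distances grow toward the region interior.
    dist, queue = {}, deque()
    for r, c in error:
        if any((r + dr, c + dc) not in error for dr, dc in NEIGHBORS):
            dist[(r, c)] = 1
            queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        for dr, dc in NEIGHBORS:
            n = (r + dr, c + dc)
            if n in error and n not in dist:
                dist[n] = dist[(r, c)] + 1
                queue.append(n)
    # Ties are broken by the smallest (row, col) for determinism.
    r, c = max(sorted(error), key=lambda p: dist[p])
    return r, c, truth[r][c]
```

In this sketch, `expand_seed` turns a single clicked pixel into a wider seed region, and `next_seed` supplies the kind of corrective point that a trained seed generator would be expected to learn to place on its own.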
The proposed models outperformed existing techniques in various experiments. Notably,
even with sparse seed information, the proposed seed enrichment techniques produced
far more accurate segmentation results than the existing methods.
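Abstract (translated from Korean): Cutting out the region of a desired object from an image is an essential element of computer vision problems, because most algorithms operate on semantic units when interpreting or analyzing images. However, segmenting an object region in an image is an ambiguous problem, since the desired object region differs depending on the user and the purpose. To address this, interactive image segmentation is used, which steers the segmentation in the desired direction through interaction with the user. Here, the seed information provided by the user plays an important role. The more accurate the seed information, which carries the user's intention, the higher the segmentation accuracy. However, providing rich seed information places a heavy burden on the user. The main goal is therefore to obtain satisfactory segmentation results using simple seed information.
We focused on transforming the provided sparse seed information, because accurate segmentation results can be obtained if the seed information is made rich. This dissertation therefore proposes techniques for enriching seed information. We assume minimal user input and transform it through various seed enrichment techniques. We propose a total of three interactive image segmentation techniques, based on seed expansion, seed generation, and seed attention. They respectively expand the region around the seed, generate a seed at a new location, and attend to semantic information.
First, in the technique based on seed expansion, we aim to enlarge the seed region. Through an expansion process consisting of two stages, similar pixels around the initial seed are incorporated into the seed region. Using the expanded seed resolves the problems caused by seed sparsity and imbalance. Next, in the technique based on seed generation, we generate a seed at a new location rather than around the seed. We trained the system by imitating how a user provides a new seed in a region where errors occur. By learning the user's intention, seeds can be generated effectively. The generated seeds not only raise segmentation accuracy but can also serve as data for weakly supervised learning. Finally, in the technique using seed attention, we embed semantic information in the seed. Unlike the previously proposed techniques, we propose a model that integrates the segmentation operation and the seed enrichment operation. The seed information is enriched through mutual exchange with the feature maps of the segmentation network.
The proposed models achieved better performance than existing techniques in various experiments. In particular, the seed enrichment techniques showed excellent interactive segmentation performance when seeds were scarce.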

Language: eng

URI: https://hdl.handle.net/10371/175338

https://dcollection.snu.ac.kr/common/orgView/000000165828

Files in This Item:

000000165828.pdf 23.00 MB

Appears in Collections:

College of Engineering/Engineering Practice School (공과대학/대학원)
- Dept. of Computer Science and Engineering (컴퓨터공학부)
  - Theses (Ph.D. / Sc.D._컴퓨터공학부)
