Consistency & Interpolation-based Semi-supervised learning for Object Detection

Abstract: Object detection, one of the main areas of computer vision research, is the task of predicting where objects are and what they are in an RGB image. Object detection requires a massive number of annotated samples to guarantee its performance, but placing bounding boxes for every object in each sample is costly and time-consuming. To alleviate this problem, Weakly-Supervised Learning and Semi-Supervised Learning methods have been proposed. However, they still show large gaps from supervised learning in efficiency and require much more research. In particular, deep learning-based Semi-Supervised Learning methods have not yet been applied to object detection.

In this dissertation, we apply the latest deep learning-based Semi-Supervised Learning methods to object detection, and we identify and solve the problems that arise when established Semi-Supervised Learning algorithms are applied to this task. Specifically, we adapt Consistency Regularization (CR) and Interpolation Regularization (IR) to object detection individually, and then combine them for further performance improvement. This is the first attempt to extend CR and IR, which had previously been used only in semi-supervised classification, to object detection.

First, we propose a novel Consistency-based Semi-Supervised Learning method for object Detection (CSD), which uses consistency constraints to enhance detection performance by making full use of available unlabeled data. Specifically, the consistency constraint is applied not only to object classification but also to localization. We also propose Background Elimination (BE) to avoid the negative effect of the predominant background on detection performance. We evaluated the proposed CSD on both single-stage and two-stage detectors, and the results show the effectiveness of our method.
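As an illustration of the idea described above, the following is a minimal sketch of a consistency loss over two predictions of the same unlabeled image, with a background mask. The abstract does not specify the loss functions or the masking rule, so everything here is an assumption made for the example: the use of Jensen-Shannon divergence for classification, mean squared error for localization, the rule "skip candidates whose top class in the first view is background", and the names `js_divergence` and `consistency_loss`.

```python
import math


def kl(p, q, eps=1e-12):
    """KL divergence between two discrete distributions given as lists."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def js_divergence(p, q):
    """Symmetric Jensen-Shannon divergence (assumed choice of consistency measure)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def consistency_loss(probs_a, probs_b, boxes_a, boxes_b, background=0):
    """Return (classification, localization) consistency averaged over kept candidates.

    probs_*: per-candidate class probabilities from two views of one image.
    boxes_*: per-candidate box parameters, already aligned between the views.
    """
    cls_terms, loc_terms = [], []
    for pa, pb, ba, bb in zip(probs_a, probs_b, boxes_a, boxes_b):
        # Background Elimination (assumed rule): ignore candidates whose
        # most likely class in view A is the background class.
        top_class = max(range(len(pa)), key=pa.__getitem__)
        if top_class == background:
            continue
        cls_terms.append(js_divergence(pa, pb))
        loc_terms.append(sum((x - y) ** 2 for x, y in zip(ba, bb)) / len(ba))
    if not cls_terms:
        return 0.0, 0.0
    return sum(cls_terms) / len(cls_terms), sum(loc_terms) / len(loc_terms)


# Two candidates; the first is background in view A and is therefore skipped.
probs_a = [[0.9, 0.05, 0.05], [0.1, 0.8, 0.1]]
probs_b = [[0.1, 0.8, 0.1], [0.1, 0.6, 0.3]]
boxes_a = [[0.0, 0.0, 1.0, 1.0], [0.1, 0.2, 0.3, 0.4]]
boxes_b = [[5.0, 5.0, 5.0, 5.0], [0.1, 0.2, 0.3, 0.6]]
cls_loss, loc_loss = consistency_loss(probs_a, probs_b, boxes_a, boxes_b)
```

No labels appear anywhere in the computation, which is what lets a loss of this kind be applied to unlabeled images. Without the mask, the many background candidates would dominate both averages.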

Second, we present a novel Interpolation-based Semi-Supervised Learning method for object Detection (ISD), which considers and solves the problems caused by applying conventional Interpolation Regularization (IR) directly to object detection. We divide the output of the model into two types according to the objectness scores of the two original patches that are mixed in IR. Then, we apply a separate loss suited to each type in an unsupervised manner. The proposed losses dramatically improve the performance of Semi-Supervised Learning as well as supervised learning.
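The two steps named above, mixing two inputs and partitioning the outputs by the objectness of the original patches, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the two types are defined or which losses are attached to them, so the partition rule (both patches below a threshold versus at least one above it), the threshold value, and the names `mix` and `split_by_objectness` are hypothetical.

```python
def mix(image_a, image_b, lam):
    """Interpolate two flattened images pixel by pixel with weight lam in [0, 1]."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(image_a, image_b)]


def split_by_objectness(obj_a, obj_b, threshold=0.5):
    """Partition output locations into two groups (assumed rule).

    obj_a, obj_b: objectness scores of the two original patches at each location.
    Returns (both_background, has_object) as lists of location indices.
    """
    both_background, has_object = [], []
    for i, (a, b) in enumerate(zip(obj_a, obj_b)):
        if max(a, b) < threshold:
            both_background.append(i)  # neither original patch looks like an object
        else:
            has_object.append(i)       # at least one original patch looks like an object
    return both_background, has_object


mixed = mix([0.0, 1.0], [1.0, 1.0], lam=0.25)
groups = split_by_objectness([0.1, 0.9, 0.2], [0.2, 0.1, 0.7])
```

Each group would then receive its own unsupervised loss. Those losses are deliberately left out here because the abstract does not define them.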

Third, we introduce a method for combining CSD and ISD. CSD requires one additional prediction to apply consistency regularization, and therefore allocates twice (x2) as much memory as conventional supervised learning. ISD computes two supplementary predictions to apply interpolation regularization, and takes three times (x3) as much memory as conventional training. Combining CSD and ISD naively therefore requires three extra predictions. By shuffling the samples within each mini-batch in CSD, we reduce the number of additional predictions from three to two, which reduces memory usage. Furthermore, combining the two algorithms improves performance.
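One plausible reading of the shuffling step, offered as a sketch and not as the dissertation's actual procedure, is that the extra CSD view is shuffled so it can also serve as the second input for mixing. The forward passes then become the original batch, the shuffled extra view, and the mixed batch: two extra predictions, not three. The horizontal flip used as the CSD augmentation, the mixing weight `lam`, and the name `build_shared_passes` are all assumptions for illustration.

```python
import random


def flip(image):
    """Horizontally flip an image stored as a list of rows (assumed augmentation)."""
    return [row[::-1] for row in image]


def build_shared_passes(batch, lam, seed=0):
    """Build the inputs for three forward passes from one mini-batch.

    Returns (passes, perm). partner[i] is the flipped copy of batch[perm[i]], so
    the consistency target for partner[i] is the prediction on batch[perm[i]].
    """
    rng = random.Random(seed)
    perm = list(range(len(batch)))
    rng.shuffle(perm)
    # Shuffled and flipped copy: plays the role of the extra CSD view and,
    # under this reading, of the second image in each interpolation pair.
    partner = [flip(batch[j]) for j in perm]
    mixed = [
        [[lam * a + (1.0 - lam) * b for a, b in zip(row_x, row_y)]
         for row_x, row_y in zip(x, y)]
        for x, y in zip(batch, partner)
    ]
    return {"original": batch, "partner": partner, "mixed": mixed}, perm


batch = [[[1.0, 2.0]], [[3.0, 4.0]], [[5.0, 6.0]]]  # three tiny 1x2 "images"
passes, perm = build_shared_passes(batch, lam=0.7, seed=1)
```

Under this reading, a naive combination would need four inputs (original, flipped, a separate partner batch, mixed), whereas the shared version needs three, matching the reduction from three extra predictions to two described above.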
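
Object detection is the task of detecting which objects are at which locations in an RGB image, and it is one of the most important research areas in computer vision. However, object detection algorithms require large, well-labeled datasets, and such labeling takes a great deal of cost and time. To solve this problem, Weakly Supervised Learning and Semi-Supervised Learning methods have been studied, but such studies are few, and in the case of Semi-Supervised Learning, the latest deep learning-based learning methods had not been applied.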

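In this dissertation, we applied the latest deep learning-based semi-supervised learning methods to object detection, and studied how to identify and solve the problems that arise in doing so. Specifically, we proposed semi-supervised learning methods based on Consistency Regularization and Interpolation Regularization, and finally proposed a method for combining the two. This is the first extension of CR and IR, which are used in conventional classification problems, to the object detection problem.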

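First, we proposed a consistency regularization-based semi-supervised learning method for object detection (CSD). It uses a regularization constraint to improve detection performance by making use of all unlabeled data. Specifically, we applied the regularization constraint not only to classification but also to regression. In addition, we applied Background Elimination to reduce the influence of the background, which occupies most of the area within an image. We applied and evaluated the proposed CSD on both single-stage and two-stage detectors, and the results showed the effectiveness of our algorithm.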

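Second, we proposed an Interpolation Regularization (IR)-based semi-supervised learning method for object detection (ISD). We considered and solved the problems that arise when interpolation regularization is applied directly to object detection. We divided the model outputs into two types according to the object probabilities of the two original patches. We then defined a loss function suited to each type. The proposed algorithm showed very large performance gains not only in supervised learning but also in semi-supervised learning.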

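Finally, we introduced a method for combining CSD and ISD. CSD requires one additional computation to apply consistency regularization, which requires twice the memory of conventional supervised learning. ISD requires two additional computations to apply interpolation regularization, which requires three times the memory. Combining the two algorithms therefore requires three additional outputs. We applied a method of shuffling the samples of the CSD mini-batch, which reduced the additional computations from three to two and thus reduced memory consumption. We also showed that combining the two algorithms improves model performance.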

Language: eng

URI: https://hdl.handle.net/10371/177548

https://dcollection.snu.ac.kr/common/orgView/000000167270

Files in This Item:

000000167270.pdf 31.83 MB

Appears in Collections:

Graduate School of Convergence Science and Technology (융합과학기술대학원)
- Dept. of Transdisciplinary Studies(융합과학부)
  - Theses (Ph.D. / Sc.D._융합과학부)
