Frame-to-Frame Aggregation of Active Regions in Web Videos for Weakly Supervised Semantic Segmentation

Lee, Jungbeom; Kim, Eunji; Lee, Sungmin; Lee, Jangho; Yoon, Sungroh

doi:10.1109/ICCV.2019.00691

DC Field	Value	Language
dc.contributor.author	Lee, Jungbeom	-
dc.contributor.author	Kim, Eunji	-
dc.contributor.author	Lee, Sungmin	-
dc.contributor.author	Lee, Jangho	-
dc.contributor.author	Yoon, Sungroh	-
dc.date.accessioned	2022-10-26T07:23:42Z	-
dc.date.available	2022-10-26T07:23:42Z	-
dc.date.created	2022-10-19	-
dc.date.issued	2019-02	-
dc.identifier.citation	2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), pp.6807-6817	-
dc.identifier.issn	1550-5499	-
dc.identifier.uri	https://hdl.handle.net/10371/186972	-
dc.description.abstract	When a deep neural network is trained on data with only image-level labeling, the regions activated in each image tend to identify only a small region of the target object. We propose a method of using videos automatically harvested from the web to identify a larger region of the target object by using temporal information, which is not present in the static image. The temporal variations in a video allow different regions of the target object to be activated. We obtain an activated region in each frame of a video, and then aggregate the regions from successive frames into a single image, using a warping technique based on optical flow. The resulting localization maps cover more of the target object, and can then be used as proxy ground-truth to train a segmentation network. This simple approach outperforms existing methods under the same level of supervision, and even approaches relying on extra annotations. Based on VGG-16 and ResNet 101 backbones, our method achieves the mIoU of 65.0 and 67.4, respectively, on PASCAL VOC 2012 test images, which represents a new state-of-the-art.	-
dc.language	English	-
dc.publisher	IEEE	-
dc.title	Frame-to-Frame Aggregation of Active Regions in Web Videos for Weakly Supervised Semantic Segmentation	-
dc.type	Article	-
dc.identifier.doi	10.1109/ICCV.2019.00691	-
dc.citation.journaltitle	2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019)	-
dc.identifier.wosid	000548549201093	-
dc.identifier.scopusid	2-s2.0-85081911329	-
dc.citation.endpage	6817	-
dc.citation.startpage	6807	-
dc.description.isOpenAccess	N	-
dc.contributor.affiliatedAuthor	Yoon, Sungroh	-
dc.type.docType	Proceedings Paper	-
dc.description.journalClass	1	-
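
The aggregation step described in the abstract (warping the activated region of each neighboring frame into a single reference frame and merging the results) can be illustrated with a small sketch. This is not the authors' implementation: the record does not state how the flow is computed or how the warped maps are merged. The sketch assumes a precomputed dense flow field, nearest-neighbor backward warping, and an element-wise maximum as the merge rule. The names `warp_to_reference` and `aggregate` are hypothetical.

```python
import numpy as np


def warp_to_reference(act_map, flow):
    """Backward-warp an (H, W) activation map into the reference frame.

    Assumed convention: flow[y, x] = (dx, dy) points from reference pixel
    (x, y) to its matching location in the source frame.
    """
    h, w = act_map.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Nearest-neighbor lookup of the source coordinates (an assumption;
    # bilinear sampling is a common alternative).
    src_x = np.rint(xs + flow[..., 0]).astype(int)
    src_y = np.rint(ys + flow[..., 1]).astype(int)
    valid = (src_x >= 0) & (src_x < w) & (src_y >= 0) & (src_y < h)
    warped = np.zeros_like(act_map)
    warped[valid] = act_map[src_y[valid], src_x[valid]]
    return warped


def aggregate(ref_map, neighbor_maps, flows):
    """Merge warped neighbor maps into the reference map.

    Element-wise maximum is an assumed merge rule, chosen so that a region
    activated in any frame stays activated in the aggregate.
    """
    out = ref_map.copy()
    for act_map, flow in zip(neighbor_maps, flows):
        out = np.maximum(out, warp_to_reference(act_map, flow))
    return out


# Toy example: the object shifts one pixel to the right in the next frame,
# and a different part of it is activated there.
ref = np.zeros((4, 4))
ref[1, 1] = 1.0                      # part activated in the reference frame
nxt = np.zeros((4, 4))
nxt[1, 3] = 1.0                      # part activated in the next frame
flow = np.zeros((4, 4, 2))
flow[..., 0] = 1.0                   # dx = +1 everywhere, dy = 0

agg = aggregate(ref, [nxt], [flow])
# Row 1 of agg now has two active pixels, at columns 1 and 2.
```

In the toy example the aggregated map covers two pixels of the object where either frame alone covered one, which is the effect the abstract attributes to temporal variation.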

Appears in Collections:

College of Engineering/Engineering Practice School
- Dept. of Electrical and Computer Engineering
  - Journal Papers

Files in This Item: There are no files associated with this item.
