URVOS: Unified Referring Video Object Segmentation Network with a Large-Scale Benchmark

Cited 0 times in Web of Science; cited 48 times in Scopus
Authors

Seo, Seonguk; Lee, Joon-Young; Han, Bohyung

Issue Date
2020-08
Publisher
Springer Science and Business Media Deutschland GmbH
Citation
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 12360 LNCS, pp. 208-223
Abstract
We propose a unified referring video object segmentation network (URVOS). URVOS takes a video and a referring expression as inputs and estimates the masks of the object referred to by the language expression in every frame of the video. Our algorithm addresses this challenging problem by performing language-based object segmentation and mask propagation jointly in a single deep neural network that combines two attention models. In addition, we construct the first large-scale referring video object segmentation dataset, called Refer-Youtube-VOS. We evaluate our model on two benchmark datasets, including ours, and demonstrate the effectiveness of the proposed approach. The dataset is released at https://github.com/skynbe/Refer-Youtube-VOS.
ISSN
0302-9743
URI
https://hdl.handle.net/10371/197919
DOI
https://doi.org/10.1007/978-3-030-58555-6_13