Multi-scale Recurrent Encoder-Decoder Network for Dense Temporal Classification

Cited 16 times in Web of Science; cited 16 times in Scopus

Citation: 2018 24th International Conference on Pattern Recognition (ICPR), pp. 103-108

Abstract: Temporal events in video sequences often have long-term dependencies that are difficult for a convolutional neural network (CNN) to handle. In particular, dense pixel-wise prediction over video frames is difficult for a CNN because learning the temporal correlation requires a large amount of memory and a large number of parameters. To overcome these difficulties, we propose a recurrent encoder-decoder network that compresses the spatiotemporal features at the encoder and restores them to original-sized results at the decoder. We incorporate a convolutional long short-term memory (LSTM) into the encoder-decoder architecture, which successfully learns the spatiotemporal relation with a relatively small number of parameters. The proposed network is applied to one dense pixel-prediction problem, namely background subtraction in video sequences. Although the network is trained on video clips of limited duration, it generalizes well to different videos and time durations. In addition, with video-specific training, it shows the best performance on a benchmark dataset (CDnet 2014).
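
The parameter-efficiency claim in the abstract can be illustrated with a rough count. The sketch below compares the number of trainable parameters in one convolutional LSTM layer with a fully connected LSTM applied to the same flattened feature map. It assumes the standard four-gate formulation without peephole terms, and the feature-map size, channel counts, and kernel size are hypothetical values chosen for illustration, not figures from the paper.

```python
def conv_lstm_params(kernel, c_in, c_hid):
    """Parameters of one ConvLSTM layer (4 gates, no peephole terms).

    Each gate convolves the concatenated [input, hidden] tensor with a
    kernel x kernel filter bank and adds one bias per hidden channel.
    """
    per_gate = kernel * kernel * (c_in + c_hid) * c_hid + c_hid
    return 4 * per_gate


def fc_lstm_params(height, width, c_in, c_hid):
    """Parameters of a fully connected LSTM on the flattened feature map."""
    n_in = height * width * c_in    # flattened input size
    n_hid = height * width * c_hid  # flattened hidden-state size
    per_gate = (n_in + n_hid) * n_hid + n_hid
    return 4 * per_gate


# Hypothetical sizes (assumptions, not taken from the paper).
HEIGHT, WIDTH = 32, 32
C_IN, C_HID, KERNEL = 16, 16, 3

conv_count = conv_lstm_params(KERNEL, C_IN, C_HID)
fc_count = fc_lstm_params(HEIGHT, WIDTH, C_IN, C_HID)

print(f"ConvLSTM parameters: {conv_count:,}")
print(f"FC-LSTM parameters:  {fc_count:,}")
print(f"Ratio (FC / Conv):   {fc_count / conv_count:,.0f}x")
```

Because the convolutional weights are shared across spatial positions, the ConvLSTM count does not depend on the feature-map height or width, whereas the fully connected count grows quadratically with the number of pixels.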