Zero-Shot Single-Microphone Sound Classification and Localization in a Building Via the Synthesis of Unseen Features

Authors
Lee, Seungjun; Yang, Haesang; Choi, Hwiyong; Seong, Woojae
Issue Date
2022
Publisher
Institute of Electrical and Electronics Engineers
Citation
IEEE Transactions on Multimedia, Vol.24, pp.2339-2351
Abstract
In this paper, we propose a learning-based approach to identify the type and position of sounds using a single microphone in a real-world building. We treat this as a joint classification problem: we predict the exact position of each sound while classifying its type, where types are assumed to come from a pre-defined set. The main difficulty is that, while types are readily classified under supervised learning frameworks with one-hot encoded labels, it is difficult to predict the exact position of a sound emitted from a position that was unseen during training. To address this discrepancy, we formulate position identification as a zero-shot learning problem, inspired by the human ability to perceive new concepts from previously learned ones. We extract feature representations from audio data and vectorize the type and position of the sound source as 'type/position-aware attributes' instead of labeling each class with a simple one-hot vector. We then train a generative model to bridge the extracted features and the attributes by learning the class-invariant structure, transferring knowledge from seen to unseen classes through their attributes; the generative adversarial networks are conditioned on the class embeddings. The proposed methods are evaluated on SNU-B36-EX, a real-world indoor noise dataset collected inside a building.
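The contrast the abstract draws between one-hot class labels and type/position-aware attributes can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the sound type names, the floor and offset coordinates, the normalization constants, and the helper `class_attribute` are all hypothetical, and the paper's actual attribute encoding and GAN-based feature synthesis are not reproduced here. The sketch only shows why an attribute vector remains well defined for a position that never appeared in training.

```python
from typing import List, Tuple

# Hypothetical label space; the paper's real types and positions are not given here.
SOUND_TYPES = ["type_a", "type_b", "type_c"]
MAX_FLOOR = 2          # assumed highest floor index, used for normalization
MAX_OFFSET_M = 12.0    # assumed largest horizontal offset in metres


def one_hot(index: int, size: int) -> List[float]:
    """Return a one-hot vector of the given size."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def class_attribute(sound_type: str, floor: int, offset_m: float) -> List[float]:
    """Encode a (type, position) class as one-hot type plus normalized position."""
    type_part = one_hot(SOUND_TYPES.index(sound_type), len(SOUND_TYPES))
    position_part = [floor / MAX_FLOOR, offset_m / MAX_OFFSET_M]
    return type_part + position_part


def squared_distance(a: List[float], b: List[float]) -> float:
    """Squared Euclidean distance between two attribute vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Classes with training data (seen) and one class without any (unseen).
seen: List[Tuple[str, int, float]] = [
    ("type_a", 0, 0.0),
    ("type_a", 0, 12.0),
    ("type_b", 2, 0.0),
]
unseen: Tuple[str, int, float] = ("type_a", 0, 3.0)

# A one-hot label over the seen classes has no slot for the unseen class,
# but its attribute vector can still be computed from type and position.
unseen_attr = class_attribute(*unseen)

# The attribute space also tells us which seen class is semantically closest.
nearest_seen = min(
    seen, key=lambda c: squared_distance(class_attribute(*c), unseen_attr)
)

print(unseen_attr)
print(nearest_seen)
```

In a zero-shot pipeline of the kind the abstract describes, a vector such as `unseen_attr` would serve as the conditioning input to a generative model that synthesizes features for the unseen class; that step is outside the scope of this sketch.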
ISSN
1520-9210
URI
https://hdl.handle.net/10371/184144
DOI
https://doi.org/10.1109/TMM.2021.3079705