Robust Front-End for Multi-Channel ASR using Flow-Based Density Estimation

Authors

Kim, Hyeongju; Lee, Hyeonseung; Kang, Woo Hyun; Kim, Hyung Yong; Kim, Nam Soo

Issue Date
2020-01
Publisher
IJCAI-INT JOINT CONF ARTIF INTELL
Citation
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, pp. 3744-3750
Abstract
For multi-channel speech recognition, speech enhancement techniques such as denoising or dereverberation are conventionally applied as a front-end processor. Deep learning-based front-ends using such techniques require aligned clean and noisy speech pairs, which are generally obtained via data simulation. Recently, several joint optimization techniques have been proposed to train the front-end without parallel data within an end-to-end automatic speech recognition (ASR) scheme. However, the ASR objective is sub-optimal and insufficient for fully training the front-end, which leaves room for improvement. In this paper, we propose a novel approach that incorporates flow-based density estimation for the robust front-end using non-parallel clean and noisy speech. Experimental results on the CHiME-4 dataset show that the proposed method outperforms the conventional techniques where the front-end is trained only with the ASR objective.
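As a rough illustration of what flow-based density estimation means in general, the sketch below fits a one-dimensional affine flow to a handful of clean feature values and scores new values by their log-likelihood under the change-of-variables formula. This is a toy under stated assumptions, not the model from the paper: the abstract does not describe the flow architecture or how the density term is combined with the ASR objective, and the function names and sample values here are hypothetical.

```python
import math


def fit_affine_flow(samples):
    """Maximum-likelihood fit of a 1-D affine flow z = (x - mu) / sigma.

    With a standard normal base density, the ML solution is the sample
    mean and the (biased) sample standard deviation.
    """
    n = len(samples)
    mu = sum(samples) / n
    var = sum((x - mu) ** 2 for x in samples) / n
    return mu, math.sqrt(var)


def log_density(x, mu, sigma):
    """Change of variables: log p(x) = log N(z; 0, 1) + log|dz/dx|."""
    z = (x - mu) / sigma
    log_base = -0.5 * (z * z + math.log(2.0 * math.pi))
    log_det = -math.log(sigma)  # dz/dx = 1 / sigma
    return log_base + log_det


# Hypothetical clean-speech feature values (illustrative only).
clean = [0.9, 1.1, 1.0, 0.8, 1.2]
mu, sigma = fit_affine_flow(clean)

# A front-end output close to the clean distribution scores higher
# than one far from it; no paired noisy/clean data is needed.
near_clean = log_density(1.05, mu, sigma)
far_from_clean = log_density(3.0, mu, sigma)
print(near_clean > far_from_clean)
```

The point of the sketch is that the density model is trained on clean data alone, so its log-likelihood can serve as a training signal for a front-end without aligned clean and noisy pairs. Practical flows stack many invertible layers over high-dimensional features rather than a single affine map.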
URI
https://hdl.handle.net/10371/186214