Enhanced Acoustic Echo Suppression Techniques Based on Spectro-Temporal Correlations

이철민

서울대학교 중앙도서관

S-Space 소개

My S-Space

로그인이 필요합니다.

S-Space

Publications

Detailed Information

Enhanced Acoustic Echo Suppression Techniques Based on Spectro-Temporal Correlations : 주파수 및 시간적 상관관계에 기반한 음향학적 에코 억제 기법

DC Field	Value	Language
dc.contributor.advisor	김남수	-
dc.contributor.author	이철민	-
dc.date.accessioned	2017-07-13T07:15:49Z	-
dc.date.available	2017-07-13T07:15:49Z	-
dc.date.issued	2016-08	-
dc.identifier.other	000000136147	-
dc.identifier.uri	https://hdl.handle.net/10371/119199	-
dc.description	학위논문 (박사)-- 서울대학교 대학원 : 전기·컴퓨터공학부, 2016. 8. 김남수.	-
dc.description.abstract	In the past decades, a number of approaches have been dedicated to acoustic echo cancellation and suppression which reduce the negative effects of acoustic echo, namely the acoustic coupling between the loudspeaker and microphone in a room. In particular, the increasing use of full-duplex telecommunication systems has led to the requirement of faster and more reliable acoustic echo cancellation algorithms. The solutions have been based on adaptive filters, but the length of these filters has to be long enough to consider most of the echo signal and linear filtering in these algorithms may be limited to remove the echo signal in various environments. In this thesis, a novel stereophonic acoustic echo suppression (SAES) technique based on spectral and temporal correlations is proposed in the short-time Fourier transform (STFT) domain. Unlike traditional stereophonic acoustic echo cancellation, the proposed algorithm estimates the echo spectra in the STFT domain and uses a Wiener filter to suppress echo without performing any explicit double-talk detection. The proposed approach takes account of interdependencies among components in adjacent time frames and frequency bins, which enables more accurate estimation of the echo signals. Due to the limitations of power amplifiers or loudspeakers, the echo signals captured in the microphones are not in a linear relationship with the far-end signals even when the echo path is perfectly linear. The nonlinear components of the echo cannot be successfully removed by a linear acoustic echo canceller. The remaining echo components in the output of acoustic echo suppression (AES) can be further suppressed by applying residual echo suppression (RES) algorithms. In this thesis, we propose an optimal RES gain estimation based on deep neural network (DNN) exploiting both the far-end and the AES output signals in all frequency bins. A DNN structure is introduced as a regression function representing the complex nonlinear mapping from these signals to the optimal RES gain. Because of the capability of the DNN, the spectro-temporal correlations in the full-band can be considered while finding the nonlinear function. The proposed method does not require any explicit double-talk detectors to deal with single-talk and double-talk situations. One of the well-known approaches for nonlinear acoustic echo cancellation is an adaptive Volterra filtering and various algorithms based on the Volterra filter were proposed to describe the characteristics of nonlinear echo and showed the better performance than the conventional linear filtering. However, the performance might be not satisfied since these algorithms could not consider the full correlation for the nonlinear relationship between the input signal and far-end signal in time-frequency domain. In this thesis, we propose a novel DNN-based approach for nonlinear acoustic echo suppression (NAES), extending the proposed RES algorithm. Instead of estimating the residual gain for suppressing the nonlinear echo components, the proposed algorithm straightforwardly recovers the near-end speech signal through the direct gain estimation obtained from DNN frameworks on the input and far-end signal. For echo aware training, a priori and a posteriori signal-to-echo ratio (SER) are introduced as additional inputs of the DNN for tracking the change of the echo signal. In addition, the multi-task learning (MTL) to the DNN-based NAES is combined to the DNN incorporating echo aware training for robustness. In the proposed system, an additional task of double-talk detection is jointly trained with the primary task of the gain estimation for NAES. The DNN can learn the good representations which can suppress more in single-talk periods and improve the gain estimates in double-talk periods through the MTL framework. Besides, the proposed NAES using echo aware training and MTL with double-talk detection makes the DNN be more robust in various conditions. The proposed techniques show significantly better performance than the conventional AES methods in both single- and double-talk periods. As a pre-processing of various applications such as speech recognition and speech enhancement, these approaches can help to transmit the clean speech and provide an acceptable communication in full-duplex real environments.	-
dc.description.tableofcontents	Chapter 1 Introduction 1 1.1 Background 1 1.2 Scope of thesis 3 Chapter 2 Conventional Approaches for Acoustic Echo Suppression 7 2.1 Single Channel Acoustic Echo Cancellation and Suppression 8 2.1.1 Single Channel Acoustic Echo Cancellation 8 2.1.2 Adaptive Filters for Acoustic Echo Cancellation 10 2.1.3 Acoustic Echo Suppression Based on Spectral Modication 11 2.2 Residual Echo Suppression 13 2.2.1 Spectral Feature-based Nonlinear Residual Echo Suppression 15 2.3 Stereophonic Acoustic Echo Cancellation 17 2.4 Wiener Filtering for Stereophonic Acoustic Echo Suppression 20 Chapter 3 Stereophonic Acoustic Echo Suppression Incorporating Spectro-Temporal Correlations 25 3.1 Introduction 25 3.2 Linear Time-Invariant Systems in the STFT Domain with Crossband Filtering 26 3.3 Enhanced SAES (ESAES) Utilizing Spectro-Temporal Correlations 29 3.3.1 Problem Formulation 31 3.3.2 Estimation of Extended PSD Matrices, Echo Spectra, and Gain Function 34 3.3.3 Complexity of the Proposed ESAES Algorithm 36 3.4 Experimental Results 37 3.5 Summary 41 Chapter 4 Nonlinear Residual Echo Suppression Based on Deep Neural Network 43 4.1 Introduction 43 4.2 A Brief Review on RES 45 4.3 Deep Neural Networks 46 4.4 Nonlinear RES using Deep Neural Network 49 4.5 Experimental Results 52 4.5.1 Combination with Stereophonic Acoustic Echo Suppression 59 4.6 Summary 61 Chapter 5 Enhanced Deep Learning Frameworks for Nonlinear Acoustic Echo Suppression 69 5.1 Introduction 69 5.2 DNN-based Nonlinear Acoustic Echo Suppression using Echo Aware Training 72 5.3 Multi-Task Learning for NAES 75 5.4 Experimental Results 78 5.5 Summary 82 Chapter 6 Conclusions 89 Bibliography 91 요약 101	-
dc.format	application/pdf	-
dc.format.extent	4996589 bytes	-
dc.format.medium	application/pdf	-
dc.language.iso	en	-
dc.publisher	서울대학교 대학원	-
dc.subject	Acoustic echo cancellation	-
dc.subject	acoustic echo suppression	-
dc.subject	signal-to-echo ratio	-
dc.subject	spectro-temporal correlations	-
dc.subject	stereophonic acoustic echo suppression	-
dc.subject	residual echo suppression	-
dc.subject	nonlinear echo	-
dc.subject	deep neural networks	-
dc.subject	optimal gain regression	-
dc.subject	adaptive ltering	-
dc.subject	nonlinear acoustic echo suppression	-
dc.subject	echo aware training	-
dc.subject	multi-task learning	-
dc.subject.ddc	621	-
dc.title	Enhanced Acoustic Echo Suppression Techniques Based on Spectro-Temporal Correlations	-
dc.title.alternative	주파수 및 시간적 상관관계에 기반한 음향학적 에코 억제 기법	-
dc.type	Thesis	-
dc.contributor.AlternativeAuthor	Chul Min Lee	-
dc.description.degree	Doctor	-
dc.citation.pages	xiv, 102	-
dc.contributor.affiliation	공과대학 전기·컴퓨터공학부	-
dc.date.awarded	2016-08	-

Appears in Collections:

College of Engineering/Engineering Practice School (공과대학/대학원)
- Dept. of Electrical and Computer Engineering (전기·정보공학부)
  - Theses (Ph.D. / Sc.D._전기·정보공학부)

Files in This Item:

000000136147.pdf 4.77 MB

Altmetrics

Item View & Download Count

Show Simple Item Record

Find it @ SNU

트윗하기

SNS Share