Deep Neural Network-based Speech Enhancement with Subject Quality Measurement Model

김지환

Deep Neural Network-based Speech Enhancement with Subject Quality Measurement Model : 음성 지표 측정 모델을 이용한 음성 향상 심층신경망

DC Field	Value	Language
dc.contributor.advisor	김남수	-
dc.contributor.author	김지환	-
dc.date.accessioned	2019-05-07T03:14:47Z	-
dc.date.available	2019-05-07T03:14:47Z	-
dc.date.issued	2019-02	-
dc.identifier.other	000000154794	-
dc.identifier.uri	https://hdl.handle.net/10371/150745	-
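dc.description	Thesis (Master's) -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, 2019. 2. 김남수.	-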
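dc.description.abstract	This thesis addresses a deep neural network-based speech enhancement technique that uses a speech metric measurement model. Conventional deep neural network-based speech enhancement has been limited because its objective function is only weakly related to the metrics that represent intelligibility and sound quality. To address this, we connect two models: a speech enhancement model and a neural network model whose objective is set to a speech intelligibility or speech quality metric. Speech metrics are measured for three cases (clean speech, noisy speech, and enhanced speech), a model is trained to estimate each of these values, and that model is then connected to the speech enhancement model to train it. In addition, while training so that the metric value output by the measurement model connected to the speech enhancement model reaches its maximum, we varied the neural network architecture of the measurement model and thereby improved the speed and accuracy with which the maximum is reached. Two metrics are used in this thesis: STOI (short-time objective intelligibility measure) and PESQ (perceptual evaluation of speech quality). After implementing the algorithm so that the models estimating these two metrics are connected to the speech enhancement model, we trained it in a multi-task manner that minimizes two values: the mean square error of the metric and the mean square error of the speech features. Validation experiments confirmed that the model achieves higher metric values than a conventional speech enhancement deep neural network. The experimental results use PESQ and STOI as metrics and show higher performance than the baseline used in conventional techniques.	-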
dc.description.abstract	This thesis presents a deep neural network speech enhancement technique that uses a subject quality measurement (SQM) model. In conventional studies, the criterion used to optimize the model is inconsistent with the criteria used to evaluate the enhanced speech. To address this mismatch, we connect two models: a speech enhancement model and a neural network model whose target function is speech intelligibility or speech quality. The measurement model is trained on subject qualities measured for three cases: clean speech, noisy speech, and enhanced speech. In addition, while training to maximize the quality value output by the subject quality measurement model connected to the speech enhancement model, we changed the architecture of the measurement model's neural network, which improved both the speed and the accuracy with which the maximum is reached. Two metrics are used to measure subject qualities: the short-time objective intelligibility measure (STOI) and the perceptual evaluation of speech quality (PESQ). The speech enhancement model is trained against these metrics in a multi-task format and verified to reach higher metric values. Experiments using PESQ and STOI as indicators show that the proposed approach performs better than the baseline model used by conventional techniques.	-
dc.description.tableofcontents	Abstract (In Korean); Contents; List of Tables; List of Figures; 1 Introduction; 2 Conventional Approaches for Speech Enhancement; 2.1 Deep Neural Network-based Speech Enhancement; 2.1.1 Deep Neural Network; 2.1.2 Deep Neural Network-based Speech Enhancement Network; 3 Subject Quality Measurement; 3.1 STOI; 3.2 PESQ; 3.3 DNN-based Speech Enhancement using Subject Quality Measurement Model; 3.3.1 Deep Neural Network-based Model; 3.3.2 Convolutional Neural Network-based Model; 4 Proposed Enhancement Model; 5 Experiment Design; 5.1 Noisy Speech Mixtures; 5.2 SQM Feature Extraction; 5.3 Neural Network Design; 6 Experimental Results; 6.1 Subject Quality Measurement Models Performance; 6.2 Speech Enhancement Models Performance using SQM Model as a Postfilter; 7 Conclusion and Future Work; Abstract	-
dc.language.iso	eng	-
dc.publisher	Seoul National University Graduate School	-
dc.subject.ddc	621.3	-
dc.title	Deep Neural Network-based Speech Enhancement with Subject Quality Measurement Model	-
dc.title.alternative	음성 지표 측정 모델을 이용한 음성 향상 심층신경망	-
dc.type	Thesis	-
dc.type	Dissertation	-
dc.contributor.AlternativeAuthor	Kim Ji Hwan	-
dc.description.degree	Master	-
dc.contributor.affiliation	College of Engineering, Dept. of Electrical and Computer Engineering	-
dc.date.awarded	2019-02	-
dc.contributor.major	Speech Signal Processing	-
dc.identifier.uci	I804:11032-000000154794	-
dc.identifier.holdings	000000000026▲000000000039▲000000154794▲	-
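
The abstract describes multi-task training that minimizes two values together: the mean square error of the speech features and the mean square error of the quality metric. The sketch below is only an illustration of such a combined objective. The function names, the weighted-sum form, and the weight `alpha` are assumptions for illustration and are not taken from the thesis, which does not state in this record how the two terms are combined.

```python
# Illustrative sketch only: the weighted-sum form, the names, and the
# weight alpha are assumptions, not details taken from the thesis.

def mse(pred, target):
    """Mean squared error between two equal-length sequences of floats."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def multitask_loss(enhanced_feat, clean_feat, predicted_metric,
                   target_metric, alpha=1.0):
    """Feature MSE plus alpha times the squared error of the metric.

    enhanced_feat / clean_feat: speech features of the enhanced and
    clean signals (hypothetical flat lists of floats).
    predicted_metric: score estimated by the measurement model for the
    enhanced speech (for example a STOI-like value).
    target_metric: score the enhancement model is pushed toward.
    """
    feature_term = mse(enhanced_feat, clean_feat)
    metric_term = mse([predicted_metric], [target_metric])
    return feature_term + alpha * metric_term
```

Under these assumptions, a larger `alpha` puts more weight on matching the target metric and less on matching the clean speech features.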

Appears in Collections:

College of Engineering/Engineering Practice School (공과대학/대학원)
- Dept. of Electrical and Computer Engineering (전기·정보공학부)
  - Theses (Master's Degree_전기·정보공학부)

Files in This Item:

000000154794.pdf 4.45 MB
