Detailed Information

Acoustic Modeling using Adversarially Trained Variational Recurrent Neural Network for Speech Synthesis

DC Field | Value | Language
-------- | ----- | --------
dc.contributor.author | Lee, Joun Yeop | -
dc.contributor.author | Cheon, Sung Jun | -
dc.contributor.author | Choi, Byoung Jin | -
dc.contributor.author | Kim, Nam Soo | -
dc.contributor.author | Song, Eunwoo | -
dc.date.accessioned | 2022-10-26T07:22:06Z | -
dc.date.available | 2022-10-26T07:22:06Z | -
dc.date.created | 2022-10-21 | -
dc.date.issued | 2018-09 | -
dc.identifier.citation | 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, pp.917-921 | -
dc.identifier.issn | 2308-457X | -
dc.identifier.uri | https://hdl.handle.net/10371/186836 | -
dc.description.abstract | In this paper, we propose a variational recurrent neural network (VRNN)-based method for modeling and generating speech parameter sequences. In recent years, the performance of speech synthesis systems has improved over conventional techniques thanks to deep learning-based acoustic models. Among the popular deep learning techniques, recurrent neural networks (RNNs) have been successful at efficiently modeling time-dependent sequential data. However, due to the deterministic nature of RNN prediction, such models do not reflect the full complexity of highly structured data such as natural speech. In this regard, we propose the adversarially trained variational recurrent neural network (AdVRNN), which uses a VRNN to better represent the variability of natural speech for acoustic modeling in speech synthesis. We also apply an adversarial learning scheme when training the AdVRNN to overcome the oversmoothing problem. We conducted comparative experiments between the proposed VRNN and the conventional gated recurrent unit (GRU), one type of RNN, for a speech synthesis system. It is shown that the proposed AdVRNN-based method performed better than the conventional GRU technique. | -
dc.language | English | -
dc.publisher | ISCA-INT SPEECH COMMUNICATION ASSOC | -
dc.title | Acoustic Modeling using Adversarially Trained Variational Recurrent Neural Network for Speech Synthesis | -
dc.type | Article | -
dc.identifier.doi | 10.21437/Interspeech.2018-1598 | -
dc.citation.journaltitle | 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES | -
dc.identifier.wosid | 000465363900191 | -
dc.identifier.scopusid | 2-s2.0-85054971121 | -
dc.citation.endpage | 921 | -
dc.citation.startpage | 917 | -
dc.description.isOpenAccess | N | -
dc.contributor.affiliatedAuthor | Kim, Nam Soo | -
dc.type.docType | Proceedings Paper | -
dc.description.journalClass | 1 | -
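
The abstract describes combining a VRNN objective with an adversarial loss. As a rough sketch only (our notation and weighting, not taken from the paper, whose exact formulation is not given in this record), the per-utterance training objective of such a model typically augments the standard VRNN evidence lower bound with a weighted adversarial term:

\[
\mathcal{L}_{\mathrm{AdVRNN}}
= \sum_{t} \Big[ \mathbb{E}_{q(z_t \mid x_{\le t},\, z_{<t})} \log p(x_t \mid z_{\le t},\, x_{<t})
- \mathrm{KL}\big( q(z_t \mid x_{\le t},\, z_{<t}) \,\|\, p(z_t \mid x_{<t},\, z_{<t}) \big) \Big]
- \lambda\, \mathcal{L}_{\mathrm{adv}},
\]

where \(x_t\) and \(z_t\) are the speech parameters and latent variables at frame \(t\), \(\mathcal{L}_{\mathrm{adv}}\) is the generator's loss against a discriminator trained to distinguish generated parameter trajectories from natural ones (this is the term intended to counter oversmoothing), and \(\lambda\) is a balancing weight.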
Appears in Collections:
Files in This Item:
There are no files associated with this item.

Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.