Neural networks for compressing and classifying speaker-independent paralinguistic signals

Cited 3 times in Web of Science; cited 2 times in Scopus
Authors

Byun, Seokhyun; Yoon, Seunghyun; Jung, Kyomin

Issue Date
2019-04
Publisher
IEEE
Citation
2019 IEEE INTERNATIONAL CONFERENCE ON BIG DATA AND SMART COMPUTING (BIGCOMP), pp.311-314
Abstract
Recognizing and classifying paralinguistic signals is an important problem with a wide range of applications. The task is generally considered challenging because the acoustic information in these signals is difficult to distinguish, even for humans; analyzing them with machine learning techniques is therefore a reasonable approach. Audio features extracted from paralinguistic signals are usually high-dimensional vectors encoding prosody, energy, cepstrum, and other speech-related information. Consequently, when the training corpus is not sufficiently large, applying machine learning methods to these signals is extremely difficult because of the high feature dimensionality. This paper addresses these limitations by exploiting the feature-learning abilities of neural networks. First, we use a neural network-based autoencoder to compress the signal and eliminate redundancy within the signal features, and we show that the compressed features remain competitive with the original features for distinguishing the signals. Second, we show experimentally that the neural network-based classification model almost always outperforms non-neural methods such as logistic regression, support vector machines, decision trees, and boosted trees.
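The abstract's first step, compressing high-dimensional audio features with an autoencoder before classification, can be sketched as follows. The paper's actual architecture, feature set, and dimensions are not given here, so this is only a minimal illustration: a single-hidden-layer linear autoencoder in NumPy, trained on synthetic data standing in for the real feature vectors.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for high-dimensional paralinguistic features
# (prosody, energy, cepstrum, ...); all dimensions are illustrative.
n_samples, n_features, n_compressed = 200, 64, 8
X = rng.standard_normal((n_samples, n_features))

def train_autoencoder(X, k, epochs=500, lr=0.05):
    """Single-hidden-layer linear autoencoder trained by plain
    gradient descent on the mean squared reconstruction error."""
    n, d = X.shape
    W_enc = rng.standard_normal((d, k)) * 0.1  # encoder weights
    W_dec = rng.standard_normal((k, d)) * 0.1  # decoder weights
    for _ in range(epochs):
        Z = X @ W_enc            # compressed codes
        X_hat = Z @ W_dec        # reconstruction
        err = X_hat - X
        # Gradients of 0.5 * mean squared reconstruction error
        g_dec = Z.T @ err / n
        g_enc = X.T @ (err @ W_dec.T) / n
        W_dec -= lr * g_dec
        W_enc -= lr * g_enc
    return W_enc, W_dec

W_enc, W_dec = train_autoencoder(X, n_compressed)
Z = X @ W_enc                    # 64-dim features -> 8-dim codes
recon_error = np.mean((Z @ W_dec - X) ** 2)
```

The compressed codes `Z` would then be fed to a downstream classifier in place of the original features; the paper's second contribution compares a neural classifier against logistic regression, SVMs, decision trees, and boosted trees on such features.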
ISSN
2375-933X
URI
https://hdl.handle.net/10371/186651
DOI
https://doi.org/10.1109/BIGCOMP.2019.8679115
Files in This Item:
There are no files associated with this item.
Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.
