Detailed Information

Musical Instrument Identification and Tone Detection Using Feature Learning: 특징 학습을 통한 악기 식별과 음색 분류

dc.contributor.advisor: 이교구
dc.contributor.author: 한윤창
dc.date.accessioned: 2017-07-14T01:49:16Z
dc.date.available: 2017-07-14T01:49:16Z
dc.date.issued: 2017-02
dc.identifier.other: 000000142580
dc.identifier.uri: https://hdl.handle.net/10371/122373
dc.description: Thesis (Ph.D.) -- Seoul National University Graduate School: Department of Transdisciplinary Studies, February 2017. Advisor: 이교구.
dc.description.abstract: In the field of music information retrieval (MIR), most tasks have relied heavily on hand-crafted features, which are useful for measuring and quantifying specific characteristics of a sound such as pitch, roughness, and brightness. However, a growing number of attempts apply feature learning techniques, which have shown superior performance across research fields, to MIR tasks, especially when the goal is identification of the sound. The aim of this thesis is to advance the state of the art in musical instrument identification and tone detection using feature learning approaches, which can serve various music-related applications including, but not limited to, music search, browsing, recommendation, and education. We use sparse feature learning and convolutional neural networks to learn features from input data, and propose a network architecture and data processing framework suited to music signals. We present experimental results on MIR tasks such as fingering detection of overblown flute sounds, instrument identification in monophonic sound, and predominant instrument recognition in real-world polyphonic music. In addition, we conducted extensive experiments to find optimal data processing techniques and parameters, including input frame sampling, frequency scaling, activation pooling, window/hop size, output aggregation method, and network training hyperparameters.
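The abstract describes a common identification pipeline: a time-frequency front end, features learned from data, window-level classification, and aggregation of window-level outputs into a clip-level decision. The sketch below is a minimal illustration of that pipeline, not the thesis implementation; the librosa front end, the Keras layer shapes, the one-second analysis window, the mean aggregation, and the 11-class multi-label output (matching the IRMAS instrument set used in Chapter 5) are all assumptions made for the example.

```python
# Minimal sketch of a log-mel + CNN instrument recognizer.
# All constants are illustrative assumptions, not values from the thesis.
import numpy as np
import librosa
import tensorflow as tf

SR = 22050        # sampling rate (assumed)
N_MELS = 128      # mel bands (assumed)
FRAMES = 43       # ~1 s of audio at hop_length=512 (assumed window size)
N_CLASSES = 11    # IRMAS-style instrument set (assumed)

def log_mel_windows(path):
    """Slice a recording into fixed-size log-mel spectrogram windows."""
    y, sr = librosa.load(path, sr=SR, mono=True)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=1024,
                                         hop_length=512, n_mels=N_MELS)
    logmel = librosa.power_to_db(mel)
    if logmel.shape[1] < FRAMES:                      # pad short clips
        logmel = np.pad(logmel, ((0, 0), (0, FRAMES - logmel.shape[1])))
    starts = range(0, logmel.shape[1] - FRAMES + 1, FRAMES)
    wins = [logmel[:, s:s + FRAMES] for s in starts]  # non-overlapping windows
    return np.stack(wins)[..., np.newaxis]            # (n, N_MELS, FRAMES, 1)

def build_model():
    """Small CNN; depths and kernel sizes are placeholders."""
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(N_MELS, FRAMES, 1)),
        tf.keras.layers.Conv2D(32, 3, padding="same", activation="relu"),
        tf.keras.layers.MaxPooling2D(2),
        tf.keras.layers.Conv2D(64, 3, padding="same", activation="relu"),
        tf.keras.layers.GlobalMaxPooling2D(),          # max-pool over time/freq
        tf.keras.layers.Dense(N_CLASSES, activation="sigmoid"),  # multi-label
    ])

def predict_clip(model, path):
    """Aggregate window-level predictions into one clip-level estimate."""
    probs = model.predict(log_mel_windows(path))
    return probs.mean(axis=0)                         # mean output aggregation
```

Swapping the aggregation rule (mean vs. max), the window size, or the pooling layers corresponds to the processing variables the abstract lists as experimental factors; a sparse-coding front end with activation pooling, as in Chapters 3 and 4, would replace the convolutional layers with dictionary learning over sampled frames and per-clip pooled activations.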
dc.description.tableofcontents:
Chapter 1 Introduction 1
1.1 Music Information Retrieval 2
1.2 Musical Instrument Identification and Tone Detection 5
1.3 Related Works 7
1.4 Tasks of Interest 9
1.5 Contributions 11
Chapter 2 Overview of MIR Systems for Identification Tasks 16
2.1 Input Data Representation 17
2.1.1 Time Domain Representation 17
2.1.2 Time-frequency Representation 18
2.1.3 Cepstral Domain Representation 19
2.2 Feature Extraction 20
2.2.1 Conventional Hand-crafted Features 21
2.2.2 Feature Learning Approaches 22
2.3 Preprocessing 24
2.4 Feature Learning Algorithms 25
2.4.1 Sparse Feature Learning 26
2.4.2 Convolutional Neural Network 28
Chapter 3 Detecting Fingering of Overblown Flute Sound 30
3.1 Introduction 30
3.2 Existing Approaches 33
3.3 Spectral Characteristics of Overblown Tone 33
3.4 System Architecture 35
3.4.1 Preprocessing 36
3.4.2 Feature Learning Strategy 37
3.4.3 Max-pooling 38
3.4.4 Classification 38
3.5 Evaluation 39
3.5.1 Dataset Specification 39
3.5.2 Experiment Settings 40
3.6 Results 42
3.6.1 Comparison with MFCCs 42
3.6.2 Effect of Max-pooling 44
3.6.3 Effect of the Number of Hidden Units 44
3.6.4 Effect of Classifier 45
3.6.5 Comparison with PCA/LDA method 46
3.7 Conclusions 48
Chapter 4 Instrument Identification in Monophonic Sound 50
4.1 Introduction 50
4.2 Data Processing Pipeline 52
4.2.1 Preprocessing 52
4.2.2 Frame Sampling Methods for Dictionary Learning 53
4.2.3 Activation Pooling 55
4.2.4 Classification 57
4.3 Evaluation 58
4.3.1 Dataset Specification 58
4.3.2 Experiment Settings 59
4.4 Results 61
4.4.1 Effect of the Sampling Method 61
4.4.2 Effect of the Pooling Method 62
4.4.3 Effect of the DFT size 65
4.4.4 Effect of the Dictionary Size 67
4.4.5 Effect of Frequency Scaling 68
4.4.6 Comparison to MFCCs 68
4.5 Discussion 68
4.5.1 Sampling and Pooling Method 68
4.5.2 DFT Size and the Dictionary Size 70
4.5.3 Frequency Scaling and Comparison to MFCCs 70
4.5.4 Comparison to Human Performance and Limitation of Research 71
4.6 Conclusions 72
Chapter 5 Predominant Instrument Recognition in Polyphonic Music 75
5.1 Introduction 75
5.2 Proliferation of Deep Neural Networks in Music Information Retrieval 77
5.3 System Architecture 79
5.3.1 Audio Preprocessing 79
5.3.2 Network Architecture 80
5.3.3 Training Configuration 82
5.3.4 Activation Function 83
5.4 Evaluation 85
5.4.1 IRMAS Dataset 85
5.4.2 Testing Configuration 86
5.4.3 Performance Evaluation 89
5.5 Results 90
5.5.1 Effect of Analysis Window Size 91
5.5.2 Effect of Max-pooling and Model Ensemble 92
5.5.3 Effect of Activation Function 95
5.5.4 Comparison to Existing Algorithms 96
5.5.5 Analysis of Instrument-Wise Identification Performance 98
5.5.6 Analysis on Single Predominant Instrument Identification 100
5.5.7 Qualitative Analysis with Visualization Methods 102
5.6 Conclusions 105
Chapter 6 Conclusions and Future Work 109
초록 (Abstract in Korean) 134
dc.format: application/pdf
dc.format.extent: 6503772 bytes
dc.format.medium: application/pdf
dc.language.iso: en
dc.publisher: 서울대학교 대학원 (Seoul National University Graduate School)
dc.subject: music information retrieval
dc.subject: deep learning
dc.subject: convolutional neural network
dc.subject: sparse feature learning
dc.subject: musical instrument identification
dc.subject.ddc: 620
dc.title: Musical Instrument Identification and Tone Detection Using Feature Learning
dc.title.alternative: 특징 학습을 통한 악기 식별과 음색 분류
dc.type: Thesis
dc.contributor.AlternativeAuthor: Yoonchang Han
dc.description.degree: Doctor
dc.citation.pages: 135
dc.contributor.affiliation: 융합과학기술대학원 융합과학부 (Graduate School of Convergence Science and Technology, Department of Transdisciplinary Studies)
dc.date.awarded: 2017-02