Automatic Transcription of Singing Voice Signals

허훈

Automatic Transcription of Singing Voice Signals : 노래 신호의 자동 전사

DC Field	Value	Language
dc.contributor.advisor	이교구	-
dc.contributor.author	허훈	-
dc.date.accessioned	2017-10-27T17:02:46Z	-
dc.date.available	2017-10-27T17:02:46Z	-
dc.date.issued	2017-08	-
dc.identifier.other	000000145578	-
dc.identifier.uri	https://hdl.handle.net/10371/137039	-
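dc.description	Thesis (Ph.D.) -- Seoul National University, Graduate School of Convergence Science and Technology, Dept. of Transdisciplinary Studies, August 2017. 이교구.	-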
dc.description.abstract	Automatic music transcription is the automatic extraction of musical attributes, such as notes, from an audio signal into a symbolic representation. Symbolic music data serve various purposes, such as music education and production, by providing higher-level information to both consumers and creators. Although the singing voice is the easiest of all music signals to listen to and to perform, traditional transcription methods designed for musical instruments are not suitable for it because of the acoustic complexity of the human voice. The main goal of this thesis is to develop a fully automatic singing transcription system that outperforms existing methods. We first review typical approaches to pitch tracking and onset detection, the two fundamental tasks of music transcription, and then propose several methods for each task. For pitch tracking, we examine the effect of data sampling on the performance of periodicity analysis of music signals. For onset detection, the local homogeneity in the harmonic structure is exploited through cepstral analysis and unsupervised classification. The final transcription system includes feature extraction and a probabilistic model of the harmonic structure, and note transition based on the hidden Markov model. It achieved the best performance (an F-measure of 82%) in a note-level evaluation that included state-of-the-art systems.	-
dc.description.tableofcontents	Chapter 1 Introduction; 1.1 Motivation; 1.2 Definitions; 1.2.1 Musical keywords; 1.2.2 Scientific keywords; 1.2.3 Representations; 1.3 Problems in singing transcription; 1.4 Topics of interest; 1.5 Outline of the thesis; Chapter 2 Background; 2.1 Pitch estimation; 2.1.1 Time-domain methods; 2.1.2 Frequency-domain methods; 2.2 Note segmentation; 2.2.1 Onset detection; 2.2.2 Offset detection; 2.3 Singing transcription; 2.4 Evaluation methodology; 2.4.1 Pitch estimation; 2.4.2 Note segmentation; 2.4.3 Dataset; 2.5 Summary; Chapter 3 Periodicity Analysis by Sampling in the Time/Frequency Domain for Pitch Tracking; 3.1 Introduction; 3.2 Data sampling; 3.3 Sampled ACF/DF in the time domain; 3.4 Sampled ACF/DF in the frequency domain; 3.5 Iterative F0 estimation; 3.6 Experimental setup; 3.7 Result; 3.8 Summary; Chapter 4 Note Onset Detection based on Harmonic Cepstrum Regularity; 4.1 Introduction; 4.2 Cepstral analysis; 4.3 Harmonic cepstrum regularity; 4.3.1 Harmonic quefrency selection; 4.3.2 Sub-harmonic regularity function; 4.3.3 Adaptive thresholding; 4.3.4 Picking onsets; 4.4 Experiments; 4.4.1 Dataset description; 4.4.2 Evaluation results; 4.5 Summary; Chapter 5 Robust Singing Transcription System using Local Homogeneity in the Harmonic Structure; 5.1 Introduction; 5.2 F0 tracking; 5.3 Feature extraction; 5.4 Mixture model; 5.5 Note detection; 5.5.1 Transition boundary detection; 5.5.2 Note boundary selection; 5.5.3 Note pitch decision; 5.6 Evaluation; 5.6.1 Dataset; 5.6.2 Criteria and measures; 5.6.3 Experimental setup; 5.7 Results and discussions; 5.7.1 Failure analysis; 5.8 Summary; Chapter 6 Conclusion and Future Work; 6.1 Contributions; 6.2 Future work; 6.2.1 Precise partial tracking using instantaneous frequency; 6.2.2 Linguistic model for note segmentation; Appendix: Derivation of the instantaneous frequency; Bibliography; Abstract in Korean (초록)	-
dc.format	application/pdf	-
dc.format.extent	6443188 bytes	-
dc.format.medium	application/pdf	-
dc.language.iso	en	-
dc.publisher	Seoul National University Graduate School of Convergence Science and Technology (서울대학교 융합과학기술대학원)	-
dc.subject	automatic music transcription	-
dc.subject	music information retrieval	-
dc.subject	onset detection	-
dc.subject	pitch estimation	-
dc.subject	singing voice	-
dc.subject	harmonic structure	-
dc.subject.ddc	620.5	-
dc.title	Automatic Transcription of Singing Voice Signals	-
dc.title.alternative	노래 신호의 자동 전사	-
dc.type	Thesis	-
dc.description.degree	Doctor	-
dc.contributor.affiliation	Graduate School of Convergence Science and Technology, Dept. of Transdisciplinary Studies (융합과학기술대학원 융합과학부)	-
dc.date.awarded	2017-08	-

Appears in Collections:

Graduate School of Convergence Science and Technology (융합과학기술대학원)
- Dept. of Transdisciplinary Studies(융합과학부)
  - Theses (Ph.D. / Sc.D._융합과학부)

Files in This Item:

000000145578.pdf 6.14 MB
