Symbolic Music Generation Network Based on Musical Feature Labels

Abstract: Automatic music generation has been studied very actively in recent years. Most of this work, however, treats music purely as data, and the handling of musical domain knowledge is either omitted or regarded as a difficult task. In particular, analyzing the characteristics of each bar and applying domain knowledge to them is essential in human composition, yet few studies address it in detail. We propose a model that generates music bar by bar, taking into account the musical characteristics of each bar and the priming (preceding) note conditions.

We first analyze symbolic music data using a Relational Pitch Pianoroll Representation, which increases the utility of the piano-roll based MIDI encoding method and allows the generated results to be used broadly in musical terms. Applying this representation, a variety of image-based models can be trained to produce meaningful results.

To generate symbolic music data, we train a Multi-vector Conditional Deep Convolutional Generative Adversarial Network to produce new MIDI data that reflects both the priming-note condition and the musical characteristic label. To obtain good combinations of musical skill labels across the sequence of bars, we use a Recurrent Neural Network with Long Short-Term Memory and Gated Recurrent Unit layers. The resulting model, FLAGNet, handles inputs such as the minimum note unit, the length of the music, and chart scales, and can generate impressive symbolic music that takes a variety of musical elements into account.
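The abstract does not spell out the Relational Pitch Pianoroll Representation, so the following is only a minimal sketch of one plausible reading: a binary piano roll whose rows are indexed by semitone offset from a reference tonic rather than by absolute MIDI pitch. The function name `relative_pitch_roll`, the `(pitch, start, duration)` note tuples, the `tonic` argument, and the `low`/`high` window are all hypothetical choices made for illustration and are not taken from the thesis.

```python
from typing import List, Tuple

# Hypothetical note format: (midi_pitch, start_step, duration_steps)
Note = Tuple[int, int, int]


def relative_pitch_roll(notes: List[Note], tonic: int, steps: int,
                        low: int = -12, high: int = 24) -> List[List[int]]:
    """Build a binary piano roll indexed by semitone offset from `tonic`.

    Row r corresponds to the offset (low + r); column t is a time step.
    Notes whose offset falls outside [low, high] are skipped.
    """
    height = high - low + 1
    roll = [[0] * steps for _ in range(height)]
    for pitch, start, duration in notes:
        row = pitch - tonic - low
        if not 0 <= row < height:
            continue  # outside the assumed pitch window
        for t in range(max(start, 0), min(start + duration, steps)):
            roll[row][t] = 1
    return roll


# The same arpeggio in C major and transposed to D major
c_major = [(60, 0, 4), (64, 4, 4), (67, 8, 8)]
d_major = [(p + 2, s, d) for p, s, d in c_major]

roll_c = relative_pitch_roll(c_major, tonic=60, steps=16)
roll_d = relative_pitch_roll(d_major, tonic=62, steps=16)
```

Under this assumed encoding, a phrase and its transposition map to the same grid, which is one way a pitch-relative roll could make image-style models less sensitive to key.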

Language: kor

URI: https://hdl.handle.net/10371/194100

https://dcollection.snu.ac.kr/common/orgView/000000174942

Files in This Item:

000000174942.pdf 7.19 MB

Appears in Collections:

Graduate School of Convergence Science and Technology (융합과학기술대학원)
- Dept. of Intelligence and Information (지능정보융합학과)
  - Theses (Master's Degree_지능정보융합학과)
