
Generative Topic Model Using Variational Bayesian Inference : 변분 추론을 통한 주제 생성 모형에 관한 연구 (A Study on Generative Topic Models via Variational Inference)

dc.contributor.advisor: 최형인
dc.contributor.author: 정남규
dc.date.accessioned: 2018-05-28T17:11:49Z
dc.date.available: 2018-05-28T17:11:49Z
dc.date.issued: 2018-02
dc.identifier.other: 000000150103
dc.identifier.uri: https://hdl.handle.net/10371/141143
dc.description: Doctoral dissertation -- Graduate School, Seoul National University: College of Natural Sciences, Department of Mathematical Sciences, February 2018. Advisor: 최형인.
dc.description.abstract: In this thesis, we propose a new model that is a continuous extension of LDA and discuss several applications of it. The proposed model, the Continuous Semantic Topic Embedding Model (CSTEM), extends Latent Dirichlet Allocation with a continuity assumption on the word-topic distribution. This assumption introduces a new parameter into the probabilistic model: a global weight that reflects how likely a given word is to occur in a document regardless of the topic variable.
We validate the model from several points of view and argue that it outperforms other topic models and is worth using for suitable tasks. The validation is carried out through experiments on a range of corpora. We also show that the model applies to further settings, such as a time-dependent topic model, which helps to analyze topic trends over time intuitively.
dc.description.tableofcontents:
1 Introduction 1
1.1 Literature Review 3
1.2 Motivations 5
1.3 Contributions 6
1.4 Thesis Overview 7
2 Background 9
2.1 Latent Dirichlet Allocation 9
2.1.1 Variational Bayesian Inference 13
2.1.2 Markov Chain Monte Carlo 17
2.2 Gaussian LDA 23
2.3 Auto-encoding Variational Bayes 25
2.4 Topic Coherence 27
3 Continuous Semantic Topic Embedding 30
3.1 Methodology 30
3.1.1 Continuous Semantic Distance Function 30
3.1.2 Global Weight Parameter of Words 32
3.1.3 Continuous Word-Topic Distribution 33
3.1.4 Inference Using Variational Autoencoder 34
3.2 Experiments 38
3.2.1 Practical Issues 38
3.2.2 Data 39
3.2.3 Description 39
3.2.4 Measures 40
3.3 Results 41
3.3.1 Perplexity 41
3.3.2 Topic Coherence 42
3.3.3 Continuous Embedding 44
3.4 Summary 47
4 Time-Dependent Continuous Topic Model 48
4.1 Methodology 48
4.1.1 Topic Over Time 48
4.1.2 Variational Auto-encoding Topic Over Time 51
4.1.3 Time-Dependent CSTEM 52
4.2 Experiments 54
4.3 Results 55
5 Conclusion 58
5.1 Summary 58
5.2 Future Work 59
Bibliography 60
A Some Appendices 67
A.1 Word-Topic Distributions 67
A.2 Calculation of KL-Divergence 71
Abstract (in Korean) 73
dc.format: application/pdf
dc.format.extent: 4068742 bytes
dc.format.medium: application/pdf
dc.language.iso: en
dc.publisher: Graduate School, Seoul National University
dc.subject: Topic model
dc.subject: Natural language processing
dc.subject: Generative model
dc.subject: Variational inference
dc.subject: Neural network
dc.subject.ddc: 510
dc.title: Generative Topic Model Using Variational Bayesian Inference
dc.title.alternative: 변분 추론을 통한 주제 생성 모형에 관한 연구 (A Study on Generative Topic Models via Variational Inference)
dc.type: Thesis
dc.contributor.AlternativeAuthor: Namkyu Jung
dc.description.degree: Doctor
dc.contributor.affiliation: Department of Mathematical Sciences, College of Natural Sciences
dc.date.awarded: 2018-02
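The abstract describes a word-topic distribution built from a continuous semantic distance between word and topic embeddings plus a global per-word weight. A minimal sketch of that idea, assuming a squared-Euclidean distance and softmax normalization over the vocabulary (the function names and the exact functional form here are illustrative assumptions, not the thesis's actual formulation):

```python
import numpy as np

def word_topic_distribution(word_vecs, topic_vecs, global_weights):
    """Sketch: p(w | k) proportional to exp(-||v_w - m_k||^2 + b_w),
    normalized over the vocabulary for each topic k.

    word_vecs:      (V, D) word embeddings
    topic_vecs:     (K, D) topic embeddings
    global_weights: (V,)   per-word weight b_w, independent of topic
    """
    # Squared Euclidean distance between every word and topic embedding.
    dists = ((word_vecs[:, None, :] - topic_vecs[None, :, :]) ** 2).sum(-1)
    # The global weight raises a word's probability under every topic alike.
    logits = -dists + global_weights[:, None]          # shape (V, K)
    # Numerically stable softmax over the vocabulary axis.
    logits -= logits.max(axis=0, keepdims=True)
    probs = np.exp(logits)
    return probs / probs.sum(axis=0, keepdims=True)

rng = np.random.default_rng(0)
V, K, D = 6, 2, 3  # toy vocabulary size, topic count, embedding dimension
phi = word_topic_distribution(rng.normal(size=(V, D)),
                              rng.normal(size=(K, D)),
                              rng.normal(size=V))
print(phi.shape)   # (6, 2); each column is a distribution over the vocabulary
```

Because the same `global_weights` term enters every topic's logits, a word with a large weight stays probable under all topics, which is the role the abstract assigns to the global parameter.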
Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.
