Generative Topic Model Using Variational Bayesian Inference : 변분 추론을 통한 주제 생성 모형에 관한 연구
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 최형인 | - |
dc.contributor.author | 정남규 | - |
dc.date.accessioned | 2018-05-28T17:11:49Z | - |
dc.date.available | 2018-05-28T17:11:49Z | - |
dc.date.issued | 2018-02 | - |
dc.identifier.other | 000000150103 | - |
dc.identifier.uri | https://hdl.handle.net/10371/141143 | - |
dc.description | Thesis (Ph.D.) -- Seoul National University Graduate School: College of Natural Sciences, Department of Mathematical Sciences, February 2018. Advisor: 최형인. | - |
dc.description.abstract | In this thesis, we propose a new model that is a continuous extension of LDA and discuss several applications of it. The proposed model, called the Continuous Semantic Topic Embedding Model (CSTEM), is based on the Latent Dirichlet Allocation model with a continuity assumption on the word-topic distribution. This assumption introduces a new parameter into the probabilistic model: a global parameter that reflects how likely a given word is to occur in a document regardless of the topic variable.
We validate the model from several points of view and argue that it outperforms other topic models and is well suited to certain purposes. The validation is carried out through experiments on a variety of corpora. We also show that the model can be applied in other settings, such as a time-dependent topic model, which makes it easy to analyze how topics trend over time. | - |
dc.description.tableofcontents | 1 Introduction 1
  1.1 Literature Review 3
  1.2 Motivations 5
  1.3 Contributions 6
  1.4 Thesis Overview 7
2 Background 9
  2.1 Latent Dirichlet Allocation 9
    2.1.1 Variational Bayesian Inference 13
    2.1.2 Markov Chain Monte Carlo 17
  2.2 Gaussian LDA 23
  2.3 Auto-encoding Variational Bayes 25
  2.4 Topic coherence 27
3 Continuous Semantic Topic Embedding 30
  3.1 Methodology 30
    3.1.1 Continuous Semantic Distance Function 30
    3.1.2 Global Weight Parameter of Words 32
    3.1.3 Continuous Word-Topic Distribution 33
    3.1.4 Inference Using Variational Autoencoder 34
  3.2 Experiments 38
    3.2.1 Practical Issues 38
    3.2.2 Data 39
    3.2.3 Description 39
    3.2.4 Measures 40
  3.3 Results 41
    3.3.1 Perplexity 41
    3.3.2 Topic coherence 42
    3.3.3 Continuous Embedding 44
  3.4 Summary 47
4 Time-Dependent Continuous Topic Model 48
  4.1 Methodology 48
    4.1.1 Topic Over Time 48
    4.1.2 Variational Auto-encoding Topic Over Time 51
    4.1.3 Time-Dependent CSTEM 52
  4.2 Experiments 54
  4.3 Results 55
5 Conclusion 58
  5.1 Summary 58
  5.2 Future Works 59
Bibliography 60
A Some Appendices 67
  A.1 Word-Topic Distributions 67
  A.2 Calculation of KL-Divergence 71
Abstract (in Korean) 73 | - |
dc.format | application/pdf | - |
dc.format.extent | 4068742 bytes | - |
dc.format.medium | application/pdf | - |
dc.language.iso | en | - |
dc.publisher | 서울대학교 대학원 | - |
dc.subject | Topic model | - |
dc.subject | Natural Language Processing | - |
dc.subject | Generative Model | - |
dc.subject | Variational inference | - |
dc.subject | Neural Network | - |
dc.subject.ddc | 510 | - |
dc.title | Generative Topic Model Using Variational Bayesian Inference | - |
dc.title.alternative | 변분 추론을 통한 주제 생성 모형에 관한 연구 | - |
dc.type | Thesis | - |
dc.contributor.AlternativeAuthor | Namkyu Jung | - |
dc.description.degree | Doctor | - |
dc.contributor.affiliation | 자연과학대학 수리과학부 | - |
dc.date.awarded | 2018-02 | - |
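The abstract describes a word-topic distribution built from a continuous semantic distance between word and topic embeddings, plus a topic-independent global weight for each word. The thesis's exact distance function and parameterization are not given in this record, so the following is only a minimal illustrative sketch under the assumption that p(w|k) is a softmax of the negative squared Euclidean distance between embeddings plus a global weight c_w; all names (`word_vecs`, `topic_vecs`, `c`) are hypothetical.

```python
import numpy as np

# Illustrative sketch only: assumes p(w|k) ∝ exp(-||v_w - t_k||^2 + c_w),
# where v_w / t_k are continuous word / topic embeddings and c_w is a
# topic-independent global weight, as loosely described in the abstract.
rng = np.random.default_rng(0)
vocab_size, n_topics, embed_dim = 6, 2, 4

word_vecs = rng.normal(size=(vocab_size, embed_dim))   # word embeddings v_w
topic_vecs = rng.normal(size=(n_topics, embed_dim))    # topic embeddings t_k
c = rng.normal(size=vocab_size)                        # global word weights c_w

def word_topic_dist(topic_vecs, word_vecs, c):
    """Return an (n_topics, vocab_size) matrix whose rows are p(w|k)."""
    # squared Euclidean distance between every topic and every word
    d2 = ((topic_vecs[:, None, :] - word_vecs[None, :, :]) ** 2).sum(axis=-1)
    logits = -d2 + c[None, :]
    logits -= logits.max(axis=1, keepdims=True)        # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

beta = word_topic_dist(topic_vecs, word_vecs, c)
assert beta.shape == (n_topics, vocab_size)
assert np.allclose(beta.sum(axis=1), 1.0)              # each row is a distribution
```

A larger c_w raises a word's probability under every topic at once, which is the role the abstract attributes to the global parameter; the inference itself would be done with a variational autoencoder, which this sketch does not attempt.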
Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.