Regularizing Structural Equation Models via the Lasso : Generalizability and Reproducibility Issues

DC Field Value Language
dc.contributor.advisor 김청택 -
dc.contributor.author 강인한 -
dc.date.accessioned 2017-07-19T12:23:08Z -
dc.date.available 2017-07-19T12:23:08Z -
dc.date.issued 2016-08 -
dc.identifier.other 000000136383 -
dc.identifier.uri https://hdl.handle.net/10371/134405 -
dc.description Thesis (Master's) -- Seoul National University Graduate School : Department of Psychology, Quantitative Psychology major, 2016. 8. 김청택. -
dc.description.abstract Generalizability and reproducibility of research have become one of the main topics in current psychology. Previous discussions of the issue have focused on the experimental/procedural aspect, such as the incentive structure for researchers, violations in conducting experiments, and selective reporting.
However, statistical methods that are widely used in psychology sometimes have properties that undermine the generalizability of research results. The present thesis approaches the reproducibility problem from this analytical/statistical aspect. For this purpose, we study a method for improving Structural Equation Modeling (SEM), one of the dominant statistical models in psychology. The main focus of this study is applying L1-regularization, or the Lasso, to SEM. With this method, estimates are expected to vary less than under the existing Maximum Likelihood method.
First, the present thesis discusses several indices, including Overall Discrepancy (OD) and Mean Squared Error (MSE), as criteria that indicate the generalizability and reproducibility of analysis results. Bayesian Lasso SEM, a previous attempt, is also covered along with some of its fundamental issues. Furthermore, an algorithm for regularizing SEM via the Lasso is derived and examined in several simulation studies.
The simulations use a Factor Analysis Model and a Structural Equation Model, each with several misspecified parameters added. The purpose of this approach is to test Lasso SEM's complete shrinkage ability, that is, its ability to detect and remove unnecessary parameters from the original model so that the result is close to the true population-generating process. We also investigate whether the Lasso can improve generalizability and reproducibility by observing and comparing OD and MSE. The simulation covers various conditions, including model error, sample sizes, and magnitudes of the covariance matrix, in order to examine under which conditions Lasso SEM yields better results than Maximum Likelihood Estimation.
The results reveal that Lasso SEM works well in various conditions: it improves generalizability indices and detects and removes misspecified parameters in the original model. However, its performance depends on the conditions, which implies that Lasso SEM should be applied with careful scrutiny of the characteristics of practical data. In particular, model error, one of the components affecting the data-generating process, turned out to be the most influential factor hindering the proper functioning of Lasso SEM. We suggest modifying the optimization of Lasso SEM, which currently relies on the value of OD or its cross-validation estimate. The improvement can be achieved by replacing the criterion or the objective function in the optimization procedure. This would minimize problems, including those generated by model error.
A correlation analysis shows that Sample Discrepancy, the criterion of the existing estimation method, and the goodness-of-fit indices widely used in SEM have considerably low correlations with OD. This outcome implies that SEM results obtained by the original method may be hard to generalize to other independent samples, including future data, and to the phenomenon that researchers are interested in.
-
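The "complete shrinkage" that the abstract attributes to the Lasso can be illustrated with a minimal sketch. It is not the thesis's algorithm: it assumes a single parameter with a simple quadratic loss, for which the L1-penalized minimizer is the well-known soft-thresholding rule. The function name, the example estimates, and the penalty weight are all hypothetical and chosen only for illustration.

```python
def soft_threshold(z, lam):
    """Return the b that minimizes 0.5 * (b - z)**2 + lam * abs(b).

    z   : unpenalized estimate of one parameter
    lam : non-negative L1 penalty weight
    """
    if z > lam:
        return z - lam   # shrink positive estimates toward zero
    if z < -lam:
        return z + lam   # shrink negative estimates toward zero
    return 0.0           # small estimates are set exactly to zero


# Hypothetical unpenalized estimates: three substantive loadings followed by
# two small cross-loadings that a misspecified model added unnecessarily.
estimates = [0.80, 0.65, -0.70, 0.05, -0.03]
lam = 0.10  # illustrative penalty weight, not a value from the thesis

shrunk = [soft_threshold(z, lam) for z in estimates]

for before, after in zip(estimates, shrunk):
    print(f"{before:+.2f} -> {after:+.2f}")
```

In this toy setting the two small cross-loadings become exactly zero, which corresponds to removing them from the model, while the larger loadings are only pulled toward zero by the penalty weight. That pull is the bias the Lasso accepts in exchange for lower variance.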
dc.description.tableofcontents Introduction 1
Reproducibility Issues in Psychological Researches 1
Analytical/Statistical Approach to Reproducibility Issues 3
Generalizability in Structural Equation Modeling 7
Thesis Organization 11

Chapter 1 Structural Equation Modeling 13
1.1 Introduction to SEM 13
1.1.1 Measurement Model Part 13
1.1.2 Structural Model Part 16
1.2 Estimation of SEM 20
1.3 Fit Indices for Model Evaluation 24
1.4 Reproducibility and Generalizability Issues in SEM 32

Chapter 2 Regularization 41
2.1 Bias, Variance and MSE 41
2.2 Shrinkage Estimation 45
2.3 Regularization 47
2.3.1 Ridge (Hoerl & Kennard, 1970a, b) 50
2.3.2 Lasso (Tibshirani, 1996) 52
2.3.3 Elastic Net (Zou & Hastie, 2005) 54
2.4 The Connection between Regularization and Bayesian Analysis 57
2.4.1 Bayesian Linear Regression Analysis 57
2.4.2 BLasso: Bayesian Lasso 59
2.5 Regularization and Structural Equation Modeling 64
2.6 Some optimization methods for Lasso 71
2.6.1 LARS Algorithm 71
2.6.2 MM-Algorithm 76

Chapter 3 Bayesian Structural Equation Modeling 81
3.1 Basic Approach 81
3.1.1 Bayesian Factor Analysis 82
3.1.2 Bayesian Structural Equation Modeling 84
3.2 Bayesian Regularization for SEM 86
3.2.1 Bayesian Lasso for Factor Analysis 86
3.2.2 Bayesian Lasso for Structural Equation Modeling 88
3.3 Limitation 90

Chapter 4 Implementing Lasso to Structural Equation Modeling 95
4.1 Likelihood Functions in SEM 96
4.1.1 Measurement Model Part 96
4.1.2 Structural Model Part 98
4.2 Double EM-algorithm for L1-Regularized SEM 103
4.2.1 E-step : Compute Conditional Expectations of Likelihood Functions 105
4.2.2 M-step : Minimizing the target function 110
4.2.3 Optimization Methods for M-step 113
4.3 Further Issues in fitting Lasso SEM 121
4.3.1 Rescaling Issue for the Measurement Model 121
4.3.2 A Standardization Issue in M-step 125
4.3.3 Tuning Methods for L1-Regularized SEM 129
4.4 Result Algorithm for Lasso SEM 132

Chapter 5 Simulation Study : Method 135
5.1 Purposes of Research 135
5.2 Generating Population 137
5.3 Research Models 141
5.4 Research Conditions 149
5.5 Indices in Simulation Study 159
5.6 Flow of Simulation 162

Chapter 6 Simulation Study : Result 165
6.1 Research 1: Factor Analysis Model 165
6.2 Research 2: Structural Equation Model 189
6.3 Research 3: Additional Analyses 207
6.3.1 DA and Bias Analysis 207
6.3.2 Correlation Analysis of Fit Indices 214

Chapter 7 Discussion 221

References 239

Appendix 249
Appendix A : Result Tables 249
A1. Result Tables for Lasso 249
A2. Result Tables for BLasso 262
Appendix B : Standardization of SEM 275
B1. Factor Analysis Model 275
B2. Structural Equation Model 276
Appendix C : Some Derivations 277
C1. Derivation of Latent Variable Covariance Matrix 277
C2. Derivations for Some Posterior Distributions of BLasso SEM 278
Appendix D : R functions for Lasso SEM 282
Appendix E : Generating Population in SEM 290

Abstract (in Korean) 297
-
dc.format application/pdf -
dc.format.extent 10129806 bytes -
dc.format.medium application/pdf -
dc.language.iso en -
dc.publisher Seoul National University Graduate School -
dc.subject Reproducibility -
dc.subject Structural Equation Modeling -
dc.subject Lasso -
dc.subject.ddc 150 -
dc.title Regularizing Structural Equation Models via the Lasso : Generalizability and Reproducibility Issues -
dc.title.alternative Regularizing Structural Equation Models via the Lasso : Generalizability and Reproducibility Issues -
dc.type Thesis -
dc.contributor.AlternativeAuthor Inhan Kang -
dc.description.degree Master -
dc.citation.pages xii, 298 -
dc.contributor.affiliation College of Social Sciences, Department of Psychology -
dc.date.awarded 2016-08 -

Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.
