Prediction of stock price, base rate, and interest rate spread with text data
텍스트 데이터를 이용한 주식 가격, 기준 금리 및 스프레드 예측
- College of Engineering, Dept. of Industrial Engineering
- Issue Date
- Seoul National University Graduate School
- Data mining; Machine learning; Word embedding; Distributed representation; Bag-of-words; Sentiment analysis; Stock price prediction; Vote result prediction of monetary policy committee; Bond spread prediction; Corporate disclosures; Monetary policy documents; Economic indicators
- Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Dept. of Industrial Engineering, 2018. 2. 조성준.
- Financial research methodologies based on a variety of prediction models have been actively developed for the analysis of market behavior. The significance of prediction modeling in the financial market cannot be overstated, especially given that it leads directly to large transaction profits. To be applicable for active agents in the market, these research results require both predictability and interpretability. In this study, we propose methodologies that incorporate the distinct characteristics of different financial data into the analysis for effective prediction modeling. First, we propose a methodology that quantitatively and qualitatively predicts stock price movements through sentiment analysis of corporate disclosures in the stock market. The proposed method predicts stock price movements by embedding the documents, together with document classes defined to fit the purpose of our study, in the same projection space based on learned distributed representations, and its predictive performance is compared against various existing models. The results provide evidence of the effectiveness of our predictions through visualization of document sentiments. In addition, we propose a methodology specifically designed for predicting the vote results on the base interest rate, the most important factor in the bond market, developed within the premise of the Korean bond market. Our methodology computes sentence sentiments from the monetary policy decision text, which is released before the vote result is announced; these sentence sentiments are then aggregated to the document level to express the document sentiment of the monetary policy decision as a numeric value. Using these sentiments, we predict the vote results of the base rate. Finally, we define a framework for predicting the spread, the difference between two bond rates with different maturities.
The framework mainly considers three aspects as standards for the effectiveness of research: interpretability, proper prediction metrics, and reporting methods. It uses wrapper approaches for the practical interpretation of important variables, and uses PARE in combination with MAE as prediction metrics to take into account the tolerance of the spread. Lastly, we suggest various visualizations and a hierarchical illustration of significant variables as more applicable and effective reporting methods. This dissertation defines a variety of financial problems, proposes analytical methodologies, compares quantitative prediction power, and provides qualitative evidence. The proposed methodologies serve as a quick and accurate data-driven decision-making support tool for active agents in the field.
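
As a rough illustration of two quantities named in the abstract, the sketch below computes a spread as the difference between two bond rates and scores hypothetical spread predictions with mean absolute error (MAE). It is not the dissertation's implementation: the function names, the long-minus-short sign convention, and all numbers are assumptions made for illustration, and PARE is left out because the abstract does not define it.

```python
from typing import Sequence


def spread(long_rate: float, short_rate: float) -> float:
    """Difference between two bond rates with different maturities.

    Assumption: long-maturity rate minus short-maturity rate.
    """
    return long_rate - short_rate


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical rates in percent, chosen only to make the arithmetic easy.
long_rates = [2.5, 2.25, 2.0]
short_rates = [1.5, 1.5, 1.5]
actual_spreads = [spread(l, s) for l, s in zip(long_rates, short_rates)]

# Hypothetical model outputs for the same three dates.
predicted_spreads = [1.25, 0.5, 0.75]

mae = mean_absolute_error(actual_spreads, predicted_spreads)
print(f"actual spreads: {actual_spreads}, MAE: {mae}")
```

MAE weights every error equally regardless of the size of the spread, which may be one reason the abstract pairs it with a second, tolerance-aware metric.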