Multi-source, unstructured and external data analytics for manufacturing process

Authors

고태훈

Advisor
조성준
Major
College of Engineering, Department of Industrial Engineering and Naval Architecture
Issue Date
2017-02
Publisher
Graduate School, Seoul National University
Keywords
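multi-source data; unstructured data; external data; machine learning; data integration; manufacturing process; new product development; quality management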
Description
Thesis (Ph.D.) -- Graduate School, Seoul National University : Department of Industrial Engineering and Naval Architecture, 2017. 2. 조성준.
Abstract
Data integration is the task of combining data of various types residing in different sources and providing the user with a unified view of these data. In this thesis, we treat data integration as the process of creating data marts to be used as input to machine learning and data mining models, from the viewpoint of data analysts and miners. Three types of problems are encountered in the data integration process: how to integrate (1) data from various sources, (2) different types of data, and (3) external data with internal data. To integrate these data, an enterprise must consider and solve several technical and managerial issues. To prove our concept, three real-world applications are introduced. Knowledge can be regarded as the most valuable asset of a manufacturing enterprise. Therefore, a manufacturing enterprise should collect data representing its processes and environments and analyze these data to build a sustainable knowledge model.
The first application generates user scenarios from online social media in the early steps of the new product development (NPD) process. Through strategic keyword searching, several novel user contexts are discovered from online social media. Based on these contexts, domain experts can generate user scenarios for new features and functions of the target product.
The second application constructs early engine fault detection models by integrating manufacturing, inspection, and after-sales service data. In most cases, production data and after-sales service data are managed by independent departments, or even by different companies. To detect engine faults, which represent customer-perceived quality, data integration is the key to generating an integrated data mart. In this application, one-class classification algorithms are used because of the class-imbalance problem. To address the multi-dimensionality of the time series data, the symbolic aggregate approximation (SAX) algorithm is used for data segmentation. Then, a binary genetic algorithm-based wrapper approach (BGA-wrapper) is applied to the segmented data to find the optimal feature subset. As a result, an anomaly score is calculated for each engine. Experimental results show that the proposed method can detect defective engines with high probability before they are shipped.
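As a rough illustration of the SAX step mentioned above, the following sketch shows the textbook form of the algorithm: z-normalize a series, average it over equal-width frames (piecewise aggregate approximation), and map each frame mean to a symbol using breakpoints that split the standard normal distribution into equiprobable regions. The function name `sax`, its parameters, and the sample series are assumptions made for this example; the thesis's actual segmentation settings are not given in the abstract.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def sax(series, n_segments, alphabet="abcd"):
    """Convert a numeric series to a SAX word (illustrative sketch only)."""
    n = len(series)
    if n_segments < 1 or n < n_segments:
        raise ValueError("need 1 <= n_segments <= len(series)")

    # Step 1: z-normalize; a constant series is mapped to all zeros.
    mu, sigma = mean(series), pstdev(series)
    z = [(x - mu) / sigma if sigma > 0 else 0.0 for x in series]

    # Step 2: piecewise aggregate approximation (mean of each frame).
    bounds = [i * n // n_segments for i in range(n_segments + 1)]
    paa = [mean(z[bounds[i]:bounds[i + 1]]) for i in range(n_segments)]

    # Step 3: breakpoints that cut N(0, 1) into len(alphabet) equal-probability bins.
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(j / k) for j in range(1, k)]

    # Step 4: each frame mean becomes the symbol of the bin it falls into.
    return "".join(alphabet[bisect_right(breakpoints, v)] for v in paa)


# Hypothetical sensor trace: a steadily rising signal.
word = sax([1, 2, 3, 4, 5, 6, 7, 8], n_segments=4)
print(word)
```

The resulting symbol strings are short, discrete representations of each time series, which is what makes a subsequent feature-selection step over segments tractable.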
The final application discovers knowledge from textual data in various sources. Although many enterprises know the importance of managing key performance indicators (KPIs), most quality activities are carried out by analyzing attribute or quantitative values. This limits the understanding of customers' perspectives and of the exact defects. In this application, a novel active learning framework for dictionary expansion is introduced. In this framework, unsupervised natural language processing methods suitable for the Korean language are applied to the data. As a result, the proposed framework can construct a domain-specific dictionary starting from an almost empty one.
Language
English
URI
https://hdl.handle.net/10371/118295
Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.
