Data-driven Fault Detection and Diagnosis Using Machine Learning Techniques and Information Theory

Abstract: A process monitoring system is essential for efficient and safe process operation. Process faults can degrade product quality or interrupt normal operation, reducing productivity. In chemical processes, which commonly handle explosive and flammable materials, a fault can also threaten process safety, which should be the top priority. Meanwhile, as modern processes expand in scope and become more automated and intensified, they demand increasingly reliable monitoring systems.
Process monitoring can be divided into three stages: fault detection, which determines in real time whether a fault is present in the system; fault diagnosis, which identifies the root cause of a detected fault; and process recovery, which removes the cause of the fault and returns the process to normal operation. Many methodologies have been proposed for fault detection and diagnosis, and they fall into three categories: analytical methods based on detailed first-principles models, knowledge-based methods built on domain-specific expertise, and data-driven methods. Compared with the other two, data-driven methods are widely used because they are generally applicable and because modern processes supply the abundant process data they require. Their advantage also becomes more pronounced as the scale and complexity of the process increase. This thesis proposes a fault detection methodology and a fault diagnosis methodology that improve on the performance of existing data-driven methods.
Conventional data-driven fault detection systems are built on dimensionality reduction. Such a model identifies a low-dimensional latent space defined by the features inherent in the process data and monitors the process with respect to that space. Representative methods are principal component analysis, the conventional multivariate process monitoring approach, and the autoencoder, a machine learning technique. Monitoring systems based on machine learning are now widely used thanks to abundant training data and good performance, but the growing scale, automation, and intensification of modern processes call for monitoring schemes that perform better still. Most work on improving data-driven monitoring has changed the structure of the model or its training procedure. Data-driven methods nevertheless remain dependent on the quality of the training dataset. A methodology is therefore needed that makes the monitoring system more complete by supplementing the information missing from the training data. Accordingly, the first part of this thesis proposes a fault detection method that incorporates data augmentation.
Data augmentation has mostly been used in classification problems to compensate for under-represented classes, that is, between-class imbalance. In that setting it improves training by balancing the number of samples in each class. In this study, by contrast, data augmentation is used to alleviate within-class imbalance. Process data from normal operation contain relatively few samples near the boundary between the normal and abnormal states. A fault detection model learns the low-dimensional feature space of the normal state and uses it to separate normal from abnormal operation, so adding samples on the boundary of the normal state can be expected to benefit feature learning. The proposed method follows from this idea.
First, a variational autoencoder, a generative model, is trained on the original training data so that it can generate synthetic data. Samples corresponding to the boundary region of the low-dimensional distribution of normal operation learned by the generative model are then generated and added to the original training data. Finally, an autoencoder, a machine learning method for dimensionality reduction and feature extraction, is trained on the augmented data to build the fault detection system. With the augmented training data the autoencoder can learn its latent space more effectively, which can in turn improve the ability of the fault detection system to distinguish normal from abnormal states.
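The boundary-sampling step can be illustrated with a minimal sketch. The abstract does not say how the boundary region is defined, so everything here is an assumption: the latent prior is taken to be a standard normal (the usual choice for a variational autoencoder), the boundary is taken to be a shell of latent vectors whose norm lies between two hypothetical radii `r_lo` and `r_hi`, and the trained decoder that would map each latent vector back to process variables is not shown.

```python
import math
import random


def sample_boundary_latents(n, dim, r_lo, r_hi, seed=0):
    """Draw n latent vectors z ~ N(0, I) whose norm lies in [r_lo, r_hi].

    Rejection sampling: draw from the prior and keep only the draws that
    fall inside the assumed boundary shell.
    """
    rng = random.Random(seed)
    kept = []
    while len(kept) < n:
        z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        radius = math.sqrt(sum(v * v for v in z))
        if r_lo <= radius <= r_hi:
            kept.append(z)
    return kept


# Hypothetical settings: 2-D latent space, shell between 2 and 3 standard deviations.
boundary_z = sample_boundary_latents(n=50, dim=2, r_lo=2.0, r_hi=3.0)
# Each z would then be decoded by the trained VAE decoder (not shown) and the
# decoded samples appended to the original training set.
```

Rejection sampling keeps the draws distributed according to the prior restricted to the shell, at the cost of discarding most draws when the shell is far from the origin.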
Dimensionality reduction methods have also been used for fault isolation, in the form of contribution charts. These approaches, however, show limited and inconsistent performance because information is lost during dimension reduction. To overcome this limitation, methods that directly analyze the causal relationships between process variables have been developed. One of them, transfer entropy, is an information-theoretic causality measure that relies on neither a specific model nor a linearity assumption, and it is therefore generally regarded as performing well in fault isolation for nonlinear processes. Causal analysis with transfer entropy requires costly density estimation, however, so its application has been limited to small-scale processes. To address this limitation, this thesis proposes a method that combines transfer entropy with graphical lasso, a regularization method.
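To make the causality measure concrete, here is a minimal sketch of a transfer entropy estimate for discrete-valued series. It is a simplification and not the estimator used in the thesis: it assumes a history length of one and uses plug-in (counting) probabilities, whereas continuous process data would need the density estimation that the abstract describes as costly. The function name `transfer_entropy` is illustrative.

```python
import math
import random
from collections import Counter


def transfer_entropy(source, target):
    """Plug-in estimate of TE(source -> target) in bits, history length 1.

    TE = sum p(y_next, y_now, x_now)
             * log2( p(y_next | y_now, x_now) / p(y_next | y_now) )
    """
    triples = list(zip(target[1:], target[:-1], source[:-1]))
    n = len(triples)
    c_full = Counter(triples)                              # (y_next, y_now, x_now)
    c_next_now = Counter((yn, yp) for yn, yp, _ in triples)  # (y_next, y_now)
    c_now_src = Counter((yp, xp) for _, yp, xp in triples)   # (y_now, x_now)
    c_now = Counter(yp for _, yp, _ in triples)              # y_now
    te = 0.0
    for (yn, yp, xp), count in c_full.items():
        p_joint = count / n
        p_with_source = count / c_now_src[(yp, xp)]
        p_without_source = c_next_now[(yn, yp)] / c_now[yp]
        te += p_joint * math.log2(p_with_source / p_without_source)
    return te


# Toy data: y copies x with a one-step delay, so information flows x -> y only.
rng = random.Random(0)
x = [rng.randint(0, 1) for _ in range(2000)]
y = [0] + x[:-1]
te_xy = transfer_entropy(x, y)  # close to 1 bit
te_yx = transfer_entropy(y, x)  # close to 0 bits
```

The asymmetry between the two directions is what lets the measure point from a cause variable to an effect variable.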
Graphical lasso is an algorithm for learning sparse structure in an undirected graphical model, and it can be used to extract the most strongly correlated subgroup from the graph of the entire process. Because its output consists of one highly correlated subgroup and the remaining variables, which are independent of that subgroup, applying it repeatedly to the remaining variables partitions the full set of process variables into several highly relevant subgroups. Excluding weakly related variable pairs in advance greatly reduces the number of relationships that the causal analysis must examine. This step thus mitigates the high computational cost of transfer entropy and extends the range of processes to which transfer-entropy-based fault isolation can be applied.
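For reference, the standard graphical lasso estimate can be written as a penalized Gaussian log-likelihood problem. The symbols are not taken from the abstract: $S$ denotes the sample covariance matrix of the process variables, $\Theta$ the precision (inverse covariance) matrix, and $\lambda \ge 0$ the regularization weight.

```latex
\hat{\Theta} = \arg\max_{\Theta \succ 0}
  \left\{ \log\det\Theta - \operatorname{tr}(S\Theta) - \lambda \lVert \Theta \rVert_1 \right\}
```

Under a Gaussian assumption, a zero entry $\hat{\Theta}_{ij} = 0$ means that variables $i$ and $j$ are conditionally independent given the others, so a larger $\lambda$ gives a sparser graph and smaller connected subgroups.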
Combining the two methods gives the following fault isolation procedure. First, iterative graphical lasso is applied to the data recorded after the fault occurred, dividing the process variables into five highly relevant subgroups. The transfer entropy causality measure is then computed only within each subgroup, and the most likely root cause variable is identified from the most significant causal relationships. Because graphical lasso narrows the scope of the causal analysis, the cost of unnecessary transfer entropy calculations is greatly reduced. The proposed method is therefore noteworthy in that it makes transfer-entropy-based fault isolation applicable to industrial-scale processes.
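A rough way to see where the saving comes from is to count the ordered variable pairs for which transfer entropy has to be evaluated. The sketch below uses hypothetical sizes (50 variables split into five equal subgroups); these are not the thesis's actual variable counts or groupings, and the pair count is only a proxy for the cost, which in practice is dominated by density estimation.

```python
def pairwise_te_count(group_sizes):
    """Number of ordered (source, target) pairs evaluated within each group."""
    return sum(n * (n - 1) for n in group_sizes)


# Hypothetical sizes for illustration only.
full = pairwise_te_count([50])                     # every pair in the whole process
grouped = pairwise_te_count([10, 10, 10, 10, 10])  # pairs inside each subgroup only
ratio = grouped / full                             # fraction of pairs still evaluated
```

With these assumed sizes, 450 of 2450 ordered pairs remain. The actual saving depends on how unevenly the subgroups are sized.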
The proposed methodologies are verified by applying them to an industrial-scale benchmark process model, the Tennessee Eastman process (TEP). The benchmark is well suited to testing the proposed methods because it includes multiple unit operations, a recycle stream, and chemical reactions, which give it a complexity similar to that of a real chemical process. The performance tests cover the 28 predefined process fault scenarios in the TEP model. The proposed fault detection method with data augmentation achieves a higher fault detection rate than the conventional approach. In some fault cases it also improves the fault detection delay, the time from the occurrence of a fault to its first detection. The proposed fault isolation method, which integrates transfer entropy with graphical lasso, effectively identifies the cause of a process fault at only about 20% of the computational cost of the base case, in which transfer entropy is applied directly to the entire process. The results also show that, for some faults, the proposed method isolates the cause more accurately than the base case.
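The two detection metrics mentioned above can be computed from a sequence of alarm flags as in the following sketch. The definitions are the simplest ones consistent with the abstract (fraction of post-onset samples flagged, and samples elapsed until the first alarm); the thesis may use a stricter rule, such as requiring several consecutive alarms, and the toy alarm sequence is made up for illustration.

```python
def detection_rate(alarms, onset):
    """Fraction of samples at or after the fault onset that raised an alarm."""
    post_fault = alarms[onset:]
    return sum(post_fault) / len(post_fault)


def detection_delay(alarms, onset):
    """Samples between the fault onset and the first alarm, or None if never detected."""
    for i in range(onset, len(alarms)):
        if alarms[i]:
            return i - onset
    return None


# Toy sequence: 5 normal samples, then a fault starting at index 5.
alarms = [False] * 5 + [False, False, True, True, False, True, True, True, True, True]
onset = 5
rate = detection_rate(alarms, onset)    # 7 of 10 post-onset samples flagged
delay = detection_delay(alarms, onset)  # first alarm at index 7
```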

Language: eng

URI: https://hdl.handle.net/10371/177567

https://dcollection.snu.ac.kr/common/orgView/000000167161

Appears in Collections:

College of Engineering/Engineering Practice School (공과대학/대학원)
- Dept. of Electrical and Computer Engineering (전기·정보공학부)
  - Theses (Master's Degree_전기·정보공학부)
