Classification of Wood Species using Near-infrared Spectroscopy and Artificial Neural Networks

양상윤

근적외선 분광분석법과 인공신경망에 의한 목재의 수종 구분 : Classification of Wood Species using Near-infrared Spectroscopy and Artificial Neural Networks

Authors: 양상윤

Advisor: 여환명

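Major: College of Agriculture and Life Sciences, Dept. of Forest Sciences (Environmental Materials Science Major)

Issue Date: 2019-02

Publisher: Seoul National University Graduate School

Description: Thesis (Ph.D.) -- Seoul National University Graduate School : College of Agriculture and Life Sciences, Dept. of Forest Sciences (Environmental Materials Science Major), 2019. 2. 여환명.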

Abstract: Traditional microscope-based lumber species identification relies on anatomical information such as color, surface features, and anatomical structure, so it requires highly trained wood anatomists. DNA analysis sometimes fails to identify the species because cell nuclei are difficult to extract from wood tissue, and the analysis is costly. In this study, lumber species classification methods based on near-infrared (NIR) spectroscopy and artificial neural networks were developed for simple and rapid species classification.

Larch (Larix kaempferi), red pine (Pinus densiflora), Korean pine (Pinus koraiensis), cedar (Cryptomeria japonica), and cypress (Chamaecyparis obtusa) were used in the study. These five species account for the majority of the softwood consumed by the domestic lumber production industry. NIR spectra were acquired from lumber samples of the five species, and several algorithms were then applied for classification.

Principal component analysis (PCA) of the NIR spectra and soft independent modeling of class analogy (SIMCA) were applied to species classification. PCA under three types of mathematical preprocessing (Raw, standard normal variate (SNV), and Savitzky-Golay 2nd derivative (SG 2nd)) could not separate the species because the PC1-PC2 scores overlapped. Among the SIMCA models, the one based on SG 2nd preprocessing gave the best classification result, with an accuracy, minimum precision, and minimum recall of 73.00%, 98.54%, and 67.50%, respectively.
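
The two named preprocessing steps can be sketched with NumPy as follows. This is a minimal illustration, not the thesis implementation: the window length (11 points), the polynomial order (2), and the function names are assumptions, since the abstract does not state the settings used.

```python
import numpy as np

def snv(spectra):
    """Standard normal variate: centre and scale each spectrum (row) on its own."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

def sg_second_derivative(spectra, window=11, polyorder=2, delta=1.0):
    """Savitzky-Golay 2nd derivative from a local least-squares polynomial fit.

    window and polyorder are assumed values, not taken from the thesis.
    """
    spectra = np.asarray(spectra, dtype=float)
    half = window // 2
    offsets = np.arange(-half, half + 1)
    # Columns are offsets**0, offsets**1, ..., offsets**polyorder.
    design = np.vander(offsets, polyorder + 1, increasing=True)
    # Row 2 of the pseudo-inverse yields the fitted quadratic coefficient;
    # the second derivative at the window centre is twice that coefficient.
    weights = 2.0 * np.linalg.pinv(design)[2] / delta**2
    # Correlate each spectrum with the weights; "valid" drops half a window per end.
    return np.array([np.convolve(row, weights[::-1], mode="valid") for row in spectra])

# Two toy "spectra": a pure quadratic (2nd derivative is 6 everywhere) and a sine.
x = np.arange(50.0)
toy = np.vstack([3 * x**2 + 2 * x + 1, np.sin(x / 5.0)])
toy_snv = snv(toy)
toy_d2 = sg_second_derivative(toy)
```

Each output row of `sg_second_derivative` is `window - 1` points shorter than the input because the edges are dropped rather than padded.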

Partial least squares discriminant analysis (PLS-DA) is a multivariate linear regression method in which the dependent variable is a dummy variable (1 or 0) that encodes group membership. Classification reliability differed among the three types of mathematical preprocessing (Raw, SNV, SG 2nd). The PLS-DA model based on SNV-preprocessed NIR spectra showed lower classification reliability than the model based on raw spectra, so SNV preprocessing affected the PLS-DA model negatively. The PLS-DA model based on SG 2nd preprocessing was the best PLS-DA classification model, with an accuracy, minimum precision, and minimum recall of 74.90%, 100.00%, and 60.00%, respectively.
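
A minimal sketch of the dummy-variable idea, assuming one single-response PLS model per class fitted with the NIPALS algorithm. The synthetic data, the number of components, and all function names are hypothetical. The 0.5 cut-off follows the Korean abstract, which applies it to cross-validated predictions; this sketch applies it to fitted values for brevity.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Single-response PLS regression (NIPALS); returns coefficients and offsets."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)        # weight vector
        t = Xr @ w                    # scores
        tt = t @ t
        p = Xr.T @ t / tt             # X loadings
        c = (yr @ t) / tt             # y loading
        Xr = Xr - np.outer(t, p)      # deflate X
        yr = yr - c * t               # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return coef, x_mean, y_mean

def plsda_fit(X, labels, n_classes, n_components):
    """One model per class; the response is 1 for members and 0 otherwise."""
    return [pls1_fit(X, (labels == k).astype(float), n_components)
            for k in range(n_classes)]

def plsda_predict(models, X):
    """Column k holds the predicted dummy value for class k."""
    return np.column_stack([(X - x_mean) @ coef + y_mean
                            for coef, x_mean, y_mean in models])

# Synthetic stand-in for spectra: 3 classes, 20 variables, one raised band per class.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(3), 40)
class_means = np.zeros((3, 20))
for k in range(3):
    class_means[k, 5 * k:5 * k + 5] = 1.0
X = class_means[labels] + 0.1 * rng.normal(size=(120, 20))

models = plsda_fit(X, labels, n_classes=3, n_components=2)
predicted = plsda_predict(models, X)
is_member = predicted > 0.5           # threshold rule on the dummy prediction
assigned = predicted.argmax(axis=1)   # alternative: pick the largest prediction
```

With the threshold rule a sample can be claimed by no class or by several, which is one reason a probabilistic decision rule may behave differently from a fixed cut-off.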

PLS-DA can convert model predictions into classification probabilities by treating the distribution of the predicted values as Gaussian. For the PLS-DA models based on raw and SNV-preprocessed NIR spectra, converting the predicted values to probabilities barely improved classification reliability. The PLS-DA model based on SG 2nd preprocessed NIR spectra, however, improved markedly (accuracy = 95.18%, minimum precision = 99.19%, minimum recall = 91.50%). Variable importance in projection (VIP) analysis was performed to identify the NIR bands behind this improvement. Three distinct peaks fell in absorption regions of the main wood components: 1632 nm (assigned to cellulose), 1698 nm (assigned to lignin), and 1720 nm (assigned to hemicellulose and lignin). Two further peaks, near 1895 nm and 2304 nm, are not known absorption regions of the main wood components. SG 2nd preprocessing emphasized these peaks, and they were judged to contribute to the improved species classification performance of the PLS-DA model.
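
One common way to turn PLS-DA predictions into probabilities is sketched below. It is an assumption about the general approach, not the thesis procedure: for one class model, a Gaussian is fitted to the predicted values of members and another to those of non-members, and Bayes' rule with the class proportions as priors gives the membership probability. All names and the toy numbers are hypothetical.

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def membership_probability(pred_train, is_member, pred_new):
    """Posterior probability that new predictions belong to the modelled class."""
    pred_train = np.asarray(pred_train, dtype=float)
    is_member = np.asarray(is_member, dtype=bool)
    inside, outside = pred_train[is_member], pred_train[~is_member]
    prior_in = is_member.mean()  # class proportion used as the prior (assumption)
    like_in = prior_in * gaussian_pdf(pred_new, inside.mean(), inside.std(ddof=1))
    like_out = (1.0 - prior_in) * gaussian_pdf(pred_new, outside.mean(), outside.std(ddof=1))
    return like_in / (like_in + like_out)

# Toy predictions: members scatter around 1, non-members around 0.
train_pred = np.array([0.9, 1.0, 1.1, -0.1, 0.0, 0.1])
train_member = np.array([True, True, True, False, False, False])
prob = membership_probability(train_pred, train_member, np.array([0.0, 0.5, 1.0]))
```

Because the two fitted distributions can have different widths, the point where the probability crosses 0.5 need not coincide with a predicted value of 0.5.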

An artificial neural network (ANN), which searches for the optimum weights for classification from the spectra, and a one-dimensional convolutional neural network (1D CNN), which searches for the optimum filters for classification from the input spectra, were applied to lumber species classification using the NIR spectra.

An ANN with three layers (input layer: 1721 nodes, hidden layer: 64 nodes, output layer: 5 nodes) was designed. The classification reliability of the ANN models was similar to or higher than that of the best probability-based PLS-DA model. In particular, the accuracy, precision, and recall of the ANN model based on SG 2nd preprocessed NIR spectra were each 100%. Mathematical preprocessing also improved the classification reliability of the ANN.
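
The stated 1721-64-5 layout can be written out as a forward pass in NumPy. Only the layer sizes come from the abstract; the ReLU hidden activation, the softmax output, the random initialization, and the names are assumptions, and training is not shown.

```python
import numpy as np

N_INPUT, N_HIDDEN, N_CLASSES = 1721, 64, 5  # layer sizes from the abstract

rng = np.random.default_rng(0)
params = {
    "W1": rng.normal(0.0, np.sqrt(2.0 / N_INPUT), (N_INPUT, N_HIDDEN)),
    "b1": np.zeros(N_HIDDEN),
    "W2": rng.normal(0.0, np.sqrt(2.0 / N_HIDDEN), (N_HIDDEN, N_CLASSES)),
    "b2": np.zeros(N_CLASSES),
}

def ann_forward(spectra, params):
    """Map spectra of shape (n_samples, 1721) to class probabilities (n_samples, 5)."""
    hidden = np.maximum(0.0, spectra @ params["W1"] + params["b1"])  # ReLU (assumed)
    logits = hidden @ params["W2"] + params["b2"]
    logits -= logits.max(axis=1, keepdims=True)  # stabilise the softmax
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)

# Number of trainable weights and biases implied by the layout.
n_weights = sum(v.size for v in params.values())
```

The layout implies 1721 x 64 + 64 + 64 x 5 + 5 = 110,533 trainable values, almost all of them in the first layer.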

To reduce dependence on the mathematical preprocessing used in NIR spectroscopy, a 1D CNN was applied, in which the network itself searches for the optimal preprocessing for species classification. The 1D CNN architecture developed in this study has four 1D convolution layers with different filter sizes and channel counts, arranged to perform preprocessing that includes spectral filtering, separation, and synthesis. For the 1D CNN model based on raw spectra, the accuracy, minimum precision, and minimum recall were 99.90%, 99.50%, and 99.50%, respectively. The reliability of the 1D CNN model based on SNV-preprocessed spectra was the same as that of the model based on raw spectra. All reliability indices were 100% for the 1D CNN based on SG 2nd preprocessed spectra. In conclusion, among the NIR-based species classification methods applied in this study, neural networks, and the 1D CNN in particular, provide the most reliable and simplest species classification.
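
A forward pass through four stacked 1D convolution layers can be sketched as below. The abstract gives only the number of convolution layers, so the filter sizes, channel counts, ReLU activations, global average pooling, and dense output layer are all hypothetical choices made for illustration.

```python
import numpy as np

def conv1d(x, kernels, bias):
    """Valid 1D cross-correlation, the usual "convolution" of deep-learning libraries.

    x: (c_in, length); kernels: (c_out, c_in, k); returns (c_out, length - k + 1).
    """
    k = kernels.shape[2]
    windows = np.lib.stride_tricks.sliding_window_view(x, k, axis=1)  # (c_in, L-k+1, k)
    return np.einsum("ilk,oik->ol", windows, kernels) + bias[:, None]

# Hypothetical (c_in, c_out, kernel_size) for the four layers.
LAYER_SPECS = [(1, 8, 15), (8, 16, 9), (16, 16, 1), (16, 32, 5)]

rng = np.random.default_rng(0)
conv_params = [
    (rng.normal(0.0, np.sqrt(2.0 / (c_in * k)), (c_out, c_in, k)), np.zeros(c_out))
    for c_in, c_out, k in LAYER_SPECS
]
dense_w = rng.normal(0.0, 0.1, (32, 5))
dense_b = np.zeros(5)

def cnn_forward(spectrum):
    """Map one raw spectrum (1-D array) to probabilities over five classes."""
    x = spectrum[None, :]                       # add a channel axis: (1, length)
    for kernels, bias in conv_params:
        x = np.maximum(0.0, conv1d(x, kernels, bias))
    pooled = x.mean(axis=1)                     # global average pooling: (32,)
    logits = pooled @ dense_w + dense_b
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()
```

In this sketch the learned kernels play the role that fixed filters such as a Savitzky-Golay derivative play in conventional preprocessing, which is the motivation the abstract gives for the architecture.
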
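Traditional histological species identification with a microscope identifies wood species by clearly distinguishing the color, surface features, and anatomical structure of the wood. Anatomical species identification therefore requires a sufficiently trained wood anatomist. Species identification by DNA analysis is impossible in some cases because cell nuclei are difficult to extract from wood tissue, and the analysis is costly. In this study, a species classification method applying near-infrared (NIR) spectroscopy and artificial neural networks was developed to classify wood species simply and rapidly. To distinguish larch, red pine, Korean pine, cedar, and cypress, which account for the majority of softwood consumption in the domestic lumber production industry, NIR spectra were measured on lumber and species classification was carried out with several algorithms.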

Principal component analysis (PCA) of the NIR spectra and soft independent modelling of class analogy (SIMCA) based on it were performed. PCA of the full spectra under three mathematical preprocessing conditions (Raw, standard normal variate (SNV), and Savitzky-Golay 2nd derivative (SG 2nd)) could not separate the species because the PC1 and PC2 scores overlapped. SIMCA classification, built on a separate PCA for each species group, showed that species classification reliability differed by preprocessing condition. SIMCA with SG 2nd preprocessed spectra gave the highest reliability, with an accuracy of 73.00%, a minimum precision of 98.54%, and a minimum recall of 67.50%.
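
The three reliability figures reported for each model can be computed from a confusion matrix as sketched below. The reading of "minimum precision" and "minimum recall" as the smallest per-class value is an assumption, as are the function name and the toy labels.

```python
import numpy as np

def classification_summary(y_true, y_pred, n_classes):
    """Accuracy plus the smallest per-class precision and recall."""
    cm = np.zeros((n_classes, n_classes), dtype=int)  # rows: true, columns: predicted
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    tp = np.diag(cm).astype(float)
    predicted_totals = cm.sum(axis=0)
    actual_totals = cm.sum(axis=1)
    # A class that is never predicted (or never present) scores 0 here.
    precision = tp / np.maximum(predicted_totals, 1)
    recall = tp / np.maximum(actual_totals, 1)
    return {
        "accuracy": tp.sum() / cm.sum(),
        "min_precision": precision.min(),
        "min_recall": recall.min(),
    }

# Toy example with three classes and six samples.
summary = classification_summary([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], n_classes=3)
```

This sketch assumes every sample receives exactly one predicted class. A method that can leave a sample unassigned would need an extra "none" outcome, which is not handled here.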

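Partial least squares discriminant analysis (PLS-DA) classifies samples by developing a multiple linear regression model whose dummy dependent variable is 1 or 0 depending on the group. Multi-species discrimination under the same three preprocessing conditions, using a cross-validation predicted value of 0.5 as the criterion, showed that species classification reliability differed by preprocessing condition, and SNV preprocessing was judged to affect PLS-DA negatively. The PLS-DA model with SG 2nd preprocessed spectra showed the most improved classification performance, with an accuracy of 74.9%, a minimum precision of 100%, and a minimum recall of 69%.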

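When the distribution of cross-validation predicted values from the PLS-DA of each species was converted to a normal distribution and the species were classified by probability distribution, the discriminant models using raw spectra and SNV-preprocessed spectra showed little improvement, whereas the model with SG 2nd preprocessing improved substantially, to an accuracy of 95.18%, a minimum precision of 99.19%, and a minimum recall of 91.5%. Variable importance in projection (VIP) scores were analyzed to find the NIR wavelength bands that influenced the accuracy gain of the SG 2nd derivative model. The analysis indicated that 1632 nm (an absorption region of cellulose), 1698 nm (an absorption region of lignin), 1720 nm (an absorption region of lignin and hemicellulose), and the vicinity of 1895 nm and 2304 nm, which have not been identified as absorption regions of the main wood components, were emphasized by the SG 2nd derivative and contributed to the improved species classification performance.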

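Wood species classification using the NIR spectra was carried out with an artificial neural network, which searches for the optimal weights for prediction or classification from the input data, and a one-dimensional convolutional neural network, which searches for the optimal filters.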

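An artificial neural network with 1721-64-5 (input layer-hidden layer-output layer) nodes was designed and evaluated on a test set. Under all three mathematical preprocessing conditions, it classified the species with performance similar to or better than probability-based PLS-DA. In particular, the accuracy, precision, and recall of the model using SG 2nd preprocessed spectra were all 100%. Classification performance of the artificial neural network improved with mathematical preprocessing, and with SG 2nd preprocessing in particular, NIR spectra measured on lumber of the five domestic softwood species were judged sufficient for species classification.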

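To reduce dependence on the mathematical preprocessing used in NIR spectroscopy, species classification was carried out with a one-dimensional convolutional neural network, in which the network itself searches for the optimal preprocessing needed for species classification. In this study, four 1D convolution layers with different filter sizes and channel counts were arranged to perform preprocessing by filtering, separating, and synthesizing the spectra. Evaluated on the test set, the 1D convolutional neural network trained on raw spectra reached a species classification accuracy of 99.9%, with a minimum precision and minimum recall of 99.5%. The network trained on SNV-preprocessed spectra showed the same reliability. The network trained on SG 2nd preprocessed spectra reached 100% accuracy, precision, and recall on the test set. In conclusion, among the NIR-based species classification methods applied in this study, neural networks, and the 1D convolutional neural network in particular, are judged to give the most accurate species classification.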

Language: kor

URI: https://hdl.handle.net/10371/152178

Files in This Item:

000000155734.pdf 4.28 MB

Appears in Collections:

College of Agriculture and Life Sciences (농업생명과학대학)
- Dept. of Forest Sciences (산림과학부)
  - Theses (Ph.D. / Sc.D._산림과학부)
