Analyzing Seoul Crash Data Using Bayesian Spatial Correlation Models (베이지안 공간 상관 모형을 활용한 서울시 교통사고자료 분석)

윤성진

베이지안 공간 상관 모형을 활용한 서울시 교통사고자료 분석 : Analyzing Seoul Crash Data Using Bayesian Spatial Correlation Models

Authors: 윤성진

Advisor: 장수은

Issue Date: 2020

Publisher: Seoul National University Graduate School (서울대학교 대학원)

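Description: Thesis (Master's)--Seoul National University Graduate School: Graduate School of Environmental Studies, Dept. of Environmental Planning, 2020. 2. 장수은.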

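Abstract: Estimation of traffic crash occurrence from spatial data is known to be more accurate when spatial effects are considered. This is because problems arising from the count-data nature of crash data, and from assuming a general linear model for the crash model, can cause model estimation error. This study therefore presents the Poisson and negative binomial models, which are count-data models within the generalized linear model family, and a conditional autoregressive Bayesian model, an autoregressive model suited to the problem of autocorrelated error terms. The models were fitted to 2010 traffic crash data for Seoul at the administrative dong level. Each model uses the same variables: the total number of crashes is the dependent variable, and the independent variables are total vehicle kilometers traveled, persons per household, number of intersections, bus-only lane installation ratio, crosswalk advance marking installation rate, number of speed humps, and the relative share of officer enforcement cases.
The results show that the likelihood-based coefficient of determination (R²) and the MAD improve in the order Poisson, negative binomial, Bayesian, which can be interpreted as showing the superiority of the Bayesian model in terms of goodness of fit. The study therefore reports the elasticities of the independent variables for the best-fitting Bayesian model to identify the effect of each on the dependent variable.
The study also tested the spatial autocorrelation of each model. For the Poisson and negative binomial models, Moran's I, a global indicator of spatial correlation, was significant, indicating correlation among the residuals. For the Bayesian model, however, Moran's I of the residuals was not significant.
The main results can be summarized as follows. The negative binomial model, which can handle the unobserved heterogeneity that causes over-dispersion, outperforms the Poisson model, but it also fails to address spatial autocorrelation in the residuals, which can weaken model performance. Ultimately, only the Bayesian model handled residual spatial autocorrelation adequately, and its performance was also relatively better. In terms of elasticity, the most elastic variable with a positive relationship to crash occurrence was the number of intersections, and the most elastic variable with a negative relationship was the number of members per household.
In this study, the Bayesian model represented the spatial characteristics of the data more appropriately than the conventional Poisson and negative binomial count models, because its conditional autoregressive prior mitigates spatial autocorrelation. Non-spatial models are therefore judged unsuitable for analyzing traffic crash occurrence with spatial data. However, because the results of this study alone are limited in demonstrating the superiority of the Bayesian model, follow-up studies that can incorporate spatial concepts into crash models remain necessary.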

Keywords: Count model, Spatial data, Spatial correlation, Bayesian model, Crash model, Moran's I
Student Number: 2018-24281
It is well known that estimates of traffic crash occurrence based on spatial data are more accurate when spatial effects are taken into account. Estimation error can arise because crash data are count data and because the crash model is assumed to be a general linear model. This study therefore presents Poisson and negative binomial (NB) models as generalized linear models that address the model-assumption error, and conditional autoregressive (CAR) Bayesian models that address autocorrelation in the residuals. The models are fitted to 2010 traffic crash data for Seoul, Korea, at the dong level, which serves as the traffic analysis zone (TAZ). All models use the same variable set: the dependent variable is the number of crashes, and the independent variables include vehicle kilometers traveled, average household size, the number of intersections, and the ratio of bus-only lane installation, among others.
The results show that R² and MAD improve from the Poisson to the NB to the Bayesian model, so the Bayesian model is superior to the others in terms of goodness of fit. The study therefore selects the Bayesian model and reports the elasticity of each independent variable to show its effect on the dependent variable.
In addition, the study examines the spatial autocorrelation of each model. For the Poisson and NB models, Moran's I, a global indicator of spatial correlation, is significant, which indicates that the residuals are correlated. For the Bayesian model, on the other hand, Moran's I of the residuals is not significant.
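As an illustration of the residual check described above, global Moran's I can be sketched as follows. This is a minimal sketch, not the thesis's procedure: the function names, the binary contiguity weight matrix, and the permutation-based pseudo p-value are assumptions introduced here, and the abstract does not state how significance was assessed.

```python
import numpy as np


def morans_i(values, weights):
    """Global Moran's I for n zone values and an n x n spatial weight matrix."""
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    z = x - x.mean()              # deviations from the mean
    s0 = w.sum()                  # sum of all spatial weights
    return (len(x) / s0) * (z @ w @ z) / (z @ z)


def permutation_p_value(values, weights, n_perm=999, seed=0):
    """One-sided pseudo p-value for positive spatial autocorrelation.

    Zone values are shuffled across locations to build a reference
    distribution of Moran's I under spatial randomness.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(values, dtype=float)
    observed = morans_i(x, weights)
    sims = np.array([morans_i(rng.permutation(x), weights) for _ in range(n_perm)])
    p = (np.sum(sims >= observed) + 1) / (n_perm + 1)
    return observed, p


if __name__ == "__main__":
    # Hypothetical example: four zones in a row, neighbors share a border.
    w = np.array([[0, 1, 0, 0],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [0, 0, 1, 0]], dtype=float)
    residuals = [1.0, 2.0, 3.0, 4.0]   # made-up residuals with a spatial trend
    i_stat, p = permutation_p_value(residuals, w, n_perm=99)
    print(round(i_stat, 3))            # 0.333
```

In practice the values passed in would be model residuals per dong and the weight matrix would come from the zone adjacency; a significant positive statistic suggests that the model leaves spatial structure unexplained.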
The main findings of this study can be summarized as follows. The NB model performs better than the Poisson model because it can handle the unobserved heterogeneity that causes over-dispersion in the Poisson model. However, the NB model also cannot address spatial autocorrelation in the residuals, which weakens model performance. The Bayesian model is the only one that handles residual spatial autocorrelation, and its performance is also relatively better. Finally, among the independent variables of the Bayesian model, the number of intersections has the highest elasticity among variables with a positive effect on the number of crashes, while average household size has the highest elasticity among variables with a negative effect.
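The abstract does not state how the elasticities were computed. As a hedged illustration, under the log link that Poisson and NB models share, a common point elasticity for a continuous covariate is the following, where the symbols \mu_i (expected crashes in zone i), x_{ik} (covariate k in zone i), and \beta_k (its coefficient) are introduced here for illustration only:

```latex
\mu_i = \exp\Big(\beta_0 + \sum_k \beta_k x_{ik}\Big),
\qquad
E_{ik} = \frac{\partial \mu_i}{\partial x_{ik}} \cdot \frac{x_{ik}}{\mu_i} = \beta_k \, x_{ik}
```

Under this definition the elasticity varies by zone, so a single reported value would typically be an average over zones or an evaluation at the covariate mean.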
In this study, the Bayesian model represented the spatial characteristics of the data more appropriately than the Poisson and NB models, thanks to its Conditional Autoregressive (CAR) prior. Hence, non-spatial count models such as Poisson and NB are judged unsuitable for analyzing traffic crashes with spatial data. However, the results of this study alone are not sufficient to establish the superiority of the Bayesian model. Therefore, further studies that incorporate spatial concepts into crash models are needed.
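The exact model specification is not given in the abstract. One common formulation of a count model with a CAR prior, shown here only as an assumed sketch, adds a spatial random effect \phi_i to the log mean and gives it an intrinsic CAR prior conditioned on neighboring zones; the symbols w_{ij} (spatial weights) and \sigma_\phi^2 (variance parameter) are illustrative:

```latex
y_i \sim \mathrm{Poisson}(\mu_i),
\qquad
\log \mu_i = \beta_0 + \sum_k \beta_k x_{ik} + \phi_i,
\qquad
\phi_i \mid \phi_{-i} \sim \mathcal{N}\!\left(
  \frac{\sum_j w_{ij}\,\phi_j}{\sum_j w_{ij}},\;
  \frac{\sigma_\phi^2}{\sum_j w_{ij}}
\right)
```

Because each \phi_i is centered on the weighted mean of its neighbors, the random effect can absorb spatially structured variation that would otherwise remain in the residuals.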

Keywords: Count model, Spatial data, Spatial correlation, Bayesian model, Crash modeling, Moran's I
Student Number: 2018-24281

Language: kor

URI: http://dcollection.snu.ac.kr/common/orgView/000000159889

Files in This Item:

000000159889.pdf 1.22 MB

Appears in Collections:

Graduate School of Environmental Studies (환경대학원)
- Dept. of Environmental Planning (환경계획학과)
  - Theses (Master's Degree_환경계획학과)
