Travel Flow Cluster Detection and Characterization Based on the Analysis of Transport Smartcard Big Data in Seoul

이승민

교통카드 빅데이터 분석을 통한 서울시 통행 유동 클러스터 탐지 및 유형화 : Travel Flow Cluster Detection and Characterization Based on the Analysis of Transport Smartcard Big Data in Seoul

Authors: 이승민

Advisor: 이건학

Issue Date: 2020

Publisher: Seoul National University Graduate School

Description: Thesis (Master's)--Seoul National University Graduate School: College of Social Sciences, Department of Geography, 2020. 2. 이건학.

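Abstract: This study introduces the Agglomerative Flow Clustering Algorithm, a flow cluster extraction algorithm based on a hierarchical clustering approach, and applies it to actual transport smartcard usage data from Seoul to analyze the city's travel flow patterns by time slot. The algorithm computes a spatial adjacency index for each pair of flows and then, in descending order of that index, repeatedly merges the sets to which the two flows of a pair belong into a single set. As a result, similar individual flows that are spatially adjacent but were made on different transport modes and routes are grouped into a single cluster set.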
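The algorithm was applied to actual smartcard usage records, obtained from Korea Smart Card Co., Ltd. (㈜한국스마트카드), for trips made between Monday, March 12 and Friday, March 16, 2018, to derive public transport travel flow patterns in Seoul. Running the algorithm on the data extracted 60,947 (morning commute), 49,650 (evening commute), 26,644 (daytime), and 17,175 (night) flow cluster sets for the respective time slots. Among these, clusters whose total trip volume had a Z-score of 2.58 or higher were selected as the major travel flow patterns of each time slot, yielding 659 (morning commute), 507 (evening commute), 278 (daytime), and 229 (night) major patterns. These represent from about 8% up to 22% of public transport passenger trips in each time slot. The analysis of the detected results has two methodological implications. First, similar individual flows that are adjacent in space but were made on different modes and routes can be detected as a single travel flow pattern. Second, major travel patterns can be captured that do not stand out in ordinary frequency analysis but carry significant trip volume once several spatially adjacent bundles of flows are combined.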
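Next, the major travel flow patterns of each time slot are characterized on the basis of the detection results. The Getis-Ord Gi* statistic was applied to the living population distribution data by census output area (집계구) provided by the Seoul Metropolitan Government to derive the hot spot areas of each time slot, and flow patterns with similar origins and destinations were then classified on that basis. This revealed the spatial distribution of flows associated with hot spot areas in each time slot. Furthermore, by comparing the spatial distribution of the major travel flow patterns across time slots, the study interprets and presents the workplace-residence distribution pattern of each area.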
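The significance of this study is as follows. First, it develops the agglomerative flow clustering methodology, originally proposed for efficient visualization of flows, so that it can be used for flow pattern detection, thereby contributing to methodological research on flow clustering. It confirms the methodological value of capturing major travel patterns whose individual flows carry little volume but whose combined volume becomes significant after clustering. Second, it contributes to research on travel patterns in Seoul as a study of flow pattern detection. Unlike previous studies, which focus on trip volumes or inflows and outflows at points, this study analyzes the trip volume of a flow that links an origin to a destination, offering a new perspective on and interpretation of the nature and patterns of travel in Seoul. The results can serve as input for future transport planning, route selection, and urban structure research. The attempt to interpret workplace-residence distribution patterns using the flow cluster characterization results of Chapter 4 demonstrates practical applicability to urban structure research.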
This study introduces the Agglomerative Flow Clustering Algorithm, a flow cluster extraction algorithm based on a hierarchical clustering approach, and uses it to analyze public transportation travel flow patterns in Seoul from smartcard big data. The algorithm calculates a spatial similarity index for each pair of flows and then merges the sets containing the paired flows into one set. In this way, individual flows that are similar in space but were made on different transportation modes and routes are grouped into a single flow cluster set.
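The merge procedure described above can be illustrated with a minimal Python sketch. It is not the thesis implementation: the `adjacency` index (inverse of the larger of the origin and destination distances), the `min_adjacency` stopping threshold, and the representation of a flow as an `(ox, oy, dx, dy)` tuple are all assumptions made for illustration, and the thesis may define the index and the stopping rule differently.

```python
from itertools import combinations
from math import hypot


def adjacency(f, g):
    """Hypothetical adjacency index: close to 1 when both the origins and
    the destinations of two flows (ox, oy, dx, dy) are close together."""
    d_origin = hypot(f[0] - g[0], f[1] - g[1])
    d_dest = hypot(f[2] - g[2], f[3] - g[3])
    return 1.0 / (1.0 + max(d_origin, d_dest))


def cluster_flows(flows, min_adjacency):
    """Merge flow sets pair by pair, highest adjacency first (assumed rule)."""
    parent = list(range(len(flows)))  # union-find forest, one set per flow

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    # Score every flow pair, then process pairs in descending order of score.
    pairs = [(adjacency(flows[i], flows[j]), i, j)
             for i, j in combinations(range(len(flows)), 2)]
    pairs.sort(reverse=True)
    for score, i, j in pairs:
        if score < min_adjacency:  # assumed stopping rule
            break
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_j] = root_i  # merge the two sets

    clusters = {}
    for i in range(len(flows)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Three nearly parallel flows and one distant flow (toy coordinates).
flows = [(0, 0, 10, 10), (0.5, 0, 10, 10.5), (0, 0.2, 10.2, 10), (50, 50, 60, 60)]
print(cluster_flows(flows, min_adjacency=0.5))  # [[0, 1, 2], [3]]
```

Scoring all pairs is quadratic in the number of flows, so a data set of the size described here would need a spatial index or neighbor search to limit the candidate pairs; the sketch omits that step.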
The algorithm was applied to smartcard usage data for trips made in Seoul, Korea, from March 12 (Mon) to March 16 (Fri), 2018. It extracted 60,947 (7:00-9:00), 49,650 (17:30-19:30), 26,644 (12:00-14:00), and 17,175 (22:30-0:30) flow cluster sets for the respective time slots. Of these, clusters whose total travel volume had a Z-score of 2.58 or higher were selected as the major flow patterns of each time slot, resulting in 659 (7:00-9:00), 507 (17:30-19:30), 278 (12:00-14:00), and 229 (22:30-0:30) major travel flow patterns. They represent between about 8% and 22% of transit passenger trips in each time slot. The analysis shows that individual travel flows that are adjacent in space but were made on different transportation modes and routes can be grouped into a single travel flow cluster. Furthermore, the methodology makes it possible to capture major travel flow patterns whose individual flows carry little traffic but whose combined volume becomes significant once multiple spatially adjacent flows are gathered.
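The selection step can be sketched as follows. A Z-score of 2.58 is approximately the two-sided 99% critical value of the standard normal distribution. The function name `select_major_clusters`, the use of the population standard deviation, and the way the share of trips is computed are assumptions for illustration; the abstract does not specify these details.

```python
from statistics import mean, pstdev


def select_major_clusters(volumes, z_threshold=2.58):
    """Return indices of clusters whose trip-volume Z-score meets the
    threshold, plus the share of total volume those clusters carry."""
    mu = mean(volumes)
    sigma = pstdev(volumes)  # population standard deviation (assumed)
    if sigma == 0:
        return [], 0.0  # all volumes equal: nothing stands out
    major = [i for i, v in enumerate(volumes) if (v - mu) / sigma >= z_threshold]
    share = sum(volumes[i] for i in major) / sum(volumes)
    return major, share


# Toy data: twenty small clusters and one large one.
volumes = [10] * 20 + [500]
major, share = select_major_clusters(volumes)
print(major, round(share, 3))  # [20] 0.714
```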
Next, the major travel flow patterns detected for each time slot were characterized by the locations of their departure and arrival points. The Getis-Ord Gi* statistic was applied to living population data to derive the hot spot regions of each time slot. As a result, the spatial distribution of travel flows associated with the hot spot areas of each time slot was revealed. Furthermore, the residence-workplace distribution pattern of each region of Seoul is suggested by comparing the spatial distribution of the major flow patterns across time slots.
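For reference, the Getis-Ord Gi* statistic for zone i compares the weighted sum of values in its neighborhood (including i itself) with what the global mean would predict, scaled to a Z-score. The sketch below uses the standard formula in plain Python. The binary weights and the one-dimensional row of zones are toy assumptions; the thesis applies the statistic to living population counts by census output area, and its weight definition is not given in the abstract.

```python
from math import sqrt


def getis_ord_gi_star(values, weights):
    """Gi* Z-score per zone. weights[i][j] is the spatial weight between
    zones i and j, with weights[i][i] > 0 (the 'star' variant includes i)."""
    n = len(values)
    x_bar = sum(values) / n
    s = sqrt(sum(v * v for v in values) / n - x_bar * x_bar)
    scores = []
    for i in range(n):
        w = weights[i]
        sum_w = sum(w)
        sum_w2 = sum(wj * wj for wj in w)
        numerator = sum(wj * xj for wj, xj in zip(w, values)) - x_bar * sum_w
        denominator = s * sqrt((n * sum_w2 - sum_w * sum_w) / (n - 1))
        scores.append(numerator / denominator)
    return scores


# Toy example: ten zones in a row; each zone neighbors itself and the
# zones directly beside it (binary weights, an assumption).
values = [1, 1, 1, 1, 1, 1, 1, 1, 30, 30]
n = len(values)
weights = [[1 if abs(i - j) <= 1 else 0 for j in range(n)] for i in range(n)]
scores = getis_ord_gi_star(values, weights)
print([round(z, 2) for z in scores])
```

In this toy row the two high-value zones at the end receive the largest positive scores and the low-value zones receive negative scores. Which cutoff marks a zone as a hot spot is a separate choice that the abstract does not state.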
The significance of the study is as follows. First, the agglomerative flow clustering algorithm, originally proposed for efficient visualization of flows, is developed into a method for detecting travel flow patterns in smartcard big data, contributing to methodological research in the field of flow clustering. Its methodological value lies in capturing major travel flow patterns whose individual flows carry little traffic but whose combined volume is significant after clustering. Second, it contributes to research on Seoul's travel flow patterns. Whereas previous studies focus on inflows and outflows at a point location, this study analyzes the traffic volume of a flow that links a starting point to an arrival point, thereby providing new perspectives on and understanding of travel characteristics and patterns in Seoul. The results have implications for future research on transportation planning and urban structure. In particular, the analysis of residence-workplace distribution patterns using the travel flow pattern characterization in Chapter 4 shows the method's practical applicability to urban structure studies.

Language: kor

URI: http://dcollection.snu.ac.kr/common/orgView/000000159727

Files in This Item:

000000159727.pdf 9.42 MB

Appears in Collections:

College of Social Sciences (사회과학대학)
- Dept. of Geography (지리학과)
  - Theses (Master's Degree_지리학과)
