Unsupervised Representation Learning for Homogeneous, Heterogeneous, and Tree-like Graphs

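Abstract: The goal of unsupervised representation learning on graph data is to learn useful node-level or graph-level vector representations that faithfully reflect the structure of the graph and the attributes of its nodes. Recently, the design of unsupervised graph representation learning models based on graph neural networks, which have strong representation learning ability on graph data, has been attracting attention. Many methods focus on learning from homogeneous graphs, which contain a single type of edge and a single type of node. However, because countless kinds of relationships exist in the world, graphs can also be classified into various types according to their structural and semantic properties. Therefore, to learn useful representations from a graph, an unsupervised learning framework must properly account for the characteristics of the input graph. In this dissertation, we propose unsupervised learning models based on graph neural networks for three commonly encountered graph structures: homogeneous graphs, tree-like graphs, and heterogeneous graphs.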

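First, we propose a graph convolutional autoencoder model that learns low-dimensional representations for the nodes of a homogeneous graph. Whereas existing graph autoencoders may have limited representation learning ability because they cannot learn from the structure as a whole, the proposed autoencoder reconstructs node features and can learn from the entire structure. To reconstruct node features, we note that the role of the encoder is Laplacian smoothing, which makes neighboring nodes have similar representations, and accordingly design the decoder to perform Laplacian sharpening, which pushes a node's representation away from those of its neighbors. In addition, because applying Laplacian sharpening as is can cause instability, we propose a stable form of Laplacian sharpening that uses signed graphs, in which edge weights may take negative values. Node clustering and link prediction experiments on homogeneous graphs confirm that the proposed method is stable and achieves strong performance.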

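Second, we propose an autoencoder model that operates in hyperbolic space in order to accurately learn node representations of tree-like graphs with hierarchical relationships. Motivated by recent analyses showing that Euclidean space is ill-suited to embedding trees, the model learns low-dimensional node representations using graph neural network layers in hyperbolic space. Here, the graph neural network is designed to weigh the importance of a node's neighbors using distances in hyperbolic geometry, which carry hierarchical information. We applied the proposed model to citation networks, phylogenetic trees, and networks among images, and ran node clustering and link prediction experiments; on tree-like graphs, the proposed model showed improved performance over models operating in Euclidean space.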

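Finally, we propose a contrastive learning model for heterogeneous graphs, which have multiple types of nodes and edges. We note two points: existing methods depend on meta-paths or meta-graphs that must be designed with sufficient domain knowledge before training, and the edges of many heterogeneous graphs focus on relationships between different node types. Based on this, we propose the concept of a metanode, which requires no preprocessing and enables efficient simultaneous learning of relationships within the same type in addition to relationships between different types. We also propose a metanode-based graph neural network and a contrastive learning model. Compared with heterogeneous graph learning models that use meta-paths on experiments such as node clustering, the proposed model achieved comparable or higher performance.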

The goal of unsupervised graph representation learning is to extract useful node-wise or graph-wise vector representations that are aware of the intrinsic structure of the graph and its attributes. Recently, the design of unsupervised graph representation learning methods based on graph neural networks has attracted growing attention because of their powerful representation ability. Many methods focus on homogeneous graphs, that is, networks with a single type of node and a single type of edge. However, as many types of relationships exist in the world, graphs can also be classified into various types by their structural and semantic properties. For this reason, to learn useful representations from graphs, an unsupervised learning framework must consider the characteristics of the input graph. In this dissertation, we focus on designing unsupervised learning models using graph neural networks for three widely encountered graph structures: homogeneous graphs, tree-like graphs, and heterogeneous graphs.

First, we propose a symmetric graph convolutional autoencoder that produces a low-dimensional latent representation from a homogeneous graph. In contrast to existing graph autoencoders with asymmetric decoders, the proposed autoencoder has a newly designed decoder that yields a completely symmetric autoencoder. To reconstruct node features, the decoder is based on Laplacian sharpening, the counterpart of the Laplacian smoothing performed by the encoder, which allows the graph structure to be used throughout the entire autoencoder architecture. To prevent the numerical instability caused by introducing Laplacian sharpening, we further propose a numerically stable form of Laplacian sharpening that incorporates signed graphs. Experimental results on clustering, link prediction, and visualization tasks on homogeneous graphs strongly support that the proposed model is stable and outperforms various state-of-the-art algorithms.
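
The contrast between Laplacian smoothing and Laplacian sharpening can be illustrated with a small sketch. This is not the dissertation's implementation: the update rules below are the textbook forms (move each node toward, or away from, the mean of its neighbors), and the step size `gamma`, the toy path graph, and all function names are assumptions chosen for illustration.

```python
# Illustrative sketch only. gamma, the toy graph, and the function names are
# assumptions; the dissertation's layers (and its signed-graph stabilization)
# are not reproduced here.

def neighbor_mean(adj, feats):
    """For each node, average the feature vectors of its neighbors."""
    means = []
    for i, row in enumerate(adj):
        nbrs = [j for j, w in enumerate(row) if w != 0]
        if not nbrs:
            means.append(list(feats[i]))  # isolated node: fall back to itself
            continue
        dim = len(feats[i])
        means.append([sum(feats[j][k] for j in nbrs) / len(nbrs) for k in range(dim)])
    return means

def laplacian_smooth(adj, feats, gamma=0.5):
    """Move each node toward its neighbor mean: (1 - gamma) * x + gamma * mean."""
    m = neighbor_mean(adj, feats)
    return [[(1 - gamma) * x + gamma * mx for x, mx in zip(xi, mi)]
            for xi, mi in zip(feats, m)]

def laplacian_sharpen(adj, feats, gamma=0.5):
    """Move each node away from its neighbor mean: (1 + gamma) * x - gamma * mean."""
    m = neighbor_mean(adj, feats)
    return [[(1 + gamma) * x - gamma * mx for x, mx in zip(xi, mi)]
            for xi, mi in zip(feats, m)]

def spread(feats):
    """Range of the first feature coordinate across all nodes."""
    col = [f[0] for f in feats]
    return max(col) - min(col)

# Toy path graph 0 - 1 - 2 with one scalar feature per node.
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
feats = [[0.0], [1.0], [4.0]]

smoothed = laplacian_smooth(adj, feats)
sharpened = laplacian_sharpen(adj, feats)
print(spread(feats), spread(smoothed), spread(sharpened))
```

On this toy graph, smoothing narrows the range of feature values and sharpening widens it. Applying the sharpening step repeatedly keeps amplifying differences, which gives some intuition for why an unmodified sharpening layer may be numerically unstable.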

Second, we analyze how unsupervised tasks can benefit from representations learned in hyperbolic space. To explore how well the hierarchical structure of unlabeled data can be represented in hyperbolic space, we design a novel hyperbolic message passing autoencoder that performs the entire auto-encoding process in hyperbolic space. The proposed model auto-encodes networks by fully utilizing hyperbolic geometry in message passing. Through extensive quantitative and qualitative analyses, we validate the properties and benefits of unsupervised hyperbolic representations of tree-like graphs.
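
To give a sense of what using hyperbolic geometry in message passing can involve, the sketch below computes distances in the Poincaré ball and turns them into neighbor weights. The abstract does not say which model of hyperbolic space is used, so the Poincaré ball, the softmax over negative distances, and the names `poincare_distance` and `distance_weights` are all assumptions made for illustration.

```python
import math

# Illustrative sketch only. The Poincare ball model and the softmax weighting
# are assumptions; they are not taken from the dissertation.

def poincare_distance(u, v):
    """Geodesic distance between two points inside the unit Poincare ball."""
    sq_u = sum(x * x for x in u)
    sq_v = sum(x * x for x in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v)))

def distance_weights(center, neighbors):
    """Softmax over negative distances: closer neighbors get larger weights."""
    scores = [-poincare_distance(center, n) for n in neighbors]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# The same Euclidean gap of 0.1 is a longer hyperbolic distance near the boundary.
near_origin = poincare_distance([0.0, 0.0], [0.1, 0.0])
near_boundary = poincare_distance([0.85, 0.0], [0.95, 0.0])
print(near_origin, near_boundary)

weights = distance_weights([0.0, 0.0], [[0.1, 0.0], [0.9, 0.0]])
print(weights)
```

Distances grow rapidly toward the boundary of the ball, which is the property that lets tree-like hierarchies be embedded with low distortion: nodes near the root can sit close to the origin while the many leaves spread out near the boundary.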

Third, we propose the novel concept of a metanode for message passing, which learns both heterogeneous and homogeneous relationships between any two nodes without meta-paths or meta-graphs. Unlike conventional methods, metanodes do not require a predetermined step that manipulates the given relations between different node types to enrich relational information. Going one step further, we propose a metanode-based message passing layer and a contrastive learning model that uses this layer. In our experiments on node clustering and node classification tasks, the proposed metanode-based message passing method performs competitively with state-of-the-art message passing networks for heterogeneous graphs.

Language: eng

URI: https://hdl.handle.net/10371/187748

https://dcollection.snu.ac.kr/common/orgView/000000172199

Files in This Item:

000000172199.pdf 18.71 MB

Appears in Collections:

College of Engineering/Engineering Practice School (공과대학/대학원)
- Dept. of Electrical and Computer Engineering (전기·정보공학부)
  - Theses (Ph.D. / Sc.D._전기·정보공학부)
