Positive-Unlabeled node classification under sparse labels with Mixup based GNNs

Abstract: Semi-supervised learning has recently attracted substantial interest because labels are sparse in real-world datasets. This sparsity is common in graph-structured data, where only a few labeled nodes are available during training. In this paper, we integrate Positive-Unlabeled (PU) learning with Graph Neural Networks (GNNs) to address binary node classification while exploiting the plentiful unlabeled nodes. Specifically, PU learning aims to uncover potential positive and negative interactions between nodes using only positively labeled nodes and unlabeled nodes. We propose a novel framework named Positive-Unlabeled node classification with Mixup-based GNNs (PUM-GNN). It targets settings with few labeled nodes and provides supervision for PU learning through Mixup regularization. Mixup is a promising technique for image data augmentation, but it has not been studied much in GNNs because of the irregular structure of graphs. We apply Mixup in the embedding space not only to augment the data but also to transform marginal pseudo-negative instances into partially positive augmented instances, which improves the imprecise supervision within unlabeled instances. We conduct experiments with various positive label ratios and find that PUM-GNN not only reduces over-fitting but also outperforms state-of-the-art methods under sparse labels.
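As a rough illustration of what Mixup in the embedding space can look like, the sketch below interpolates one positive node embedding with one unlabeled (pseudo-negative) node embedding and assigns the result a soft, partially positive label. It is not taken from the thesis: the function name `mixup_embeddings`, the `alpha` parameter, and the Beta(alpha, alpha) sampling of the mixing coefficient are assumptions borrowed from the standard Mixup formulation, and the embeddings are plain Python lists rather than GNN outputs.

```python
import random


def mixup_embeddings(pos_emb, unl_emb, alpha=1.0, rng=None):
    """Mix a positive embedding with an unlabeled (pseudo-negative) one.

    Hypothetical helper for illustration only; not the PUM-GNN implementation.
    Returns (mixed_embedding, soft_label, lam).
    """
    if len(pos_emb) != len(unl_emb):
        raise ValueError("embeddings must have the same dimension")
    rng = rng or random.Random()
    # Mixing coefficient in [0, 1], drawn from Beta(alpha, alpha) as in standard Mixup.
    lam = rng.betavariate(alpha, alpha)
    # Convex combination of the two embedding vectors.
    mixed = [lam * p + (1.0 - lam) * u for p, u in zip(pos_emb, unl_emb)]
    # Positive label is 1, pseudo-negative label is 0, so the soft label equals lam.
    soft_label = lam * 1.0 + (1.0 - lam) * 0.0
    return mixed, soft_label, lam


# Usage: mix a toy positive embedding with a toy unlabeled embedding.
rng = random.Random(0)
mixed, soft_label, lam = mixup_embeddings([1.0, 0.0], [0.0, 1.0], alpha=1.0, rng=rng)
```

In this toy setup the unlabeled node is treated as a pseudo-negative with label 0, so the mixed instance carries a label strictly between 0 and 1 whenever the coefficient is; how the thesis selects pairs and weights the resulting loss is not stated in the abstract.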

Language: eng

URI: https://hdl.handle.net/10371/193349

https://dcollection.snu.ac.kr/common/orgView/000000176084

Appears in Collections:

College of Engineering/Engineering Practice School (공과대학/대학원)
- Dept. of Computer Science and Engineering (컴퓨터공학부)
  - Theses (Master's Degree_컴퓨터공학부)
