Machine Learning on Encrypted Data and Homomorphic Comparison

김두형

서울대학교 중앙도서관

S-Space 소개

My S-Space

로그인이 필요합니다.

S-Space

Publications

Detailed Information

Machine Learning on Encrypted Data and Homomorphic Comparison : 암호화된 상태에서의 기계학습과 동형 비교연산

Cited 0 time in Web of Science Cited 0 time in Scopus

Export

Authors: 김두형

Advisor: 천정희

Issue Date: 2021-02

Publisher: 서울대학교 대학원

Keywords: Homomorphic Encryption ; Machine Learning ; Comparison ; Polynomial Approximation ; 동형암호 ; 기계학습 ; 비교연산 ; 다항식 근사

Description: 학위논문 (박사) -- 서울대학교 대학원 : 자연과학대학 수리과학부, 2021. 2. 천정희.

Abstract: 기계학습은 최근 분야불문 빅데이터 분석의 가장 보편적인 방법 중 하나로 인식되고 있지만, 기업이나 기관들이 실제 데이 터에 활용하기에는 데이터 프라이버시 문제가 큰 우려로 남아있다. 데이터 프라이버시 보호를 위해 여러 비암호학적 방법론들이 그 동안 활용되어 왔지만, 각 과정 속에서 불가피하게 일어나는 정보 손실은 결국 데이터 유용성을 현저하게 떨어뜨리는 단점을 가지고 있다.
동형암호는 복호화 과정없이 암호화된 상태로 연산을 지원하기 때문에 데이터 프리이버시와 유용성을 동시에 보존하고, 기계학습에 적용하기에 가장 적합한 암호학적 프리미티브 중 하나로 여겨지고 있다. 하지만 그 동안 타켓 연산함수의 깊이가 깊거나 다수의 비다항식 연산을 포함하는 경우 굉장히 연산이 오래걸리는 문제 때문에, 지금까지 개발된 동형기계학습 알고리즘의 범주는 굉장히 제한되어 있었다.
본 논문에서는 이러한 한계점을 극복하기 위한 두가지 방법론을 제시한다. 첫번째 방법론은 기존 기계학습 알고리즘의 비다항식 연산을 동형암호 친화적인 형태로 변형시키는 방법이다. 기계학습에서 가장 널리 사용되는 기본적인 방법인 로지스틱 회귀분석에 이 방법론을 기반으로 한 연구를 진행하였고, 전장유전체연관분석 등 실제 많이 활용되는 응용분야에 적용하여 실효성을 입증하였다.
두번째 방법론은 기존 기계학습 알고리즘 자체를 변형하기보다 그 내부의 비다항식 연산에 대한 효율적인 다항식 근사 방법을 찾는 것이다. 본 논문에서는 가장 널리 사용되는 비다항식 연산들 중 하나인 비교연산과 최대/최소연산을 타겟으로 연구를 진행하였고, 합성함수 근사법을 통해 이 연산들에 대한 최적 계산량을 가지는 동형암호 알고리즘 개발에 성공하였다.
As machine learning (ML) has become a universal tool of big data analysis regardless of field, data privacy has emerged one of the most significant issues to be solved for applying ML to real-world applications. Some non-cryptographic methodologies have been applied so far for privacy preservation, but the loss of information is inevitable which leads to significant reduction in data usability.
Homomorphic Encryption (HE) has been recognized one of the most appropriate cryptographic primitives for privacy-preserving ML preserving both data privacy and usability, from its beautiful functionality that allows computation over encrypted data without decryption. However, extremely high computational cost of HE computation originated from the large depth of target functions or a number of non-polynomial operations
has remained a main bottleneck of applying HE in privacy-preserving ML.
In this thesis, we introduce two main methodologies to overcome this limitation. The first one is to modify the existing ML algorithms into HE-friendly form. We instantiate this methodology to logistic regression,
which is one of the most popular methods for classification, and show the practicality by applying our method to real-world applications including genome-wide association study (GWAS).
The second one is to find efficient polynomial approximation of non-polynomial functions, instead of substituting them with some other HE-friendly operations. Based on composite function approximation methods, we develop complexity-optimal HE algorithms for comparison and min/max functions, which are the most frequently used non-polynomial operations in real-world computation. We also show the practicality of our algorithms by implementing them based on an approximate HE scheme HEaaN: The homomorphic comparison of two encrypted 16-bit integers takes only 1.22 milliseconds in amortized running time, which is 18 faster than the previous best result.

Language: eng

URI: https://hdl.handle.net/10371/176032

https://dcollection.snu.ac.kr/common/orgView/000000163766

Files in This Item:

000000163766.pdf 6.55 MB

Appears in Collections:

College of Natural Sciences (자연과학대학)
- Dept. of Mathematical Sciences (수리과학부)
  - Theses (Ph.D. / Sc.D._수리과학부)

Altmetrics

Item View & Download Count

Show Full Item Record

Find it @ SNU

트윗하기

SNS Share