Secure tumor classification by shallow neural network using homomorphic encryption

DC Field | Value | Language
dc.contributor.author | Seung wan Hong | -
dc.contributor.author | Jai Hyun Park | -
dc.contributor.author | Won hee Cho | -
dc.contributor.author | Hyeong min Choe | -
dc.contributor.author | Jung Hee Cheon | -
dc.date.accessioned | 2022-05-10T00:58:38Z | -
dc.date.available | 2022-05-10T00:58:38Z | -
dc.date.issued | 2022-04-09 | -
dc.identifier.citation | BMC Genomics. Vol 23(1):284 | ko_KR
dc.identifier.issn | 1471-2164 | -
dc.identifier.uri | https://hdl.handle.net/10371/179640 | -
dc.description.abstract | Disclosure of patients' genetic information when machine learning techniques are applied to tumor classification compromises the privacy of personal information. Homomorphic Encryption (HE), which supports operations on encrypted data, can be used to perform such computation without information leakage, but applying general machine learning algorithms directly is challenging because HE supports only a limited set of operations. In particular, non-polynomial activation functions, including the softmax function, are difficult to implement with HE and require a suitable approximation method to minimize the loss of accuracy. In the secure genome analysis competition iDASH 2020, one competition task was to develop a multi-label tumor classification method that predicts the class of samples from genetic information using HE.
We develop a secure multi-label tumor classification method using HE to ensure privacy during all computations of the model inference process. Our solution is based on a 1-layer neural network model with softmax activation and uses an approximate HE scheme. We present an approximation method that enables softmax activation in the model using HE and a technique for efficiently encoding data to reduce computational costs. In addition, we propose an HE-friendly data filtering method to reduce the size of large-scale genetic data.
We analyze a dataset from The Cancer Genome Atlas (TCGA), which consists of 3,622 samples from 11 types of cancer, with genetic features from 25,128 genes. Our preprocessing method reduces the number of genes to 4,096 or fewer and achieves a microAUC value of 0.9882 (85% accuracy) with a 1-layer shallow neural network. Using our model, we successfully compute the tumor classification inference steps on the encrypted test data in 3.75 minutes. Owing to its exceptionally high microAUC values, our solution was awarded co-first place in iDASH 2020 Track 1: Secure multi-label Tumor classification using Homomorphic Encryption.
Our solution is the first implementation of a neural network model with softmax activation using HE. The HE optimization methods presented in this work can also support machine learning implementations using HE and other challenging HE applications. | ko_KR
dc.description.sponsorship | This work was supported by Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2020-0-00840, Development and Library Implementation of Fully Homomorphic ML Algorithms supporting Neural Network Learning over Encrypted Data). | ko_KR
dc.language.iso | en | ko_KR
dc.publisher | BMC | ko_KR
dc.subject | Homomorphic encryption | -
dc.subject | Multi-label classification | -
dc.subject | Privacy | -
dc.subject | Neural network | -
dc.subject | Softmax activation | -
dc.title | Secure tumor classification by shallow neural network using homomorphic encryption | ko_KR
dc.type | Article | ko_KR
dc.identifier.doi | https://doi.org/10.1186/s12864-022-08469-w | ko_KR
dc.citation.journaltitle | BMC Genomics | ko_KR
dc.language.rfc3066 | en | -
dc.rights.holder | The Author(s) | -
dc.date.updated | 2022-04-10T03:15:24Z | -
dc.citation.number | 1 | ko_KR
dc.citation.startpage | 284 | ko_KR
dc.citation.volume | 23 | ko_KR
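
The abstract notes that softmax is hard to evaluate under HE because approximate HE schemes support only additions and multiplications. The following is a minimal plaintext sketch of one generic way to build a softmax from those two operations; it is not the approximation method of the paper, which this record does not describe. The function names, the input bound of 4.0, and the iteration counts are all illustrative assumptions, and NumPy arrays stand in for ciphertexts.

```python
import numpy as np

def poly_exp(x, k=10):
    """Approximate exp(x) by (1 + x / 2**k) ** (2**k), using k squarings.

    Only a scalar multiplication, an addition, and k multiplications are
    used, which is the kind of arithmetic an approximate HE scheme offers.
    """
    y = 1.0 + x / (2 ** k)
    for _ in range(k):
        y = y * y
    return y

def goldschmidt_inverse(s, upper, iters=16):
    """Approximate 1/s for 0 < s <= upper without any division of ciphertexts.

    `upper` is a public constant (an assumption of this sketch), so the
    rescaling s / upper is a plaintext-by-ciphertext multiplication.
    """
    x = s / upper          # now 0 < x <= 1
    a = 2.0 - x            # a = 1 + b
    b = 1.0 - x
    for _ in range(iters):
        b = b * b
        a = a * (1.0 + b)  # a converges to 1 / x
    return a / upper

def approx_softmax(logits, bound=4.0, k=10, iters=16):
    """Softmax over the last axis, assuming every logit lies in [-bound, bound]."""
    e = poly_exp(logits, k)
    s = e.sum(axis=-1, keepdims=True)
    upper = logits.shape[-1] * np.exp(bound)  # public bound on the sum
    return e * goldschmidt_inverse(s, upper, iters)

# Hypothetical 1-layer model: 4,096 filtered gene features, 11 tumor classes.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(5, 4096))    # stand-in for 5 test samples
W = rng.normal(0.0, 0.01, size=(4096, 11))   # stand-in for trained weights
probs = approx_softmax(X @ W)
print(probs.shape, probs.sum(axis=-1))
```

In a real HE setting the number of squarings and inverse iterations would be limited by the multiplicative depth of the chosen parameters, so the input bound and iteration counts would have to be tuned together rather than fixed as they are here.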
Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.