Bilingual autoencoder-based efficient harmonization of multi-source private data for accurate predictive modeling

Cited 3 times in Web of Science; cited 3 times in Scopus

Lee, Taek-Ho; Lee, Junghye; Jun, Chi-Hyuck

Issue Date
Information Sciences, Vol. 568, pp. 403-426
Sharing electronic health record data is essential for advanced analysis, but may put sensitive information at risk. Several studies have attempted to address this risk using contextual embedding, but these approaches are often inefficient and inflexible when many hospitals are involved. Thus, we propose a bilingual autoencoder-based model to harmonize local embeddings that lie in different spaces. Cross-hospital reconstruction of embeddings makes the encoders map embeddings from different hospitals to a shared space and align them spontaneously. We also suggest two-phase training to prevent distortion of embeddings during harmonization with hospitals that have biased information. In experiments, we used medical event sequences from the Medical Information Mart for Intensive Care-III dataset and simulated a setting with multiple hospitals. For evaluation, we measured the alignment of events from different hospitals and the accuracy of predicting a patient's diagnosis in the next admission, in three scenarios in which local embeddings do not work. The proposed method efficiently harmonizes embeddings in different spaces, increases prediction accuracy, and gives flexibility to include new hospitals, and so outperforms previous methods in most cases. It should be useful for predictive tasks that use distributed data while preserving private information. © 2021 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY-NC-ND license (
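The cross-hospital reconstruction idea in the abstract can be illustrated with a toy sketch. This is not the authors' model: it assumes two hypothetical hospitals A and B whose local embeddings of the same events differ only by a rotation, uses hand-picked linear encoders and decoders rather than trained networks, and every name in it (`cross_reconstruction_loss`, `enc_a`, `dec_b`, and so on) is invented for illustration. It only shows that a loss combining within-hospital and cross-hospital reconstruction is small when both encoders map into one shared space and large when they do not.

```python
import math

def matmul(rows, matrix):
    """Multiply a list of row vectors by a matrix (plain nested lists)."""
    return [[sum(row[k] * matrix[k][j] for k in range(len(matrix)))
             for j in range(len(matrix[0]))] for row in rows]

def mse(predicted, target):
    """Mean squared error over all entries of two equally shaped nested lists."""
    squared = [(p - t) ** 2
               for p_row, t_row in zip(predicted, target)
               for p, t in zip(p_row, t_row)]
    return sum(squared) / len(squared)

def cross_reconstruction_loss(x_a, x_b, enc_a, dec_a, enc_b, dec_b):
    """Within-hospital plus cross-hospital reconstruction error.

    Assumption: row i of x_a and row i of x_b embed the same event.
    """
    z_a = matmul(x_a, enc_a)  # hospital A embeddings in the shared space
    z_b = matmul(x_b, enc_b)  # hospital B embeddings in the shared space
    within = mse(matmul(z_a, dec_a), x_a) + mse(matmul(z_b, dec_b), x_b)
    # Cross terms: decode A's code with B's decoder and vice versa.
    cross = mse(matmul(z_a, dec_b), x_b) + mse(matmul(z_b, dec_a), x_a)
    return within + cross

# Toy data: hospital B's space is hospital A's space rotated by 60 degrees.
x_a = [[1.0, 0.0], [0.0, 1.0], [2.0, -1.0], [-0.5, 1.5]]
theta = math.pi / 3
rotation = [[math.cos(theta), math.sin(theta)],
            [-math.sin(theta), math.cos(theta)]]
rotation_t = [[rotation[0][0], rotation[1][0]],
              [rotation[0][1], rotation[1][1]]]
x_b = matmul(x_a, rotation)
identity = [[1.0, 0.0], [0.0, 1.0]]

# Unaligned: each hospital keeps its own space, so cross terms are large.
loss_unaligned = cross_reconstruction_loss(
    x_a, x_b, identity, identity, identity, identity)
# Aligned: B's encoder undoes the rotation and B's decoder reapplies it.
loss_aligned = cross_reconstruction_loss(
    x_a, x_b, identity, identity, rotation_t, rotation)
print(loss_unaligned, loss_aligned)
```

In a trained model the encoders and decoders would be learned by minimizing such a loss rather than set by hand; the rotation here merely stands in for the mismatch between locally trained embedding spaces.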
Related Researcher

  • Graduate School of Engineering Practice
  • Department of Engineering Practice
Research Area: Deep Learning, Machine Learning, Privacy-preserving Federated Learning, Smart Healthcare

