HarmoSATE: Harmonized embedding-based self-attentive encoder to improve accuracy of privacy-preserving federated predictive analysis

Taek Ho Lee, Suhyeon Kim, Junghye Lee, Chi Hyuck Jun

Research output: Contribution to journal › Article › peer-review

Abstract

Accurate privacy-preserving prediction using electronic health record (EHR) data distributed across multiple hospitals is essential for healthcare stakeholders to obtain useful information without privacy leakage. In this paper, we propose the harmonized embedding-based self-attentive encoder (HarmoSATE), a new method for privacy-preserving federated predictive analysis. We extract contextual embeddings of medical concepts at each local institution using Word2Vec, and then harmonize the locally trained embeddings using a neural network-based harmonization technique. The proposed method uses a deep representative encoder based on self-attention to learn the complex and dynamic patterns inherent in the harmonized embeddings of medical concepts. To evaluate our method, we conducted experiments in a distributed setting using sequential medical codes from the Medical Information Mart for Intensive Care-III (MIMIC-III) dataset. HarmoSATE achieved a significant increase in average AUC over baseline models, ranging from 3% to 8% depending on the experiment, demonstrating superior accuracy in predicting a patient's diagnosis at the next admission. HarmoSATE can thus be a useful alternative for obtaining accurate and practical results on various predictive tasks that use sensitive, distributed EHR data while preserving patients' privacy.
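The abstract outlines a three-stage pipeline: local Word2Vec embeddings of medical codes, harmonization of the locally trained embeddings into a shared space, and a self-attentive encoder for next-admission diagnosis prediction. The sketch below is only a hypothetical illustration of that kind of pipeline in Python (gensim and PyTorch); the toy visit data, the `harmonizer` placeholder, the mean pooling, and the `predict_next_admission` classification head are assumptions for illustration and do not reproduce the paper's actual harmonization technique or model.

```python
# Illustrative sketch only: toy data and simplified components, not the
# authors' implementation. The harmonization step is a placeholder linear map.
import numpy as np
import torch
import torch.nn as nn
from gensim.models import Word2Vec

# Hypothetical visit sequences of medical codes at one local institution.
local_visits = [
    ["ICD9_4280", "ICD9_5849", "ICD9_2762"],
    ["ICD9_4280", "ICD9_41401"],
    ["ICD9_5849", "ICD9_5990", "ICD9_2762"],
]

# Step 1: learn local contextual embeddings of medical codes with Word2Vec.
w2v = Word2Vec(sentences=local_visits, vector_size=32, window=5, min_count=1, sg=1)

# Step 2 (placeholder): map local embeddings into a shared, harmonized space.
harmonizer = nn.Linear(32, 32)

# Step 3: self-attentive encoder over a patient's code sequence,
# with a multi-label head for next-admission diagnosis groups.
encoder_layer = nn.TransformerEncoderLayer(d_model=32, nhead=4, batch_first=True)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)
classifier = nn.Linear(32, 10)  # 10 hypothetical diagnosis groups

def predict_next_admission(visit_codes):
    """Score hypothetical next-admission diagnosis groups for one code sequence."""
    vecs = np.stack([w2v.wv[c] for c in visit_codes])   # (T, 32) code embeddings
    x = torch.from_numpy(vecs).unsqueeze(0)             # (1, T, 32) batch of one
    encoded = encoder(harmonizer(x))                    # self-attention over sequence
    pooled = encoded.mean(dim=1)                        # simple mean pooling
    return torch.sigmoid(classifier(pooled))            # (1, 10) diagnosis scores

print(predict_next_admission(local_visits[0]).shape)    # torch.Size([1, 10])
```

In the federated setting described in the abstract, each institution would train its own Word2Vec model locally, and the harmonization step would align those separate embedding spaces before encoding; the details of that exchange are given in the full paper, not in this sketch.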

Original language: English
Article number: 120265
Journal: Information Sciences
Volume: 662
DOIs
State: Published - Mar 2024

Keywords

  • Contextual embedding
  • Deep learning
  • Harmonization
  • Privacy-preserving
  • Self-attention
