Abstract
Accurate, privacy-preserving prediction from electronic health record (EHR) data distributed across multiple hospitals is essential for giving healthcare stakeholders useful information without privacy leakage. In this paper, we propose the harmonized embedding-based self-attentive encoder (HarmoSATE), a new method for privacy-preserving federated predictive analysis. We extract contextual embeddings at each local institution using Word2Vec and then harmonize the locally trained embeddings with a neural network-based harmonization technique. The proposed method uses a deep representative encoder based on self-attention to learn the complex and dynamic patterns inherent in the harmonized embeddings of medical concepts. To evaluate our method, we conducted experiments in a distributed setting using sequential medical codes from the Medical Information Mart for Intensive Care-III (MIMIC-III) dataset. Compared with baseline models, HarmoSATE achieved a significant increase in average AUC, ranging from 3% to 8% depending on the experiment, demonstrating superior accuracy in predicting a patient's diagnosis at the next admission. HarmoSATE can be a useful alternative for obtaining accurate and practical results in various predictive tasks that use sensitive, distributed EHR data while preserving patients' privacy.
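As a rough illustration of the self-attention step mentioned in the abstract, the sketch below applies plain scaled dot-product self-attention to a short sequence of medical-code embeddings. It is not the authors' encoder: the identity query/key/value projections, the 4-dimensional vectors, and the names `self_attention` and `codes` are all assumptions made for this example, and the embeddings are made-up values rather than harmonized Word2Vec outputs.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(embeddings):
    """Scaled dot-product self-attention with identity Q/K/V projections.

    Assumption: a real encoder would learn separate projection matrices;
    here queries, keys, and values are the input embeddings themselves.
    Returns (contextualized vectors, attention weight matrix).
    """
    d = len(embeddings[0])
    scale = math.sqrt(d)
    outputs, weights = [], []
    for q in embeddings:
        # Similarity of this code to every code in the sequence.
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in embeddings]
        w = softmax(scores)
        weights.append(w)
        # Weighted average of all value vectors.
        outputs.append(
            [sum(w_j * v[i] for w_j, v in zip(w, embeddings)) for i in range(d)]
        )
    return outputs, weights


# Hypothetical embeddings for three medical codes in one patient's history.
codes = [
    [1.0, 0.0, 0.5, 0.0],
    [0.9, 0.1, 0.4, 0.0],
    [0.0, 1.0, 0.0, 0.8],
]
contextual, attn = self_attention(codes)
```

Each row of `attn` is a probability distribution over the sequence, so a code with an embedding similar to the query receives more weight in that query's contextualized vector.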
Field | Value |
---|---|
Original language | English |
Article number | 120265 |
Journal | Information Sciences |
Volume | 662 |
State | Published - Mar 2024 |
Keywords
- Contextual embedding
- Deep learning
- Harmonization
- Privacy-preserving
- Self-attention