Enhancing Recommendation Capabilities Using Multi-Head Attention-Based Federated Knowledge Distillation

Aming Wu, Young Woo Kwon

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

As the internet and mobile computing have advanced, recommendation algorithms have become widely used to manage large amounts of data. However, traditional recommendation systems usually require collecting user data on a central server, which may compromise user privacy. Furthermore, data and models from different organizations may be proprietary and cannot be shared directly, leading to data isolation. To address these challenges, we propose a method that combines federated learning (FL) with a recommendation system, using a federated knowledge distillation algorithm based on a multi-head attention mechanism. In the proposed approach, knowledge distillation is built on top of FL to guide the training of the local network and facilitate knowledge transfer. Further, to address the non-independent and identically distributed (non-IID) training samples in FL, a Wasserstein distance and regularization terms are incorporated into the objective function of federated knowledge distillation to reduce the distribution difference between the server and client networks. A multi-head attention mechanism is used to enhance user encoding information, and a combined adaptive learning rate is adopted to further improve convergence. Compared to the benchmark model, this approach demonstrates significant improvements: accuracy improves by up to 10%, model training time is shortened by approximately 45%, and average error and NDCG values decrease by 10%.
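
The abstract does not give the objective function itself, so the following is only a minimal sketch of how a client-side loss of the kind described might be assembled. It assumes a linear rating predictor, a squared-error distillation term against the server (teacher) predictions, a 1-D Wasserstein-1 distance between the two sets of predicted ratings, and a proximal regularizer on the weights. The function names and the weights `alpha`, `beta`, and `mu` are hypothetical and are not taken from the paper.

```python
import numpy as np


def wasserstein_1d(a, b):
    """Wasserstein-1 distance between two equal-size 1-D empirical samples.

    For equal-size samples this is the mean absolute difference
    between the sorted values.
    """
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    if a.shape != b.shape:
        raise ValueError("samples must have the same length")
    return float(np.mean(np.abs(a - b)))


def client_objective(w_client, w_server, features, ratings,
                     alpha=0.5, beta=0.1, mu=0.01):
    """Hypothetical client loss (an assumption, not the paper's formula).

    task    : squared error of the client predictions against local ratings
    distill : squared error between client and server (teacher) predictions
    wdist   : Wasserstein-1 distance between the two prediction samples
    prox    : squared distance between client and server weights
    """
    client_pred = features @ w_client
    server_pred = features @ w_server  # teacher output, treated as fixed
    task = np.mean((client_pred - ratings) ** 2)
    distill = np.mean((client_pred - server_pred) ** 2)
    wdist = wasserstein_1d(client_pred, server_pred)
    prox = 0.5 * np.sum((w_client - w_server) ** 2)
    return float(task + alpha * distill + beta * wdist + mu * prox)


if __name__ == "__main__":
    # Synthetic data, for illustration only.
    rng = np.random.default_rng(0)
    features = rng.normal(size=(32, 4))
    w_server = rng.normal(size=4)
    ratings = features @ w_server + 0.1 * rng.normal(size=32)
    w_client = w_server + 0.5  # a client that has drifted from the server
    print(client_objective(w_client, w_server, features, ratings))
```

When the client and server weights coincide, the distillation, Wasserstein, and proximal terms all vanish and only the local task loss remains. This is the sense in which the extra terms penalize divergence between the server and client networks under non-IID data.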

Original language: English
Pages (from-to): 45850-45861
Number of pages: 12
Journal: IEEE Access
Volume: 11
State: Published - 2023

Keywords

  • adaptive learning rate
  • Federated learning
  • multi-head attention
  • Wasserstein distance
