TY - JOUR
T1 - Enhancing Recommendation Capabilities Using Multi-Head Attention-Based Federated Knowledge Distillation
AU - Wu, Aming
AU - Kwon, Young Woo
N1 - Publisher Copyright:
© 2013 IEEE.
PY - 2023
Y1 - 2023
AB - As the internet and mobile computing have advanced, recommendation algorithms are widely used to manage large amounts of data. However, traditional recommendation systems usually require collecting user data on a central server, which may expose user privacy. Furthermore, data and models from different organizations may be proprietary and cannot be shared directly, leading to data isolation. To address these challenges, we propose a method that combines federated learning (FL) with recommendation systems through a federated knowledge distillation algorithm based on a multi-head attention mechanism. In the proposed approach, knowledge distillation is introduced on top of FL to guide the training of local networks and facilitate knowledge transfer. Further, to address the non-independent and identically distributed (non-IID) nature of training samples in FL, a Wasserstein distance term and regularization terms are incorporated into the objective function of federated knowledge distillation to reduce the distribution difference between the server and client networks. A multi-head attention mechanism is used to enrich the user encoding information, and a combined adaptive learning rate is adopted to further improve convergence. Compared with the benchmark model, the proposed approach improves accuracy by up to 10%, shortens model training time by approximately 45%, and reduces the average error and NDCG values by 10%.
KW - adaptive learning rate
KW - Federated learning
KW - multi-head attention
KW - Wasserstein distance
UR - http://www.scopus.com/inward/record.url?scp=85159695210&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2023.3271678
DO - 10.1109/ACCESS.2023.3271678
M3 - Article
AN - SCOPUS:85159695210
SN - 2169-3536
VL - 11
SP - 45850
EP - 45861
JO - IEEE Access
JF - IEEE Access
ER -