TY - JOUR
T1 - DeepWalk with Reinforcement Learning (DWRL) for node embedding
AU - Jeyaraj, Rathinaraja
AU - Balasubramaniam, Thirunavukarasu
AU - Balasubramaniam, Anandkumar
AU - Paul, Anand
N1 - Publisher Copyright:
© 2023 Elsevier Ltd
PY - 2024/6/1
Y1 - 2024/6/1
N2 - DeepWalk is used to convert nodes in an original graph into equivalent vectors in a latent space for performing various predictive tasks. To ensure second-order structural similarity between nodes in the original graph and their vectors in the latent space, dot products are applied to each pair of nodes explored on the random walk (RW) in the latent space. However, dot products for graphs with millions of nodes and billions of edges are computationally expensive. To minimize the computation time required for calculating the second-order structural similarity, DeepWalk with reinforcement learning (DWRL) is proposed herein. In DWRL, a level pointer for each node in the original graph is prepared. By identifying common nodes between each pair of nodes in the original graph, the number of computations in the dot product in the latent space is reduced, thereby ensuring second-order structural similarity. Additionally, repeated selection of the same node during RWs produces redundant samples for training. Therefore, the subsampling technique is used to choose the next node based on its degree, which improves the generalization of node representations in the latent space and increases accuracy. The proposed techniques are applied to popular datasets to perform multilabel classification and link prediction tasks, and their efficiency in reducing the computation time is verified. The proposed DWRL reduces the computation time to build latent vectors by 47% for large graphs and improves the average micro and macro F1 scores by up to 12%. The link prediction performance also increases by up to 20%.
AB - DeepWalk is used to convert nodes in an original graph into equivalent vectors in a latent space for performing various predictive tasks. To ensure second-order structural similarity between nodes in the original graph and their vectors in the latent space, dot products are applied to each pair of nodes explored on the random walk (RW) in the latent space. However, dot products for graphs with millions of nodes and billions of edges are computationally expensive. To minimize the computation time required for calculating the second-order structural similarity, DeepWalk with reinforcement learning (DWRL) is proposed herein. In DWRL, a level pointer for each node in the original graph is prepared. By identifying common nodes between each pair of nodes in the original graph, the number of computations in the dot product in the latent space is reduced, thereby ensuring second-order structural similarity. Additionally, repeated selection of the same node during RWs produces redundant samples for training. Therefore, the subsampling technique is used to choose the next node based on its degree, which improves the generalization of node representations in the latent space and increases accuracy. The proposed techniques are applied to popular datasets to perform multilabel classification and link prediction tasks, and their efficiency in reducing the computation time is verified. The proposed DWRL reduces the computation time to build latent vectors by 47% for large graphs and improves the average micro and macro F1 scores by up to 12%. The link prediction performance also increases by up to 20%.
KW - DeepWalk
KW - Link prediction
KW - Node classification
KW - Node embedding
KW - Reinforcement learning
UR - http://www.scopus.com/inward/record.url?scp=85183366514&partnerID=8YFLogxK
U2 - 10.1016/j.eswa.2023.122819
DO - 10.1016/j.eswa.2023.122819
M3 - Article
AN - SCOPUS:85183366514
SN - 0957-4174
VL - 243
JO - Expert Systems with Applications
JF - Expert Systems with Applications
M1 - 122819
ER -