TY - GEN
T1 - Adaptive Natural Gradient Method for Learning Neural Networks with Large Data set in Mini-Batch Mode
AU - Park, Hyeyoung
AU - Lee, Kwanyong
N1 - Publisher Copyright:
© 2019 IEEE.
PY - 2019/3/18
Y1 - 2019/3/18
N2 - Natural gradient learning, a gradient descent learning method, is known to have ideal convergence properties when training hierarchical machines such as layered neural networks. However, a few limitations degrade its practical usability: it requires the true probability density function of the input variables, and it incurs a heavy computational cost due to matrix inversion. Although an adaptive approximation has been developed, it is derived for the online learning mode, in which a single update is performed for each data sample. Noting that the online learning mode is not appropriate for tasks with a huge number of training samples, this paper proposes a practical implementation of natural gradient for the mini-batch learning mode, which is the most common setting in real applications with large datasets. Computational experiments on benchmark datasets show the efficiency of the proposed method.
AB - Natural gradient learning, a gradient descent learning method, is known to have ideal convergence properties when training hierarchical machines such as layered neural networks. However, a few limitations degrade its practical usability: it requires the true probability density function of the input variables, and it incurs a heavy computational cost due to matrix inversion. Although an adaptive approximation has been developed, it is derived for the online learning mode, in which a single update is performed for each data sample. Noting that the online learning mode is not appropriate for tasks with a huge number of training samples, this paper proposes a practical implementation of natural gradient for the mini-batch learning mode, which is the most common setting in real applications with large datasets. Computational experiments on benchmark datasets show the efficiency of the proposed method.
KW - Gradient descent learning
KW - Mini-batch learning mode
KW - Natural gradient learning
KW - Neural networks
KW - On-line learning
UR - http://www.scopus.com/inward/record.url?scp=85063896935&partnerID=8YFLogxK
U2 - 10.1109/ICAIIC.2019.8669082
DO - 10.1109/ICAIIC.2019.8669082
M3 - Conference contribution
AN - SCOPUS:85063896935
T3 - 1st International Conference on Artificial Intelligence in Information and Communication, ICAIIC 2019
SP - 306
EP - 310
BT - 1st International Conference on Artificial Intelligence in Information and Communication, ICAIIC 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 1st International Conference on Artificial Intelligence in Information and Communication, ICAIIC 2019
Y2 - 11 February 2019 through 13 February 2019
ER -
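
Note: the abstract describes an adaptive natural-gradient update extended to mini-batch mode. The following is a minimal sketch of what such an update might look like, assuming the classical recursive inverse-Fisher approximation for adaptive natural gradient applied to a batch-averaged gradient. The function name, step sizes, and the choice to probe the recursion with the batch mean are illustrative assumptions, not the authors' exact algorithm from the paper.

import numpy as np

def adaptive_natural_gradient_step(w, G_inv, grads, lr=0.01, eps=1e-3):
    """One hypothetical mini-batch adaptive natural-gradient update.

    w      : current parameter vector, shape (d,)
    G_inv  : running estimate of the inverse Fisher matrix, shape (d, d)
    grads  : per-sample loss gradients for the mini-batch, shape (B, d)
    lr     : learning rate for the parameter update
    eps    : step size for the inverse-Fisher recursion
    """
    # Mini-batch mean gradient (assumption: the batch mean drives both
    # the Fisher estimate and the parameter update).
    g_bar = grads.mean(axis=0)

    # Recursive rank-one update of the inverse Fisher estimate:
    #   G_inv <- (1 + eps) * G_inv - eps * (G_inv g)(G_inv g)^T
    # which avoids explicit matrix inversion at every step.
    v = G_inv @ g_bar
    G_inv = (1.0 + eps) * G_inv - eps * np.outer(v, v)

    # Natural-gradient step: precondition the gradient by the inverse Fisher.
    w = w - lr * (G_inv @ g_bar)
    return w, G_inv

# Usage sketch (all shapes and values illustrative):
# d, B = 10, 32
# w, G_inv = np.zeros(d), np.eye(d)
# grads = np.random.randn(B, d)   # per-sample gradients from your model
# w, G_inv = adaptive_natural_gradient_step(w, G_inv, grads)

The design choice reflected here is the one motivated in the abstract: maintaining a running estimate of the inverse Fisher matrix sidesteps both the need for the true input density and the per-step matrix inversion, while batch averaging adapts the originally online recursion to the mini-batch setting.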