TY - JOUR
T1 - Elastic exponential linear units for convolutional neural networks
AU - Kim, Daeho
AU - Kim, Jinah
AU - Kim, Jaeil
N1 - Publisher Copyright:
© 2020 The Author(s)
PY - 2020/9/17
Y1 - 2020/9/17
N2 - Activation functions play important roles in determining the depth and non-linearity of deep learning models. Since the Rectified Linear Unit (ReLU) was introduced, many modifications, in which noise is intentionally injected, have been proposed to avoid overfitting. The Exponential Linear Unit (ELU) and its variants, with trainable parameters, have been proposed to reduce the bias shift effects that are often observed in ReLU-type activation functions. In this paper, we propose a novel activation function, called the Elastic Exponential Linear Unit (EELU), which combines the advantages of both types of activation functions in a generalized form. EELU has an elastic slope in the positive part and preserves the negative signal using a small non-zero gradient. We also present a new strategy for injecting neuronal noise, drawn from a Gaussian distribution, into the activation function to improve generalization. By visualizing the latent features of convolutional neural networks, we demonstrate that EELU can represent a wider variety of features with random noise than other activation functions. We evaluated the effectiveness of EELU through extensive image classification experiments on the CIFAR-10/CIFAR-100, ImageNet, and Tiny ImageNet datasets. Our experimental results show that EELU achieved better generalization performance and higher classification accuracy than conventional activation functions such as ReLU, ELU, ReLU- and ELU-like variants, Scaled ELU, and Swish. EELU also improved image classification with a smaller number of training samples, owing to its noise injection strategy, which allows significant variation in function outputs, including deactivation.
AB - Activation functions play important roles in determining the depth and non-linearity of deep learning models. Since the Rectified Linear Unit (ReLU) was introduced, many modifications, in which noise is intentionally injected, have been proposed to avoid overfitting. The Exponential Linear Unit (ELU) and its variants, with trainable parameters, have been proposed to reduce the bias shift effects that are often observed in ReLU-type activation functions. In this paper, we propose a novel activation function, called the Elastic Exponential Linear Unit (EELU), which combines the advantages of both types of activation functions in a generalized form. EELU has an elastic slope in the positive part and preserves the negative signal using a small non-zero gradient. We also present a new strategy for injecting neuronal noise, drawn from a Gaussian distribution, into the activation function to improve generalization. By visualizing the latent features of convolutional neural networks, we demonstrate that EELU can represent a wider variety of features with random noise than other activation functions. We evaluated the effectiveness of EELU through extensive image classification experiments on the CIFAR-10/CIFAR-100, ImageNet, and Tiny ImageNet datasets. Our experimental results show that EELU achieved better generalization performance and higher classification accuracy than conventional activation functions such as ReLU, ELU, ReLU- and ELU-like variants, Scaled ELU, and Swish. EELU also improved image classification with a smaller number of training samples, owing to its noise injection strategy, which allows significant variation in function outputs, including deactivation.
KW - Activation function
KW - Convolutional neural network
KW - Elastic Exponential Linear Unit (EELU)
KW - ELU
KW - Gaussian noise
KW - ReLU
UR - http://www.scopus.com/inward/record.url?scp=85085103725&partnerID=8YFLogxK
U2 - 10.1016/j.neucom.2020.03.051
DO - 10.1016/j.neucom.2020.03.051
M3 - Article
AN - SCOPUS:85085103725
SN - 0925-2312
VL - 406
SP - 253
EP - 266
JO - Neurocomputing
JF - Neurocomputing
ER -