TY - JOUR
T1 - A comprehensive exploration of approximate DNN models with a novel floating-point simulation framework
AU - Kwak, Myeongjin
AU - Kim, Jeonggeun
AU - Kim, Yongtae
N1 - Publisher Copyright: © 2024 Elsevier B.V.
PY - 2024/8
Y1 - 2024/8
N2 - This paper introduces TorchAxf, a framework for fast simulation of diverse approximate deep neural network (DNN) models, including spiking neural networks (SNNs). The proposed framework utilizes various approximate adders and multipliers, supports industrial standard reduced precision floating-point formats, such as bfloat16, and accommodates user-customized precision representations. Leveraging GPU acceleration on the PyTorch framework, TorchAxf accelerates approximate DNN training and inference. In addition, it allows seamless integration of arbitrary approximate arithmetic algorithms with C/C++ behavioral models to emulate approximate DNN hardware accelerators. We utilize the proposed TorchAxf framework to assess twelve popular DNN models under approximate multiply-and-accumulate (MAC) operations. Through comprehensive experiments, we determine the suitable degree of floating-point arithmetic approximation for these DNN models without significant accuracy loss and offer the optimal reduced precision formats for each DNN model. Additionally, we demonstrate that approximate-aware re-training can rectify errors and enhance pre-trained DNN models under reduced precision formats. Furthermore, TorchAxf, operating on GPU, remarkably reduces simulation time for complex DNN models using approximate arithmetic by up to 131.38× compared to the baseline optimized CPU implementation. Finally, we compare the proposed framework with state-of-the-art frameworks to highlight its superiority.
AB - This paper introduces TorchAxf, a framework for fast simulation of diverse approximate deep neural network (DNN) models, including spiking neural networks (SNNs). The proposed framework utilizes various approximate adders and multipliers, supports industrial standard reduced precision floating-point formats, such as bfloat16, and accommodates user-customized precision representations. Leveraging GPU acceleration on the PyTorch framework, TorchAxf accelerates approximate DNN training and inference. In addition, it allows seamless integration of arbitrary approximate arithmetic algorithms with C/C++ behavioral models to emulate approximate DNN hardware accelerators. We utilize the proposed TorchAxf framework to assess twelve popular DNN models under approximate multiply-and-accumulate (MAC) operations. Through comprehensive experiments, we determine the suitable degree of floating-point arithmetic approximation for these DNN models without significant accuracy loss and offer the optimal reduced precision formats for each DNN model. Additionally, we demonstrate that approximate-aware re-training can rectify errors and enhance pre-trained DNN models under reduced precision formats. Furthermore, TorchAxf, operating on GPU, remarkably reduces simulation time for complex DNN models using approximate arithmetic by up to 131.38× compared to the baseline optimized CPU implementation. Finally, we compare the proposed framework with state-of-the-art frameworks to highlight its superiority.
KW - Approximate computing
KW - Deep neural network (DNN)
KW - Floating-point
KW - GPU
KW - PyTorch
KW - Spiking neural network (SNN)
UR - http://www.scopus.com/inward/record.url?scp=85194925173&partnerID=8YFLogxK
U2 - 10.1016/j.peva.2024.102423
DO - 10.1016/j.peva.2024.102423
M3 - Article
AN - SCOPUS:85194925173
SN - 0166-5316
VL - 165
JO - Performance Evaluation
JF - Performance Evaluation
M1 - 102423
ER -