TY - GEN
T1 - Work-in-Progress
T2 - 23rd ACM SIGBED International Conference on Embedded Software, EMSOFT 2023
AU - Kwon, Jisu
AU - Park, Daejin
N1 - Publisher Copyright: © 2023 ACM.
PY - 2023
Y1 - 2023
N2 - The resource constraints of MCU-based platforms limit their ability to utilize high-performance accelerators such as GPUs or servers, mainly due to insufficient resources for ML applications. Currently, solutions utilizing accelerators connected as peripherals to the on-chip bus of microcontroller units (MCUs) are being proposed. We define this approach as a Micro-Accelerator (MA). Due to the necessity of connecting the MA to the MCU core and the on-chip bus within the chip, conducting an iterative full-system evaluation of the embedded software that drives the MA poses significant challenges. To address this challenge, we propose a framework that enables rapid prototyping of a custom-designed MA and facilitates profiling of its acceleration performance. Experimental results evaluating the performance of the MA for two tiny machine learning (TinyML) applications within the proposed framework demonstrate cycle latency reductions of 84.32% and 61.32%, respectively, compared to a general machine learning framework.
AB - The resource constraints of MCU-based platforms limit their ability to utilize high-performance accelerators such as GPUs or servers, mainly due to insufficient resources for ML applications. Currently, solutions utilizing accelerators connected as peripherals to the on-chip bus of microcontroller units (MCUs) are being proposed. We define this approach as a Micro-Accelerator (MA). Due to the necessity of connecting the MA to the MCU core and the on-chip bus within the chip, conducting an iterative full-system evaluation of the embedded software that drives the MA poses significant challenges. To address this challenge, we propose a framework that enables rapid prototyping of a custom-designed MA and facilitates profiling of its acceleration performance. Experimental results evaluating the performance of the MA for two tiny machine learning (TinyML) applications within the proposed framework demonstrate cycle latency reductions of 84.32% and 61.32%, respectively, compared to a general machine learning framework.
UR - http://www.scopus.com/inward/record.url?scp=85179848411&partnerID=8YFLogxK
U2 - 10.1145/3607890.3608461
DO - 10.1145/3607890.3608461
M3 - Conference contribution
AN - SCOPUS:85179848411
T3 - Proceedings - 2023 International Conference on Embedded Software, EMSOFT 2023
SP - 15
EP - 16
BT - Proceedings - 2023 International Conference on Embedded Software, EMSOFT 2023
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 17 September 2023 through 22 September 2023
ER -