TY - GEN
T1 - Mosaic
T2 - 28th International Conference on Parallel Architectures and Compilation Techniques, PACT 2019
AU - Han, Myeonggyun
AU - Hyun, Jihoon
AU - Park, Seongbeom
AU - Park, Jinsu
AU - Baek, Woongki
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019/9
Y1 - 2019/9
AB - Heterogeneous embedded systems have emerged as a promising solution for accurate and efficient deep-learning inference on mobile devices. Despite extensive prior work, system-software support that efficiently executes inference workloads by judiciously considering their performance and energy heterogeneity, communication overheads, and constraints remains unexplored. To bridge this gap, we propose MOSAIC, heterogeneity-, communication-, and constraint-aware model slicing and execution for accurate and efficient inference on heterogeneous embedded systems. MOSAIC uses dynamic programming to generate an efficient model slicing and execution plan for the target inference workload. MOSAIC significantly reduces inference latency and energy, exhibits high estimation accuracy, and incurs small overheads.
KW - Heterogeneous Embedded Systems
KW - Inference
KW - Model Slicing and Execution
UR - https://www.scopus.com/pages/publications/85075455072
U2 - 10.1109/PACT.2019.00021
DO - 10.1109/PACT.2019.00021
M3 - Conference contribution
AN - SCOPUS:85075455072
T3 - Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT
SP - 165
EP - 177
BT - Proceedings - 2019 28th International Conference on Parallel Architectures and Compilation Techniques, PACT 2019
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 21 September 2019 through 25 September 2019
ER -