TY - JOUR
T1 - Accelerating DSP Applications on a 16-Bit Processor
T2 - Block RAM Integration and Distributed Arithmetic Approach
AU - Bharathi, M.
AU - Mohanarangam, Krithikaa
AU - M Shirur, Yasha Jyothi
AU - Choi, Jun Rim
N1 - Publisher Copyright:
© 2023 by the authors.
PY - 2023/10
Y1 - 2023/10
N2 - Modern processors have improved performance but still face challenges such as power consumption, storage limitations, and the need for faster processing. The 16-bit Digital Signal Processors (DSPs) accelerate DSP applications by significantly enhancing speed and performance for tasks including audio processing, telecommunications, image and video processing, wireless communication, and consumer electronics. This paper presents a novel technique for accelerating DSP applications on a 16-bit processor by combining two methods: Block Random Access Memory (BRAM) and Distributed Arithmetic (DA). Integrating BRAM as a replacement for conventional RAM minimizes timing and critical route delays, improving processor efficiency and performance. Furthermore, the Distributed Arithmetic approach enhances performance and efficiency by utilizing precomputed lookup tables to expedite multiplication operations within the Arithmetic and Logic Unit (ALU). We use the Xilinx Vivado tool, a robust development environment for FPGA-based systems, for the design process and execute the hardware implementation using the Genesys2 Kintex board. The proposed work produces improved efficiency with a cycle per instruction of 2, where the delay is 2.009 ns, the critical path delay is 8.182 ns, and the power consumption is 4 mW.
AB - Modern processors have improved performance but still face challenges such as power consumption, storage limitations, and the need for faster processing. The 16-bit Digital Signal Processors (DSPs) accelerate DSP applications by significantly enhancing speed and performance for tasks including audio processing, telecommunications, image and video processing, wireless communication, and consumer electronics. This paper presents a novel technique for accelerating DSP applications on a 16-bit processor by combining two methods: Block Random Access Memory (BRAM) and Distributed Arithmetic (DA). Integrating BRAM as a replacement for conventional RAM minimizes timing and critical route delays, improving processor efficiency and performance. Furthermore, the Distributed Arithmetic approach enhances performance and efficiency by utilizing precomputed lookup tables to expedite multiplication operations within the Arithmetic and Logic Unit (ALU). We use the Xilinx Vivado tool, a robust development environment for FPGA-based systems, for the design process and execute the hardware implementation using the Genesys2 Kintex board. The proposed work produces improved efficiency with a cycle per instruction of 2, where the delay is 2.009 ns, the critical path delay is 8.182 ns, and the power consumption is 4 mW.
KW - 16-bit processor
KW - Block RAM (BRAM)
KW - Genesys2 Kintex
KW - Xilinx Vivado
KW - distributed arithmetic (DA)
UR - http://www.scopus.com/inward/record.url?scp=85175240086&partnerID=8YFLogxK
U2 - 10.3390/electronics12204236
DO - 10.3390/electronics12204236
M3 - Article
AN - SCOPUS:85175240086
SN - 2079-9292
VL - 12
JO - Electronics (Switzerland)
JF - Electronics (Switzerland)
IS - 20
M1 - 4236
ER -