TY - GEN
T1 - Flash-Cosmos
T2 - 55th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2022
AU - Park, Jisung
AU - Azizi, Roknoddin
AU - Oliveira, Geraldo F.
AU - Sadrosadati, Mohammad
AU - Nadig, Rakesh
AU - Novo, David
AU - Gomez-Luna, Juan
AU - Kim, Myungsuk
AU - Mutlu, Onur
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Bulk bitwise operations, i.e., bitwise operations on large bit vectors, are prevalent in a wide range of important application domains, including databases, graph processing, genome analysis, cryptography, and hyper-dimensional computing. In conventional systems, the performance and energy efficiency of bulk bitwise operations are bottlenecked by data movement between the compute units (e.g., CPUs and GPUs) and the memory hierarchy. In-flash processing (i.e., processing data inside NAND flash chips) has a high potential to accelerate bulk bitwise operations by fundamentally reducing data movement through the entire memory hierarchy, especially when the processed data does not fit into main memory. We identify two key limitations of the state-of-the-art in-flash processing technique for bulk bitwise operations: (i) it falls short of maximally exploiting the bit-level parallelism of bulk bitwise operations that could be enabled by leveraging the unique cell-array architecture and operating principles of NAND flash memory; (ii) it is unreliable because it is not designed to take into account the highly error-prone nature of NAND flash memory. We propose Flash-Cosmos (Flash Computation with One-Shot Multi-Operand Sensing), a new in-flash processing technique that significantly increases the performance and energy efficiency of bulk bitwise operations while providing high reliability. Flash-Cosmos introduces two key mechanisms that can be easily supported in modern NAND flash chips: (i) Multi-Wordline Sensing (MWS), which enables bulk bitwise operations on a large number of operands (tens of operands) with a single sensing operation, and (ii) Enhanced SLC-mode Programming (ESP), which enables reliable computation inside NAND flash memory. We demonstrate the feasibility of performing bulk bitwise operations with high reliability in Flash-Cosmos by testing 160 real 3D NAND flash chips. Our evaluation shows that Flash-Cosmos improves average performance and energy efficiency by 3.5×/32× and 3.3×/95×, respectively, over the state-of-the-art in-flash/outside-storage processing techniques across three real-world applications.
AB - Bulk bitwise operations, i.e., bitwise operations on large bit vectors, are prevalent in a wide range of important application domains, including databases, graph processing, genome analysis, cryptography, and hyper-dimensional computing. In conventional systems, the performance and energy efficiency of bulk bitwise operations are bottlenecked by data movement between the compute units (e.g., CPUs and GPUs) and the memory hierarchy. In-flash processing (i.e., processing data inside NAND flash chips) has a high potential to accelerate bulk bitwise operations by fundamentally reducing data movement through the entire memory hierarchy, especially when the processed data does not fit into main memory. We identify two key limitations of the state-of-the-art in-flash processing technique for bulk bitwise operations: (i) it falls short of maximally exploiting the bit-level parallelism of bulk bitwise operations that could be enabled by leveraging the unique cell-array architecture and operating principles of NAND flash memory; (ii) it is unreliable because it is not designed to take into account the highly error-prone nature of NAND flash memory. We propose Flash-Cosmos (Flash Computation with One-Shot Multi-Operand Sensing), a new in-flash processing technique that significantly increases the performance and energy efficiency of bulk bitwise operations while providing high reliability. Flash-Cosmos introduces two key mechanisms that can be easily supported in modern NAND flash chips: (i) Multi-Wordline Sensing (MWS), which enables bulk bitwise operations on a large number of operands (tens of operands) with a single sensing operation, and (ii) Enhanced SLC-mode Programming (ESP), which enables reliable computation inside NAND flash memory. We demonstrate the feasibility of performing bulk bitwise operations with high reliability in Flash-Cosmos by testing 160 real 3D NAND flash chips. Our evaluation shows that Flash-Cosmos improves average performance and energy efficiency by 3.5×/32× and 3.3×/95×, respectively, over the state-of-the-art in-flash/outside-storage processing techniques across three real-world applications.
KW - bitwise operation
KW - in flash processing
KW - NAND flash memory
KW - near data processing
KW - solid state drive
UR - http://www.scopus.com/inward/record.url?scp=85138739514&partnerID=8YFLogxK
U2 - 10.1109/MICRO56248.2022.00069
DO - 10.1109/MICRO56248.2022.00069
M3 - Conference contribution
AN - SCOPUS:85138739514
T3 - Proceedings of the Annual International Symposium on Microarchitecture, MICRO
SP - 937
EP - 955
BT - Proceedings - 2022 55th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2022
PB - IEEE Computer Society
Y2 - 1 October 2022 through 5 October 2022
ER -