Abstract
Recent years have seen marked advances in deep neural networks (DNNs), driven by improvements in hardware and increasingly large datasets. DNNs are now routinely used in domains including computer vision and language processing. At their core, DNNs rely heavily on multiply-accumulate (MAC) operations, making them well suited to the highly parallel computational capabilities of GPUs. GPUs, however, are von Neumann in architecture and physically separate memory blocks from computational blocks. This exacts an unavoidable time and energy cost associated with data transport, known as the von Neumann bottleneck. While incremental advances in digital hardware accelerators will continue to mitigate the von Neumann bottleneck, we explore the potentially game-changing advantages of non-von Neumann architectures that perform MAC operations within the memory itself.
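As a minimal illustration of the abstract's central point (not code from the paper), the sketch below shows why a dense neural-network layer reduces to MAC operations: each output is a dot product, i.e., a chain of multiplies and accumulations, which is exactly what both GPUs and in-memory computing substrates are asked to perform.

```python
def dense_layer(W, x):
    """Matrix-vector product computed as explicit multiply-accumulates.

    W: list of rows (weights), x: input vector.
    Each output neuron is out[i] = sum_j W[i][j] * x[j].
    """
    out = []
    for row in W:
        acc = 0.0
        for w, xj in zip(row, x):
            acc += w * xj  # one MAC: multiply, then accumulate
        out.append(acc)
    return out

# Example: a 2x2 weight matrix applied to an input vector.
print(dense_layer([[1.0, 2.0], [3.0, 4.0]], [1.0, 1.0]))  # → [3.0, 7.0]
```

On a von Neumann machine, every weight `W[i][j]` must be fetched from memory to the compute unit for each MAC; an in-memory architecture instead performs the multiply-accumulate where the weights are stored, avoiding that data transport.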
© 2019 IEEE