Abstract:
Von Neumann-style information processing systems — in which a “memory” delivers operations and then operands to a dedicated “compute unit” — are the basis of modern computer architectures. With the help of Moore’s Law and Dennard scaling, the throughput of these compute units has increased dramatically over the past 50 years, far exceeding the pace of improvements in data communication between memory and compute. As a result, the “Von Neumann bottleneck” now dominates considerations of system throughput and energy consumption, especially for Deep Neural Network (DNN) workloads. Non-Von Neumann architectures, such as those that move computation to the edge of memory crossbar arrays, can significantly reduce the cost of data communication.
Crossbar arrays of resistive non-volatile memories (NVM) offer a novel solution for deep learning tasks by computing matrix-vector multiplication directly within analog memory arrays [1]. The highly parallel structure and computation at the location of the data enable fast and energy-efficient multiply-accumulate operations, which are the workhorse computations within most deep learning algorithms. In this presentation, we will focus on our implementation of analog memory cells that combine Phase-Change Memory (PCM) with a 3-Transistor 1-Capacitor (3T1C) element for training [2], and PCM-based cells for inference [3]. In both cases, DNN weights are stored within large device arrays as analog conductances. Software-equivalent accuracy on various datasets has been achieved in a mixed software-hardware demonstration despite the considerable imperfections of existing NVM devices, such as noise and variability. We will discuss the device, circuit, and system requirements, as well as the performance outlook for further technology development [4].
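To illustrate the idea (this is a minimal sketch, not the hardware implementation described in the talk): in an idealized resistive crossbar, weights are stored as device conductances, input activations are applied as row voltages, and each column wire sums the resulting currents, so a full matrix-vector product is obtained in a single parallel step via Ohm's law and Kirchhoff's current law. The function and noise level below are illustrative assumptions only.

```python
# Idealized model of analog crossbar matrix-vector multiplication.
# Conductance noise stands in for PCM imperfections such as programming
# variability; real cells also exhibit drift, read noise, and nonlinearity.
import numpy as np

rng = np.random.default_rng(0)

def crossbar_mvm(weights, inputs, noise_std=0.02):
    """Model one analog multiply-accumulate pass through a crossbar.

    weights   : (rows, cols) target weight matrix, mapped to conductances
    inputs    : (rows,) input activations, encoded as row voltages
    noise_std : relative conductance variability (hypothetical value)
    """
    # Imperfect programming: each stored conductance deviates from its target.
    conductances = weights * (1.0 + noise_std * rng.standard_normal(weights.shape))
    # Each column current accumulates all row contributions in parallel:
    # I_j = sum_i G[i, j] * V[i]
    return inputs @ conductances

weights = rng.standard_normal((512, 256))
inputs = rng.standard_normal(512)
# With zero noise the crossbar result matches the digital matrix-vector product.
print(np.allclose(crossbar_mvm(weights, inputs, noise_std=0.0), inputs @ weights))
```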
References
[1] G. W. Burr et al., “Experimental demonstration and tolerancing of a large-scale neural network (165,000 synapses), using phase-change memory as the synaptic weight element,” IEDM Tech. Digest, 29.5 (2014).
[2] S. Ambrogio et al., “Equivalent-Accuracy Accelerated Neural Network Training using Analog Memory”, Nature, 558 (7708), 60 (2018).
[3] H. Tsai et al., “Inference of Long-Short-Term Memory networks at software-equivalent accuracy using 2.5M analog Phase Change Memory devices”, VLSI, T8-1 (2019).
[4] H.-Y. Chang et al., “AI hardware acceleration with analog memory: micro-architectures for low energy at high speed,” IBM Journal of Research and Development, 63 (6), 8:1-14 (2019).
Bio:
Dr. Tsai received her Ph.D. from the Electrical Engineering and Computer Science department at the Massachusetts Institute of Technology in 2011 and joined IBM as a research staff member. Sidney currently works at the Almaden Research Center in San Jose, CA, applying PCM-based devices to neuromorphic computing. By leveraging the training capability and error tolerance of deep neural networks (DNNs), matrix-vector multiplication and network weight-update operations can be performed in constant time at low power in memory crossbar arrays. The group has demonstrated software-equivalent accuracies for a variety of classic DNN models and datasets, for both training and inference. Before joining the neuromorphic computing group, Sidney worked at the IBM T.J. Watson Research Center in Yorktown Heights, NY, where she developed next-generation lithography for circuit applications with directed self-assembly (DSA) and managed the Advanced Lithography group in the Microelectronics Research Laboratory.