Description

General-purpose digital computing systems have benefited from favorable scaling for decades, but are now hitting a wall in energy efficiency. There is consequently growing interest in more efficient computing systems that may be specialized in function but offer lower energy consumption and higher performance in certain demanding applications, such as artificial neural networks, that make the tradeoff worthwhile. In particular, several highly efficient architectures with mixed analog implementations utilizing emerging nonvolatile technologies such as memristors have recently been proposed to accelerate neural network computations.
In this talk, I will present our work implementing a prototype hardware accelerator, the dot product engine (DPE), for vector-matrix multiplication (VMM). VMM is a bottleneck for many applications, particularly in neural networks, and can be performed in the analog domain using Ohm’s law for multiplication and Kirchhoff’s current law for summation. The DPE performs VMM in a single step by applying a vector of voltages to the rows and reading the currents on the columns of a memristor crossbar, which stores real-valued matrix elements as device conductances. We have demonstrated high-precision analog tuning and control of memristor cells across a 128×64 crossbar array and evaluated the resulting VMM accuracy in our DPE prototype. We also performed single-layer neural network inference in the DPE on the 10,000 MNIST handwritten digit test patterns and compared its performance to a digital approach. Finally, I will discuss the forecasted computational efficiency of scaled and integrated DPEs on chip (>100 TOPS/W), other computational applications of the DPE, and our work on generalized accelerator architectures based on the DPE for broader algorithms and more complex functions.
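The crossbar VMM principle described above can be illustrated with a minimal numerical sketch. This is not the DPE implementation; it is a hypothetical simulation, with made-up conductance and voltage ranges, of the physics the abstract names: each cell contributes a current by Ohm’s law (I = G·V), and each column wire sums those currents by Kirchhoff’s current law, so the column-current readout is a one-step vector-matrix multiply.

```python
import numpy as np

# Hypothetical sketch of analog VMM on a memristor crossbar.
# A matrix is stored as cell conductances G (siemens); applying a
# voltage vector V to the rows yields column currents I = G^T V:
# Ohm's law multiplies per cell, Kirchhoff's law sums per column.

rng = np.random.default_rng(0)

rows, cols = 128, 64                        # crossbar size from the talk
G = rng.uniform(1e-6, 1e-4, (rows, cols))   # assumed conductance range (S)
V = rng.uniform(0.0, 0.2, rows)             # assumed read-voltage range (V)

# One-step VMM: every column current is the Kirchhoff sum of its
# cells' Ohm's-law currents, computed "in parallel" by the physics.
I = G.T @ V

print(I.shape)  # one current per column → (64,)
```

In a real device the same readout also carries analog nonidealities (wire resistance, device variability, read noise), which is why the abstract emphasizes evaluating VMM accuracy after tuning.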