Session: Invited Talks 5
Event Type: Invited Talk
Time: Thursday, November 16th, 11:15am - 12pm
Location: Mile High Ballroom
Description: General-purpose digital systems for computing have benefited from favorable scaling for decades but are now hitting a wall in energy efficiency. There is consequently growing interest in more efficient computing systems that may be specialized in function but are worthwhile for the lower energy consumption and higher performance they deliver in certain demanding applications such as artificial neural networks. In particular, several highly efficient architectures with mixed analog implementations utilizing emerging nonvolatile technologies such as memristors have recently been proposed to accelerate neural network computations.
In this talk, I will present our work implementing a prototype hardware accelerator, the dot product engine (DPE), for vector-matrix multiplication (VMM). VMM is a bottleneck for many applications, particularly in neural networks, and can be performed in the analog domain using Ohm's law for multiplication and Kirchhoff's current law for summation. The DPE performs VMM in a single step by applying a vector of voltages to the rows and reading the currents on the columns of a memristor crossbar, which stores real-valued matrix elements as device conductances. We have demonstrated high-precision analog tuning and control of memristor cells across a 128x64 crossbar array and evaluated the resulting VMM accuracy in our DPE prototype. We also performed single-layer neural network inference in the DPE on the 10k MNIST handwritten digit test patterns and compared its performance to a digital approach. Finally, I will discuss the forecasted computational efficiency of scaled and integrated DPEs on chip (>100 TOPS/W), other computational applications of the DPE, and our work on generalized DPE-based accelerator architectures for broader algorithms and more complex functions.
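As a rough illustration (not the authors' hardware model or measured device data), the analog VMM the abstract describes can be sketched numerically: matrix values are stored as conductances, row voltages encode the input vector, Ohm's law gives each cell's current, and Kirchhoff's current law sums those currents on each column wire. The conductance and voltage ranges below are hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

# Crossbar dimensions from the talk (128x64). Conductance range and read
# voltage are illustrative assumptions, not values from the prototype.
rows, cols = 128, 64
G = rng.uniform(1e-6, 1e-4, size=(rows, cols))  # cell conductances (siemens)
v = rng.uniform(0.0, 0.2, size=rows)            # read voltages applied to rows

# Analog VMM in "one step": Ohm's law (I = V * G per cell) plus
# Kirchhoff's current law (currents sum on each column wire) yield the
# column current vector, which equals the vector-matrix product v @ G.
i_columns = v @ G  # shape (64,)

# Sanity check against the explicit physical sum the crossbar performs.
expected = np.array(
    [sum(v[r] * G[r, c] for r in range(rows)) for c in range(cols)]
)
assert np.allclose(i_columns, expected)
```

In this idealization the multiply-accumulate happens in the physics of the array rather than in sequential digital arithmetic, which is the source of the efficiency the talk forecasts; a real device additionally contends with wire resistance, device nonlinearity, and conductance tuning error, which is why the abstract emphasizes measured VMM accuracy.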