Session: Deep Learning
Authors
Event Type: Paper

Accelerators
Applications
Deep Learning
Machine Learning
Time: Tuesday, November 14th, 11am - 11:30am
Location: 402-403-404
Description: Specialized hardware accelerators have been proposed to
accelerate the execution of DNN algorithms for
high performance and energy efficiency. Recently, they
have been deployed in datacenters (potentially for
business-critical or industrial applications) and
safety-critical systems such as self-driving cars. Soft
errors caused by high-energy particles have been
increasing in hardware systems, and these can lead to
catastrophic failures in DNN systems. Traditional
methods for building resilient systems, e.g., Triple
Modular Redundancy (TMR), are agnostic to DNNs. Hence,
these approaches incur high overheads, which makes them
challenging to deploy. In this paper, we experimentally
evaluate the resilience characteristics of DNN systems
(i.e., DNN software running on specialized
accelerators). We find that the error resilience of a
DNN system depends on the data types, values, data
reuse, and the types of layers in the design. Based on
our observations, we propose two efficient protection
techniques for DNN systems.
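
As a rough illustration of why a soft error's impact can depend on data type and value, the sketch below flips a single bit in a 32-bit float and in a 16-bit integer, and shows the bitwise majority vote that underlies TMR. It is not the paper's fault-injection methodology or either of its proposed protection techniques; the function names (flip_bit_float32, flip_bit_int16, tmr_vote) are hypothetical and chosen only for this example.

```python
import struct


def flip_bit_float32(value: float, bit: int) -> float:
    """Flip one bit (0 = least significant) of value's IEEE 754 float32 encoding."""
    if not 0 <= bit < 32:
        raise ValueError("bit must be in [0, 31]")
    # Reinterpret the float32 bytes as an unsigned 32-bit integer.
    (raw,) = struct.unpack("<I", struct.pack("<f", value))
    # Flip the chosen bit and reinterpret the result as a float32.
    (flipped,) = struct.unpack("<f", struct.pack("<I", raw ^ (1 << bit)))
    return flipped


def flip_bit_int16(value: int, bit: int) -> int:
    """Flip one bit of a 16-bit two's-complement integer."""
    if not 0 <= bit < 16:
        raise ValueError("bit must be in [0, 15]")
    raw = (value & 0xFFFF) ^ (1 << bit)
    # Convert back from the unsigned pattern to a signed value.
    return raw - 0x10000 if raw & 0x8000 else raw


def tmr_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority vote over three redundant copies (the core of TMR)."""
    return (a & b) | (a & c) | (b & c)


if __name__ == "__main__":
    weight = 0.5  # hypothetical stored value, e.g. a DNN weight
    for bit in (0, 22, 30, 31):
        print(f"float32 bit {bit:2d}: {weight} -> {flip_bit_float32(weight, bit)!r}")
    for bit in (0, 14, 15):
        print(f"int16   bit {bit:2d}: 64 -> {flip_bit_int16(64, bit)}")
    # One corrupted copy out of three is masked by the vote.
    print(tmr_vote(0b1010, 0b1010, 0b0010) == 0b1010)
```

In this sketch, a flip in a low mantissa bit of the float barely changes the value, while a flip in the top exponent bit changes it by many orders of magnitude; the 16-bit integer has a much narrower range of possible deviations. The majority vote masks any single corrupted copy but requires three copies of every protected value, which is the kind of overhead the description refers to.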