SC17 Denver, CO

P55: Incorporating Proactive Data Rescue into ZFS Disk Recovery for Enhanced Storage Reliability

Authors: Zhi Qiao (University of North Texas), Song Fu (University of North Texas), Hsing-bung Chen (Los Alamos National Laboratory), Michael Lang (Los Alamos National Laboratory)

Abstract: As tremendous amounts of data are generated every day, storage systems store exabytes of data on hundreds of thousands of disk drives. At such a scale, disk failures become the norm, and data recovery takes longer as disk capacity increases. ZFS is a widely used filesystem that provides data recovery from corruption. Many factors may affect ZFS's recovery performance in a production environment. Additionally, disk failure prediction techniques enable ZFS to proactively rescue data prior to disk failures. In this poster, we extensively evaluate recovery performance with a variety of ZFS configurations. We also compare the performance of different data rescue strategies, including post-failure disk recovery, proactive disk cloning, and proactive data recovery. Our proposed analytic model uses the collected zpool utilization data and system configuration to derive the optimal data rescue strategy for the storage array in its current state.
Award: Best Poster Finalist (BP): no

Poster: pdf
Two-page extended abstract: pdf
