FAST '08 - Abstract
Pp. 223–238 of the Proceedings
Awarded Best Student Paper!
An Analysis of Data Corruption in the Storage Stack
Lakshmi N. Bairavasundaram, University of Wisconsin, Madison; Garth
Goodson, Network Appliance Inc.; Bianca Schroeder, University of Toronto; Andrea C. Arpaci-Dusseau and
Remzi H. Arpaci-Dusseau, University of Wisconsin, Madison
Abstract
An important threat to reliable storage of data is silent
data corruption. In order to develop suitable protection mechanisms
against data corruption, it is essential to understand its characteristics.
In this paper, we present the first large-scale study of
data corruption. We analyze corruption instances recorded in
production storage systems containing a total of 1.53 million disk drives,
over a period of 41 months. We study three
classes of corruption: checksum mismatches, identity discrepancies, and parity inconsistencies.
We focus on checksum mismatches because they are the most common.
We find more than 400,000 instances of
checksum mismatches over the 41-month period. We find many interesting trends among
these instances including:
(i) nearline disks (and their adapters) develop checksum mismatches an order of magnitude
more often than enterprise-class disk drives,
(ii) checksum mismatches within the same disk are not independent events and
they show high spatial and temporal locality, and
(iii) checksum mismatches across different disks in the same storage system are not
independent.
We use our observations to derive lessons for corruption-proof
system design.
The Proceedings are published as a collective work, © 2008 by the USENIX Association. All Rights Reserved. Rights to individual papers remain with the author or the author's employer. Permission is granted for the noncommercial reproduction of the complete work for educational or research purposes. USENIX acknowledges all trademarks within this paper.