This article is about the data storage technology. For other uses, see Raid
(disambiguation).
RAID is now used as an umbrella term for computer data storage schemes that can divide
and replicate data among multiple disk drives. The schemes or architectures are named by
the word RAID followed by a number (e.g., RAID 0, RAID 1). The various designs of
RAID systems involve two key goals: increased data reliability and increased input/output
performance. When multiple physical disks are set up to use RAID technology, they are
said to be in a RAID array.[3] This array distributes data across multiple disks, but the
array is addressed by the operating system as one single disk. RAID can be set up to
serve several different purposes.
Contents
• 1 Standard levels
• 2 Nested (hybrid) RAID
• 3 RAID Parity
• 4 RAID 10 versus RAID 5 in Relational Databases
• 5 New RAID classification
• 6 Non-standard levels
• 7 Data backup
• 8 Implementations
o 8.1 Software-based RAID
o 8.2 Hardware-based RAID
o 8.3 Firmware/driver-based RAID
o 8.4 Network-attached storage
o 8.5 Hot spares
• 9 Reliability terms
• 10 Problems with RAID
o 10.1 Correlated failures
o 10.2 Atomicity
o 10.3 Write cache reliability
o 10.4 Equipment compatibility
o 10.5 Data recovery in the event of a failed array
o 10.6 Drive error recovery algorithms
o 10.7 Increasing recovery time
o 10.8 Operator skills, correct operation
o 10.9 Other problems and viruses
• 11 History
• 12 Vinum
• 13 Software RAID vs. Hardware RAID
• 14 Non-RAID drive architectures
• 15 See also
• 16 References
• 17 Further reading
• 18 External links
Standard levels

A number of standard schemes have evolved, which are referred to as levels. Five RAID
levels were originally conceived, but many more variations have evolved, notably several
nested levels and many non-standard levels (mostly proprietary).

The following table provides an overview of the most important parameters of the most
commonly used standard RAID levels.[4] Space efficiency is given as an expression in
terms of the number of drives, n, which yields a value between 0 and 1, representing the
fraction of the sum of the drives' capacities that is available for use. For example, if three
drives are arranged in RAID 3, this gives a space efficiency of 1 − 1/3 ≈ 0.67. If their
individual capacities are 250 GB each, for a total of 750 GB over the three, the usable
capacity under RAID 3 for data storage is 500 GB.
RAID 0: Block-level striping without parity or mirroring. Minimum disks: 2.
Space efficiency: 1. Fault tolerance: 0 (none). Read performance: nX. Write
performance: nX.

RAID 1: Mirroring without parity or striping. Minimum disks: 2. Space
efficiency: 1/n. Fault tolerance: n−1 disks. Read performance: nX. Write
performance: 1X.

RAID 2: Bit-level striping with dedicated Hamming-code parity. Minimum disks: 3.
Space efficiency: 1 − (1/n)·log2(n−1). Fault tolerance: 1 disk, when the corrupt
disk is found by the recovery code.

RAID 3: Byte-level striping with dedicated parity. Minimum disks: 3. Space
efficiency: 1 − 1/n. Fault tolerance: 1 disk.

RAID 4: Block-level striping with dedicated parity. Minimum disks: 3. Space
efficiency: 1 − 1/n. Fault tolerance: 1 disk.

RAID 5: Block-level striping with distributed parity. Minimum disks: 3. Space
efficiency: 1 − 1/n. Fault tolerance: 1 disk. Read performance: (n−1)X. Write
performance: variable.

RAID 6: Block-level striping with double distributed parity. Minimum disks: 4.
Space efficiency: 1 − 2/n. Fault tolerance: 2 disks.
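The space-efficiency expressions in the table above can be sanity-checked with a short
Python sketch. This is illustrative only; the helper function is hypothetical, and the
three-drive, 250 GB RAID 3 example is the one from the text.

    # A minimal sketch of the space-efficiency formulas from the table above.
    # The helper function is hypothetical; n is the number of drives.
    import math

    def space_efficiency(level, n):
        """Fraction of the drives' summed capacity that is usable for data."""
        if level == 0:
            return 1.0                                  # striping only
        if level == 1:
            return 1.0 / n                              # n-way mirror
        if level == 2:
            return 1 - (1.0 / n) * math.log2(n - 1)     # Hamming-code parity
        if level in (3, 4, 5):
            return 1 - 1.0 / n                          # one parity drive's worth
        if level == 6:
            return 1 - 2.0 / n                          # two parity drives' worth
        raise ValueError("unknown RAID level")

    # The example from the text: three 250 GB drives in RAID 3 -> 500 GB usable.
    n, drive_gb = 3, 250
    print(round(space_efficiency(3, n) * n * drive_gb))  # 500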
Nested (hybrid) RAID

In what was originally termed hybrid RAID,[5] many storage controllers allow RAID
levels to be nested. The elements of a RAID may be either individual disks or RAIDs
themselves. Nesting more than two levels deep is unusual.
As there is no basic RAID level numbered larger than 9, nested RAIDs are usually
unambiguously described by attaching the numbers indicating the RAID levels,
sometimes with a "+" in between. The order of the digits in a nested RAID designation is
the order in which the nested array is built: for RAID 1+0 first pairs of drives are
combined into two or more RAID 1 arrays (mirrors), and then the resulting RAID 1
arrays are combined into a RAID 0 array (stripes). It is also possible to combine stripes
into mirrors (RAID 0+1). The final step is known as the top array. When the top array is a
RAID 0 (such as in RAID 10 and RAID 50) most vendors omit the "+", though RAID
5+0 is clearer.
• RAID 0+1: striped sets in a mirrored set (minimum four disks; even number of
disks) provides fault tolerance and improved performance but increases
complexity.
The key difference from RAID 1+0 is that RAID 0+1 creates a second striped set
to mirror a primary striped set. The array continues to operate with one or more
drives failed in the same mirror set, but if drives fail on both sides of the mirror
the data on the RAID system is lost.
• RAID 1+0: mirrored sets in a striped set (minimum two disks but more commonly
four disks to take advantage of speed benefits; even number of disks) provides
fault tolerance and improved performance but increases complexity.
The key difference from RAID 0+1 is that RAID 1+0 creates a striped set from a
series of mirrored drives. In a failed disk situation, RAID 1+0 performs better
because all the remaining disks continue to be used. The array can sustain
multiple drive losses so long as no mirror loses all its drives.
• RAID 5+1: mirrored striped set with distributed parity (some manufacturers label
this as RAID 53).
Whether an array runs as RAID 0+1 or RAID 1+0 in practice is often determined by the
evolution of the storage system. A RAID controller might support upgrading a RAID 1
array to a RAID 1+0 array on the fly, but require a lengthy offline rebuild to upgrade
from RAID 1 to RAID 0+1. With nested arrays, sometimes the path of least disruption
prevails over achieving the preferred configuration.
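As an illustration of the construction order described above, the following Python
sketch groups a flat list of drives into a RAID 1+0 layout: pairs are first combined into
RAID 1 mirrors, and the mirrors are then striped by the RAID 0 top array. The drive names
and the helper function are hypothetical, and the sketch assumes at least four drives.

    # Sketch of the RAID 1+0 construction order: mirror pairs first, then a
    # RAID 0 stripe across the mirrors. Drive names are hypothetical.

    def raid10_layout(drives):
        if len(drives) < 4 or len(drives) % 2 != 0:
            raise ValueError("this sketch assumes an even number of drives, at least four")
        mirrors = [drives[i:i + 2] for i in range(0, len(drives), 2)]  # RAID 1 sets
        return {"top array": "RAID 0", "members": mirrors}

    print(raid10_layout(["disk0", "disk1", "disk2", "disk3"]))
    # {'top array': 'RAID 0', 'members': [['disk0', 'disk1'], ['disk2', 'disk3']]}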
RAID Parity

RAID parity is based on a Boolean logic operation called "exclusive or", or, in
shorthand, "XOR", meaning "one or the other, but not neither nor both." For example:
0 XOR 0 = 0
0 XOR 1 = 1
1 XOR 0 = 1
1 XOR 1 = 0
The XOR operator is central to how parity data is created and used within an array; it is
used both to protect data and to recover missing data.
Let's suppose, for the sake of simplicity, that we have a simple RAID made up of six hard
disks (four for data, one for parity, and one hot spare), where each drive is capable of
holding just a single byte worth of storage. Initially, no data has been written to the
array.
Now, let's write some random bits to each of our four data drives. Every time we write
anything to the data drives, we need to calculate parity to ensure we can recover if we
have a disk failure. To calculate the parity for this RAID, we simply take the XOR of each
drive's byte; the resulting value is our parity data.
We now know that 11100110 is our parity data, and we write it to our dedicated
parity drive.
Now, let's suppose one of those drives has failed; for this example, say Drive #3. In
order to recover the contents of Drive #3, we perform the same XOR calculation across
all the remaining drives, substituting our parity value (11100110) in place of the
missing drive:
    00101010 (Drive 1 byte)
XOR 10001110 (Drive 2 byte)
XOR 11100110 (parity byte, in place of the failed Drive 3)
XOR 10110101 (Drive 4 byte)
  = 11110111 (recovered Drive 3 byte)
With the complete contents of Drive #3 now successfully recovered, the data is written to
the hot spare, and the RAID can continue operating as it had before.
Normally, the dead drive will then be replaced with a working one of the same size. When
this happens, the hot spare's contents are automatically copied to the new drive by the
array controller, allowing the hot spare to return to its original purpose as an
emergency standby drive. The resulting array is identical to its pre-failure state.
This same basic XOR principle applies to parity within RAID groups regardless of
capacity or number of drives. As long as there are enough drives present to allow for an
XOR calculation to take place, parity can be used to recover data from any single drive
failure. (A minimum of three drives must be present in order for parity to be used for
fault tolerance, since the XOR operator requires two operands, and a place to store the
result.)
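The XOR arithmetic walked through above can be expressed in a few lines of Python. This
is a minimal sketch, not a real RAID implementation: the byte values are the ones from the
example, with Drive 3's byte being the value implied by the quoted parity, 11100110.

    from functools import reduce

    # Data bytes for Drives 1-4 from the example above; Drive 3's byte is the
    # value implied by the parity byte 11100110 quoted in the text.
    data = [0b00101010, 0b10001110, 0b11110111, 0b10110101]

    parity = reduce(lambda a, b: a ^ b, data)           # XOR of all data bytes
    print(format(parity, "08b"))                        # 11100110

    # Simulate the failure of Drive 3 (index 2) and rebuild it from the
    # surviving drives plus the parity byte.
    lost = 2
    survivors = [b for i, b in enumerate(data) if i != lost]
    rebuilt = reduce(lambda a, b: a ^ b, survivors, parity)
    print(format(rebuilt, "08b"))                       # 11110111, Drive 3's byte
    assert rebuilt == data[lost]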
RAID 10 versus RAID 5 in Relational Databases

In the vast majority of enterprise-level SAN hardware, any writes which are generated by
the host are simply acknowledged immediately, and destaged to disk on the back end
when the controller sees fit to do so. From the host's perspective, an individual write to a
RAID 10 volume is no faster than an individual write to a RAID 5 volume; a difference
between the two only becomes apparent when write cache at the SAN controller level is
overwhelmed, and the SAN appliance must reject or gate further write requests in order
to allow write buffers on the controller to destage to disk. While rare, this generally
indicates poor performance management on behalf of the SAN administrator, not a
shortcoming of RAID 5 or RAID 10. SAN appliances generally service multiple hosts
which compete both for controller cache and spindle time with one another. This
contention is largely masked, in that the controller is generally intelligent and adaptive
enough to maximize read cache hit ratios while also maximizing the process of destaging
data from write cache.
The choice of RAID 10 versus RAID 5 for the purposes of housing a relational database
will depend upon a number of factors (spindle availability, cost, business risk, etc.) but,
from a performance standpoint, it depends mostly on the type of I/O that database can
expect to see. For databases that are expected to be exclusively or strongly read-biased,
RAID 10 is often chosen in that it offers a slight speed improvement over RAID 5 on
sustained reads. If a database is expected to be strongly write-biased, RAID 5 becomes
the more attractive option, since RAID 5 doesn't suffer from the same write handicap
inherent in RAID 10; all spindles in a RAID 5 can be utilized to write simultaneously,
whereas only half the members of a RAID 10 can be used.[4] However, for reasons
similar to what has eliminated the "write penalty" in RAID 5, the reduced ability of a
RAID 10 to handle sustained writes has been largely masked by improvements in
controller cache efficiency and disk throughput.
What causes RAID 5 to be slightly slower than RAID 10 on sustained reads is the fact
that RAID 5 has parity data interleaved within normal data. For every read pass in RAID
5, there is a probability that a read head may need to traverse a region of parity data. The
cumulative effect of this is a slight performance drop compared to RAID 10, which does
not use parity, and therefore will never encounter a circumstance where data underneath a
head is of no use. For the vast majority of situations, however, most relational databases
housed on RAID 10 perform equally well in RAID 5. The strengths and weaknesses of
each type only become an issue in atypical deployments, or deployments on
overcommitted or outdated hardware.[5]
There are, however, other considerations which must be taken into account other than
simply those regarding performance. RAID 5 and other non-mirror-based arrays offer a
lower degree of resiliency than RAID 10 by virtue of RAID 10's mirroring strategy. In a
RAID 10, I/O can continue even in the face of multiple drive failures. By comparison, in a
RAID 5 array, any simultaneous failure involving more than one drive renders the
array unusable, because parity recalculation is impossible. For
many, particularly in mission-critical environments with enough capital to spend, RAID
10 becomes the favorite as it provides the lowest level of risk.[6]
Again, modern SAN design largely masks any performance hit while the RAID array is
in a degraded state, by virtue of selectively being able to perform rebuild operations both
in-band or out-of-band with respect to existing I/O traffic. Given the rare nature of drive
failures in general, and the exceedingly low probability of multiple concurrent drive
failures occurring within the same RAID array, the choice of RAID 5 over RAID 10
often comes down to the preference of the storage administrator, particularly when
weighed against other factors such as cost, throughput requirements, and physical spindle
availability.[8]
The original "Berkeley" RAID classifications are still kept as an important historical
reference point and also to recognize that RAID Levels 0-6 successfully define all known
data mapping and protection schemes for disk. Unfortunately, the original classification
caused some confusion due to assumption that higher RAID levels imply higher
redundancy and performance. This confusion was exploited by RAID system
manufacturers, and gave birth to the products with such names as RAID-7, RAID-10,
RAID-30, RAID-S, etc. The new system describes the data availability characteristics of
the RAID system rather than the details of its implementation.
The following list provides criteria for all three classes of RAID:
1. Protection against data loss and loss of access to data due to disk drive failure
2. Reconstruction of failed drive content to a replacement drive
3. Protection against data loss due to a "write hole"
4. Protection against data loss due to host and host I/O bus failure
5. Protection against data loss due to replaceable unit failure
6. Replaceable unit monitoring and failure indication
16. Protection against loss of access to data due to host and host I/O bus failure
17. Protection against loss of access to data due to external power failure
18. Protection against loss of access to data due to component replacement
19. Protection against loss of data and loss of access to data due to multiple disk failure
20. Protection against loss of access to data due to zone failure
21. Long-distance protection against loss of data due to zone failure
Non-standard levels

Many configurations other than the basic numbered RAID levels are possible, and many
companies, organizations, and groups have created their own non-standard
configurations, in many cases designed to meet the specialised needs of a small niche
group. Most of these non-standard RAID levels are proprietary.
Data backup

RAID drives can serve as excellent backup drives when employed as removable backup
devices to main storage, and particularly when located offsite from the main systems.
However, the use of RAID as the only storage solution does not replace backups.
Implementations
The distribution of data across multiple drives can be managed either by dedicated
hardware or by software. When done in software, the RAID logic may be part of the
operating system, or it may be part of the firmware and drivers supplied with the
controller card.
Software-based RAID
• Apple's Mac OS X Server[11] and Mac OS X[12] support RAID 0, RAID 1 and
RAID 1+0.
• FreeBSD supports RAID 0, RAID 1, RAID 3, and RAID 5, and all layerings of the
above via GEOM modules[13][14] and ccd,[15] as well as supporting RAID 0, RAID
1, RAID-Z, and RAID-Z2 (similar to RAID 5 and RAID 6 respectively), plus
nested combinations of those via ZFS.
• Linux supports RAID 0, RAID 1, RAID 4, RAID 5, RAID 6 and all layerings of
the above, as well as "RAID10" (see above).[16][17] Certain
reshaping/resizing/expanding operations are also supported.[18]
• Microsoft's server operating systems support RAID 0, RAID 1, and RAID 5.
Some Microsoft desktop operating systems also support RAID; for example,
Windows XP Professional supports RAID 0, in addition to spanning multiple
disks, but only when using dynamic disks and volumes. Windows XP
supports RAID 0, 1, and 5 with a simple file patch.[19] RAID functionality in
Windows is slower than hardware RAID, but allows a RAID array to be moved to
another machine with no compatibility issues.
• NetBSD supports RAID 0, RAID 1, RAID 4 and RAID 5 (and any nested
combination of those like 1+0) via its software implementation, named
RAIDframe.
• OpenBSD aims to support RAID 0, RAID 1, RAID 4 and RAID 5 via its software
implementation softraid.
• Solaris ZFS supports ZFS equivalents of RAID 0, RAID 1, RAID 5 (RAID Z),
RAID 6 (RAID Z2), and a triple parity version RAID Z3, and any nested
combination of those like 1+0. Note that RAID Z/Z2/Z3 solve the RAID 5/6 write
hole problem and are therefore particularly suited to software implementation
without the need for battery backed cache (or similar) support. The boot
filesystem is limited to RAID 1.
• Solaris SVM supports RAID 1 for the boot filesystem, and adds RAID 0 and
RAID 5 support (and various nested combinations) for data drives.
• FlexRAID, for Linux and Windows, is a snapshot RAID implementation.
• HP's OpenVMS provides a form of RAID 1 called "volume shadowing", which can
mirror data locally and to remote cluster systems.
Software RAID has advantages and disadvantages compared to hardware RAID. The
software must run on a host server attached to the storage, and the server's processor must
dedicate processing time to run the RAID software. The additional processing capacity
required for RAID 0 and RAID 1 is low, but parity-based arrays require more complex
data processing during write or integrity-checking operations. As the rate of data
processing increases with the number of disks in the array, so does the processing
requirement. Furthermore, all the buses between the processor and the disk controller
must carry the extra data required by RAID, which may cause congestion.
Over the history of hard disk drives, the increase in speed of commodity CPUs has been
consistently greater than the increase in speed of hard disk drive throughput.[20] Thus,
over time, the percentage of host CPU time required to saturate a given number of hard
disk drives has been dropping. For example, the Linux software md RAID subsystem is
capable of calculating parity information at 6 GB/s (100% usage of a single core on a
2.1 GHz Intel "Core 2" CPU as of Linux v2.6.26). A three-drive RAID 5 array using hard
disks capable of sustaining a write of 100 MB/s will require parity to be calculated at the
rate of 200 MB/s, which requires the resources of just over 3% of a single CPU core
during write operations (parity does not need to be calculated for read operations on a
RAID 5 array unless a drive has failed).
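The roughly 3% figure above follows from simple arithmetic, reproduced here as a Python
sketch using the rates quoted in the paragraph; the variable names are illustrative.

    # Reproducing the estimate above: parity must be computed over ~200 MB/s of
    # write traffic, while one core can compute parity at ~6 GB/s.
    parity_input_mb_s = 200          # from the three-drive RAID 5 example above
    parity_rate_mb_s = 6 * 1000      # ~6 GB/s on one core (Linux md figure above)

    cpu_fraction = parity_input_mb_s / parity_rate_mb_s
    print(f"{cpu_fraction:.1%} of one core")   # ~3.3%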
Another concern with operating system-based RAID is the boot process. It can be
difficult or impossible to set up the boot process such that it can fall back to another drive
if the usual boot drive fails. Such systems can require manual intervention to make the
machine bootable again after a failure. There are exceptions to this, such as the LILO
bootloader for Linux, the loader for FreeBSD,[21] and some configurations of the GRUB
bootloader, which natively understand RAID 1 and can load a kernel. If the BIOS recognizes a
broken first disk and refers bootstrapping to the next disk, such a system will come up
without intervention, but the BIOS might or might not do that as intended. A hardware
RAID controller typically has explicit programming to decide that a disk is broken and
fall through to the next disk.
Hardware RAID controllers can also carry battery-backed cache memory. For data
safety in modern systems, the user of software RAID might need to turn off the write-back
cache on the disks (though some drives have their own battery or capacitors protecting the
write-back cache, a UPS, or implement atomicity in other ways). Turning off the
write cache has a performance penalty that can be significant, depending on the workload
and how well command queuing in the disk system is supported. The battery-backed
cache on a RAID controller is one way to have a safe write-back cache.
Finally, operating system-based RAID usually uses formats specific to the operating
system in question, so it cannot generally be used for partitions that are shared between
operating systems as part of a multi-boot setup. However, it does allow RAID disks to be
moved from one computer to another computer with an operating system or file system of
the same type, which can be more difficult with hardware RAID. For example, when one
computer uses a hardware RAID controller from one manufacturer and another computer
uses a controller from a different manufacturer, drives typically cannot be interchanged;
and if the hardware controller dies before the disks do, data may become
unrecoverable unless a hardware controller of the same type is obtained, unlike with
firmware-based or software-based RAID.
Hardware-based RAID

Hardware RAID controllers use different, proprietary disk layouts, so it is not usually
possible to span controllers from different manufacturers. They do not require processor
resources, the BIOS can boot from them, and tighter integration with the device driver
may offer better error handling.
Most hardware implementations provide a read/write cache, which, depending on the I/O
workload, will improve performance. In most systems the write cache is non-volatile (i.e.
battery-protected), so pending writes are not lost on a power failure.
Hardware implementations also typically support hot swapping, allowing failed drives to
be replaced while the system is running.
However, inexpensive hardware RAID controllers can be slower than software RAID,
because the dedicated CPU on the controller card is not as fast as the CPU in the host
computer or server. More expensive RAID controllers have faster CPUs capable of higher
throughput and do not exhibit this slowdown.
Firmware/driver-based RAID

Operating system-based RAID doesn't always protect the boot process and is generally
impractical on desktop versions of Windows (as described above). Hardware RAID
controllers are expensive and proprietary. To fill this gap, cheap "RAID controllers" were
introduced that do not contain a RAID controller chip, but simply a standard disk
controller chip with special firmware and drivers. During early stage bootup the RAID is
implemented by the firmware; when a protected-mode operating system kernel such as
Linux or a modern version of Microsoft Windows is loaded the drivers take over.
Hot spares

Both hardware and software RAIDs with redundancy may support the use of hot spare
drives: drives physically installed in the array that remain inactive until an active drive
fails. The system then automatically replaces the failed drive with the spare and rebuilds
the array with the spare drive included. This reduces the mean time to recovery (MTTR),
though it doesn't eliminate it completely. Subsequent additional failure(s) in the same
RAID redundancy group before the array is fully rebuilt can result in loss of the data;
rebuilding can take several hours, especially on busy systems.
Rapid replacement of failed drives is important as the drives of an array will all have had
the same amount of use, and may tend to fail at about the same time rather than
randomly.[citation needed] RAID 6 without a spare uses the same number of drives as RAID 5
with a hot spare and protects data against simultaneous failure of up to two drives, but
requires a more advanced RAID controller. Further, a hot spare can be shared by multiple
RAID sets.
Correlated failures

The theory behind the error correction in RAID assumes that failures of drives are
independent. Given this assumption, it is possible to calculate how often drives can fail
and to arrange the array to make data loss arbitrarily improbable.
In practice, the drives are often the same age, with similar wear. Since many drive
failures are due to mechanical issues, which are more likely on older drives, this violates
those assumptions, and failures are in fact statistically correlated. In practice, then, the
chance of a second failure before the first has been recovered is not nearly as small as
might be supposed, and data loss can, in practice, occur at significant rates.[23]
Atomicity
This is a little-understood and rarely mentioned failure mode for redundant storage
systems that do not utilize transactional features. Database researcher Jim Gray wrote
"Update in Place is a Poison Apple"[26] during the early days of relational database
commercialization. However, this warning largely went unheeded and fell by the wayside
upon the advent of RAID, which many software engineers mistook as solving all data
storage integrity and reliability problems. Many software programs update a storage
object "in-place"; that is, they write a new version of the object on to the same disk
addresses as the old version of the object. While the software may also log some delta
information elsewhere, it expects the storage to present "atomic write semantics,"
meaning that the write of the data either occurred in its entirety or did not occur at all.
However, very few storage systems provide support for atomic writes, and even fewer
specify their rate of failure in providing this semantic. Note that during the act of writing
an object, a RAID storage device will usually be writing all redundant copies of the
object in parallel, although overlapped or staggered writes are more common when a
single RAID processor is responsible for multiple drives. Hence an error that occurs
during the process of writing may leave the redundant copies in different states, and
furthermore may leave the copies in neither the old nor the new state. The little known
failure mode is that delta logging relies on the original data being either in the old or the
new state so as to enable backing out the logical change, yet few storage systems provide
an atomic write semantic on a RAID disk.
While the battery-backed write cache may partially solve the problem, it is applicable
only to a power failure scenario.
Since transactional support is not universally present in hardware RAID, many operating
systems include transactional support to protect against data loss during an interrupted
write. Novell Netware, starting with version 3.x, included a transaction tracking system.
Microsoft introduced transaction tracking via the journaling feature in NTFS. Ext4 has
journaling with checksums; ext3 has journaling without checksums but with an "append-only"
option, or ext3COW (copy on write). If the journal itself in a filesystem is corrupted,
however, this can be problematic. The journaling in the NetApp WAFL file system gives
atomicity by never updating the data in place, as does ZFS. An alternative method to
journaling is soft updates, which are used in some BSD-derived systems' implementation
of UFS.
An unrecoverable bit error (UBE) can present as a sector read failure. Some RAID
implementations protect against this failure mode by remapping the bad sector, using the
redundant data to retrieve a good copy of the data, and rewriting that good data to the
newly mapped replacement sector. The UBE rate is typically specified as 1 bit in 10^15 for
enterprise-class disk drives (SCSI, FC, SAS) and 1 bit in 10^14 for desktop-class disk
drives (IDE/ATA/PATA, SATA). Increasing disk capacities and large RAID 5
redundancy groups have led to an increasing inability to successfully rebuild a RAID
group after a disk failure because an unrecoverable sector is found on the remaining
drives. Double protection schemes such as RAID 6 are attempting to address this issue,
but suffer from a very high write penalty.
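As a rough, back-of-the-envelope illustration of why this matters for large arrays, the
Python sketch below estimates the chance of hitting at least one unrecoverable error while
reading the surviving drives during a RAID 5 rebuild. The error rate is the desktop-class
figure quoted above; the drive size and group size are hypothetical, and the independence
assumption is optimistic.

    # Back-of-the-envelope: probability of at least one unrecoverable read error
    # (URE) while reading the surviving drives during a RAID 5 rebuild.
    # Assumes independent bit errors, which is optimistic.
    ure_per_bit = 1e-14            # desktop-class rate quoted above (1 bit in 10^14)
    drive_tb = 2                   # hypothetical drive capacity
    surviving_drives = 6           # hypothetical group size after one failure

    bits_read = surviving_drives * drive_tb * 1e12 * 8
    p_no_ure = (1 - ure_per_bit) ** bits_read
    print(f"P(at least one URE) ~ {1 - p_no_ure:.0%}")   # roughly 60%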
Write cache reliability

The disk system can acknowledge the write operation as soon as the data is in the cache,
without waiting for the data to be physically written. This typically occurs in old, non-
journaled systems such as FAT32, or if the Linux/Unix "writeback" option is chosen
without any protections like the "soft updates" option (to promote I/O speed whilst
trading-away data reliability). A power outage or system hang such as a BSOD can mean
a significant loss of any data queued in such a cache.
Often a battery protects the write cache, mostly solving the problem. If a write fails
because of a power failure, the controller may complete the pending writes as soon as it is
restarted. This solution still has potential failure cases: the battery may have worn out, the
power may be off for too long, the disks could be moved to another controller, the
controller itself could fail. Some disk systems provide the capability of testing the battery
periodically; however, this leaves the system without a fully charged battery for several
hours.
An additional concern about write cache reliability exists, specifically regarding devices
equipped with a write-back cache—a caching system which reports the data as written as
soon as it is written to cache, as opposed to the non-volatile medium.[27] The safer cache
technique is write-through, which reports transactions as written when they are written to
the non-volatile medium.
Data recovery in the event of a failed array

With larger disk capacities the odds of a disk failure during rebuild are not negligible. In
that event the difficulty of extracting data from a failed array must be considered. Only
RAID 1 stores all data on each disk. Although it may depend on the controller, some
RAID 1 disks can be read as a single conventional disk. This means a dropped RAID 1
disk, although damaged, can often be reasonably easily recovered using a software
recovery program. If the damage is more severe, data can often be recovered by
professional data recovery specialists. RAID 5 and other striped or distributed arrays
present much more formidable obstacles to data recovery in the event the array fails.
Drive error recovery algorithms

Many modern drives have internal error recovery algorithms that can take upwards of a
minute to recover and re-map data that the drive fails to easily read. Many RAID
controllers will drop a non-responsive drive in 8 seconds or so. This can cause the array
to drop a good drive because it has not been given enough time to complete its internal
error recovery procedure, leaving the rest of the array vulnerable. So-called enterprise
class drives limit the error recovery time and prevent this problem, but desktop drives can
be quite risky for this reason. A fix specific to Western Digital drives used to be known: a
utility called WDTLER.exe could limit the error recovery time of a Western Digital
desktop drive so that it would not be dropped from the array for this reason. The utility
enabled TLER (time limited error recovery) which limits the error recovery time to 7
seconds. As of October 2009 Western Digital has locked out this feature in their desktop
drives such as the Caviar Black.[28] Western Digital enterprise class drives are shipped
from the factory with TLER enabled to prevent being dropped from RAID arrays. Similar
technologies are used by Seagate, Samsung, and Hitachi.
As of late 2010, support for ATA Error Recovery Control configuration has been added
to the Smartmontools program, so it now allows configuring many desktop class hard
drives for use on a RAID controller.[28]
Increasing recovery time

Drive capacity has grown at a much faster rate than transfer speed, and error rates have
only fallen a little in comparison. Therefore, larger-capacity drives may take hours, if not
days, to rebuild. The rebuild takes even longer if the array remains in operation at
reduced capacity while rebuilding.[29] Given a RAID array with only one disk of
redundancy (RAID 3, 4, or 5), a second failure would cause complete failure of the array.
Even though individual drives' mean time between failures (MTBF) has increased over
time, this increase has not kept pace with the increased storage capacity of the drives. The
time to rebuild the array after a single disk failure, as well as the chance of a second
failure during a rebuild, have increased over time.[30]
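For a sense of scale, the Python sketch below estimates a lower bound on rebuild time from
drive capacity and a sustained rebuild rate. Both figures are hypothetical, and a busy
array will be slower still.

    # Lower-bound rebuild-time estimate: the whole replacement drive must be
    # written at some sustained rate; foreground I/O only makes this worse.
    drive_tb = 2            # hypothetical replacement-drive capacity
    rebuild_mb_s = 100      # hypothetical sustained rebuild rate

    hours = drive_tb * 1e6 / rebuild_mb_s / 3600
    print(f"~{hours:.1f} hours at best")    # ~5.6 hours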
Operator skills, correct operation

In order to provide the desired protection against physical drive failure, a RAID array
must be properly set up and maintained by an operator with sufficient knowledge of the
chosen RAID configuration, array controller (hardware or software), failure detection and
recovery. Unskilled handling of the array at any stage may exacerbate the consequences
of a failure, and result in downtime and full or partial loss of data that might otherwise be
recoverable.
Particularly, the array must be monitored, and any failures detected and dealt with
promptly. Failure to do so will result in the array continuing to run in a degraded state,
vulnerable to further failures. Ultimately more failures may occur, until the entire array
becomes inoperable, resulting in data loss and downtime. In this case, any protection the
array may provide merely delays the data loss.
The operator must know how to detect failures or verify healthy state of the array,
identify which drive failed, have replacement drives available, and know how to replace a
drive and initiate a rebuild of the array.
Other problems and viruses

While RAID may protect against physical drive failure, the data is still exposed to
operator, software, hardware and virus destruction. Many studies[31] cite operator fault as
the most common source of malfunction, such as a server operator replacing the incorrect
disk in a faulty RAID array, and disabling the system (even temporarily) in the process.
[32]
Most well-designed systems include separate backup systems that hold copies of the
data, but don't allow much interaction with it. Most copy the data and remove the copy
from the computer for safe storage.
History
Norman Ken Ouchi at IBM was awarded a 1978 U.S. patent 4,092,732[33] titled "System
for recovering data stored in failed memory unit." The claims for this patent describe
what would later be termed RAID 5 with full stripe writes. This 1978 patent also
mentions that disk mirroring or duplexing (what would later be termed RAID 1) and
protection with dedicated parity (what would later be termed RAID 4) were prior art at
that time.
The term RAID was first defined by David A. Patterson, Garth A. Gibson and Randy
Katz at the University of California, Berkeley, in 1987. They studied the possibility of
using two or more drives to appear as a single device to the host system and published a
paper: "A Case for Redundant Arrays of Inexpensive Disks (RAID)" in June 1988 at the
SIGMOD conference.[1]
One of the early uses of RAID 0 and 1 was the Crosfield Electronics Studio 9500 page
layout system based on the Python workstation. The Python workstation was a Crosfield-
managed international development using PERQ 3B electronics, benchMark
Technology's Viper display system and Crosfield's own RAID and fibre-optic network
controllers. RAID 0 was particularly important to these workstations as it dramatically
sped up image manipulation for the pre-press markets. Volume production started in
Peterborough, England in early 1987.
Vinum
Vinum is a logical volume manager, also called software RAID, allowing
implementations of the RAID 0, RAID 1 and RAID 5 models, both individually and in
combination. Vinum is part of the base distribution of the FreeBSD operating system, and
versions exist for NetBSD, OpenBSD and DragonFly BSD. The Vinum source code is
currently maintained in the FreeBSD source tree. Vinum supports RAID levels 0, 1, 5, and
JBOD. Vinum is invoked as "gvinum" on FreeBSD version 5.4 and up.
Software RAID vs. Hardware RAID

In rare cases, hardware RAID controllers have become faulty, which can result in data
loss. Hybrid RAIDs have become very popular with the
introduction of inexpensive hardware RAID controllers. The hardware is a normal disk
controller that has no RAID features, but there is a boot-time application that allows users
to set up RAIDs that are controlled via the BIOS. When any modern operating system is
used, it will need specialized RAID drivers that will make the array look like a single
block device. Since these controllers actually do all calculations in software, not
hardware, they are often called "fakeraids". Unlike software RAID, these "fakeraids"
typically cannot span multiple controllers.
Non-RAID drive architectures

Non-RAID drive architectures also exist, and are often referred to, similarly to RAID, by
standard acronyms, several tongue-in-cheek. A single drive is referred to as a SLED
(Single Large Expensive Drive), by contrast with RAID, while an array of drives without
any additional control (accessed simply as independent drives) is referred to as a JBOD
(Just a Bunch Of Disks). Simple concatenation is referred to as a SPAN, or sometimes as
JBOD, though the latter usage is discouraged in careful use, due to the alternative
meaning just cited.