References and Sources:
- Most of the text in this guide has been taken from a copyrighted document by Adaptec, Inc. (www.adaptec.com)
- Perceptive Solutions, Inc.
RAID stands for Redundant Array of
Inexpensive (or sometimes "Independent") Disks.
RAID is a method of combining several hard disk drives into one
logical unit (two or more disks grouped together to appear as a
single device to the host system). RAID technology was developed to
address the fault-tolerance and performance limitations of
conventional disk storage. It can offer fault tolerance and higher
throughput levels than a single hard drive or group of independent
hard drives. While arrays were once considered complex and
relatively specialized storage solutions, today they are easy to
use and essential for a broad spectrum of client/server
applications.
History
RAID technology was first defined by a group of computer
scientists at the University of California at Berkeley in 1987. The
scientists studied the possibility of using two or more disks to
appear as a single device to the host system.
Although the array's performance was better than that of large,
single-disk storage systems, reliability was unacceptably low. To
address this, the scientists proposed redundant architectures to
provide ways of achieving storage fault tolerance. In addition to
defining RAID levels 1 through 5, the scientists also studied data
striping -- a non-redundant array configuration that distributes
files across multiple disks in an array. Often known as RAID 0,
this configuration actually provides no data protection. However,
it does offer maximum throughput for some data-intensive
applications such as desktop digital video production.
The driving factors behind RAID
A number of factors are responsible for the growing adoption of
arrays for critical network storage.
More and more organizations have created enterprise-wide networks
to improve productivity and streamline information flow. While the
distributed data stored on network servers provides substantial
cost benefits, these savings can be quickly offset if information
is frequently lost or becomes inaccessible. As today's applications
create larger files, network storage needs have increased
proportionately. In addition, accelerating CPU speeds have
outstripped data transfer rates to storage media, creating
bottlenecks in today's systems.
RAID storage solutions overcome these challenges by providing a
combination of outstanding data availability, extraordinary and
highly scalable performance, high capacity, and recovery with no
loss of data or interruption of user access.
By integrating multiple drives into a single array -- which is
viewed by the network operating system as a single disk drive --
organizations can create cost-effective, minicomputer-sized
solutions of up to a terabyte or more of storage.
RAID Levels
There are several different RAID "levels" or redundancy schemes, each with inherent cost, performance, and availability (fault-tolerance) characteristics designed to meet different storage needs. No individual RAID level is inherently superior to any other. Each of the five array architectures is well-suited for certain types of applications and computing environments. For client/server applications, storage systems based on RAID levels 1, 0/1, and 5 have been the most widely used. This is because popular NOSs such as Windows NT® Server and NetWare manage data in ways similar to how these RAID architectures perform.
RAID 0
Data striping without redundancy (no protection).
- Minimum number of drives: 2
- Strengths: Highest performance.
- Weaknesses: No data protection; If one drive fails, all data is lost.
DRIVE 1 | DRIVE 2 |
---|---|
Data A | Data B |
Data C | Data D |
Data E | Data F |
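The striping pattern in the table above can be sketched in a few lines of Python. This is only an illustration, not how any particular controller works: the function names `stripe` and `read_back` are hypothetical, each "block" is just a string, and real arrays stripe fixed-size chunks of raw disk sectors rather than named items.

```python
def stripe(blocks, num_drives=2):
    """Distribute blocks round-robin across the drives (RAID 0, no redundancy)."""
    drives = [[] for _ in range(num_drives)]
    for index, block in enumerate(blocks):
        drives[index % num_drives].append(block)
    return drives


def read_back(drives):
    """Reassemble the original block order by reading one row at a time."""
    blocks = []
    longest = max(len(drive) for drive in drives)
    for row in range(longest):
        for drive in drives:
            if row < len(drive):
                blocks.append(drive[row])
    return blocks


drives = stripe(["A", "B", "C", "D", "E", "F"])
print(drives)             # [['A', 'C', 'E'], ['B', 'D', 'F']]
print(read_back(drives))  # ['A', 'B', 'C', 'D', 'E', 'F']
```

Because each drive holds only every second block, losing either list makes the original sequence unrecoverable, which is the weakness listed above.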
RAID 1
Disk mirroring.
- Minimum number of drives: 2
- Strengths: Very high performance; Very high data protection; Minimal penalty on write performance.
- Weaknesses: High redundancy cost overhead; Because all data is duplicated, twice the storage capacity is required.
RAID 2
No practical use.
- Minimum number of drives: Not used in LAN
- Strengths: Previously used for RAM error correction (known as Hamming code) and in disk drives before the use of embedded error correction.
- Weaknesses: No practical use; Same performance can be achieved by RAID 3 at lower cost.
RAID 3
Byte-level data striping with dedicated parity drive.
- Minimum number of drives: 3
- Strengths: Excellent performance for large, sequential data requests.
- Weaknesses: Not well-suited for transaction-oriented network applications; Single parity drive does not support multiple, simultaneous read and write requests.
RAID 4
Block-level data striping with dedicated parity drive.
- Minimum number of drives: 3 (Not widely used)
- Strengths: Data striping supports multiple simultaneous read requests.
- Weaknesses: Write requests suffer from the same single parity-drive bottleneck as RAID 3; RAID 5 offers equal data protection and better performance at the same cost.
RAID 5
Block-level data striping with distributed parity.
- Minimum number of drives: 3
- Strengths: Best cost/performance for transaction-oriented networks; Very high performance, very high data protection; Supports multiple simultaneous reads and writes; Can also be optimized for large, sequential requests.
- Weaknesses: Write performance is slower than RAID 0 or RAID 1.
DRIVE 1 | DRIVE 2 | DRIVE 3 |
---|---|---|
Parity A | Data A | Data A |
Data B | Parity B | Data B |
Data C | Data C | Parity C |
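The rotating parity position shown in the table can be generated with a short sketch. The function name `raid5_layout` is hypothetical, and the rotation order (parity on drive 1 for the first stripe, drive 2 for the second, and so on) is chosen only to match the table; actual controllers may rotate parity in a different order.

```python
def raid5_layout(stripe_names, num_drives=3):
    """Return one row per stripe, with the parity block rotating across drives."""
    rows = []
    for stripe_index, name in enumerate(stripe_names):
        # Assumption: parity moves one drive to the right on each new stripe.
        parity_drive = stripe_index % num_drives
        row = [
            f"Parity {name}" if drive == parity_drive else f"Data {name}"
            for drive in range(num_drives)
        ]
        rows.append(row)
    return rows


for row in raid5_layout("ABC"):
    print(" | ".join(row))
```

Spreading parity this way means no single drive has to absorb every parity write, which is the bottleneck described for RAID 3 and RAID 4.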
RAID 01 (0+1) and RAID 10 (1+0)
Combination of RAID 0 (data striping) and RAID 1 (mirroring). RAID 01 (0+1) is a mirrored configuration of two striped sets (mirror of stripes); RAID 10 (1+0) is a stripe across a number of mirrored sets (stripe of mirrors). RAID 10 provides better fault tolerance and rebuild performance than RAID 01. Both array types provide very good to excellent overall performance by combining the speed of RAID 0 with the redundancy of RAID 1 without requiring parity calculations.
- Minimum number of drives: 4
- Strengths: Highest performance, highest data protection (can tolerate multiple drive failures).
- Weaknesses: High redundancy cost overhead; Because all data is duplicated, twice the storage capacity is required; Requires minimum of four drives.
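The redundancy cost named in the lists above can be compared across levels with a small calculation. This is a rough sketch that assumes all drives are the same size and ignores controller metadata; the function name `usable_capacity` and the `MIN_DRIVES` table are hypothetical helpers, with the minimum drive counts taken from the level descriptions in this guide.

```python
# Minimum drive counts as listed for each level in this guide.
MIN_DRIVES = {"0": 2, "1": 2, "3": 3, "4": 3, "5": 3, "01": 4, "10": 4}


def usable_capacity(level, num_drives, drive_size_gb):
    """Estimate usable capacity in GB for an array of equal-sized drives."""
    if level not in MIN_DRIVES:
        raise ValueError(f"unsupported RAID level: {level}")
    if num_drives < MIN_DRIVES[level]:
        raise ValueError(f"RAID {level} needs at least {MIN_DRIVES[level]} drives")
    if level == "0":
        # Striping only: every drive holds user data.
        return num_drives * drive_size_gb
    if level in ("1", "01", "10"):
        # Mirroring: every block is stored twice, so half the space is usable.
        if num_drives % 2:
            raise ValueError("mirrored levels need an even number of drives")
        return (num_drives // 2) * drive_size_gb
    # Levels 3, 4, and 5: one drive's worth of space holds parity.
    return (num_drives - 1) * drive_size_gb


for level in ("0", "5", "10"):
    print(level, usable_capacity(level, 4, 100))
```

With four 100 GB drives, the sketch gives 400 GB for RAID 0, 300 GB for RAID 5, and 200 GB for RAID 10.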
RAID 01 (0+1, mirror of stripes)
DRIVE 1 | DRIVE 2 | DRIVE 3 | DRIVE 4 |
---|---|---|---|
Data A | Data A | mA | mA |
Data B | Data B | mB | mB |
Data C | Data C | mC | mC |
Original Data | Original Data | Mirrored Data | Mirrored Data |
RAID 10 (1+0, stripe of mirrors)
DRIVE 1 | DRIVE 2 | DRIVE 3 | DRIVE 4 |
---|---|---|---|
Data A | mA | Data B | mB |
Data C | mC | Data D | mD |
Data E | mE | Data F | mF |
Original Data | Mirrored Data | Original Data | Mirrored Data |
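The claim that RAID 10 tolerates failures better than RAID 01 can be checked by enumerating every two-drive failure in the four-drive layouts shown above. The sketch below uses a deliberately simple model, which is an assumption rather than a description of any specific product: in RAID 01 a stripe set goes offline as soon as one of its drives fails, and in RAID 10 a mirrored pair is lost only when both of its drives fail. All function names are hypothetical.

```python
from itertools import combinations

DRIVES = (1, 2, 3, 4)


def raid01_survives(failed):
    """RAID 01: drives 1-2 form one stripe set, drives 3-4 hold its mirror."""
    stripe_sets = [{1, 2}, {3, 4}]
    # The array stays up while at least one stripe set has no failed drive.
    return any(not (stripe_set & failed) for stripe_set in stripe_sets)


def raid10_survives(failed):
    """RAID 10: drives 1-2 and drives 3-4 are mirrored pairs, striped together."""
    mirror_pairs = [{1, 2}, {3, 4}]
    # The array stays up while every mirrored pair keeps at least one good drive.
    return all(pair - failed for pair in mirror_pairs)


def count_survivable(survives, num_failures=2):
    """Count the failure combinations of the given size that the array survives."""
    return sum(
        1 for combo in combinations(DRIVES, num_failures) if survives(set(combo))
    )


print("RAID 01:", count_survivable(raid01_survives), "of 6")  # RAID 01: 2 of 6
print("RAID 10:", count_survivable(raid10_survives), "of 6")  # RAID 10: 4 of 6
```

Under this model both layouts survive any single failure, but RAID 10 survives four of the six possible double failures while RAID 01 survives only two.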
Types Of RAID
There are three primary array implementations: software-based arrays, bus-based array adapters/controllers, and subsystem-based external array controllers. As with the various RAID levels, no one implementation is clearly better than another -- although software-based arrays are rapidly losing favor as high-performance, low-cost array adapters become increasingly available. Each array solution meets different server and network requirements, depending on the number of users, applications, and storage requirements.
It is important to note that all RAID code is based on software. The difference among the solutions is where that software code is executed -- on the host CPU (software-based arrays) or offloaded to an on-board processor (bus-based and external array controllers).
| Description | Advantages |
---|---|---|
Software-based RAID | Primarily used with entry-level servers, software-based arrays rely on a standard host adapter and execute all I/O commands and mathematically intensive RAID algorithms in the host server CPU. This can slow system performance by increasing host PCI bus traffic, CPU utilization, and CPU interrupts. Some NOSs such as NetWare and Windows NT include embedded RAID software. The chief advantage of this embedded RAID software has been its lower cost compared to higher-priced RAID alternatives. However, this advantage is disappearing with the advent of lower-cost, bus-based array adapters. | |
Hardware-based RAID | Unlike software-based arrays, bus-based array adapters/controllers plug into a host bus slot [typically a 133 MByte (MB)/sec PCI bus] and offload some or all of the I/O commands and RAID operations to one or more secondary processors. Originally used only with mid- to high-end servers due to cost, lower-cost bus-based array adapters are now available specifically for entry-level server network applications. In addition to offering the fault-tolerant benefits of RAID, bus-based array adapters/controllers perform connectivity functions that are similar to standard host adapters. By residing directly on a host PCI bus, they provide the highest performance of all array types. Bus-based arrays also deliver more robust fault-tolerant features than embedded NOS RAID software. As newer, high-end technologies such as Fibre Channel become readily available, the performance advantage of bus-based arrays compared to external array controller solutions may diminish. | |
External Hardware RAID Card | Intelligent external array controllers "bridge" between one or more server I/O interfaces and single- or multiple-device channels. These controllers feature an on-board microprocessor, which provides high performance and handles functions such as executing RAID software code and supporting data caching. External array controllers offer complete operating system independence, the highest availability, and the ability to scale storage to extraordinarily large capacities (up to a terabyte and beyond). These controllers are usually installed in networks of stand alone Intel-based and UNIX-based servers as well as clustered server environments. | |
Server Technology Comparison
| UDMA | SCSI | Fibre Channel |
---|---|---|---|
Best Suited For | Low-cost entry level server with limited expandability | Low to high-end server when scalability is desired | Server-to-Server campus networks |
Advantages | | | |
Parity
The concept behind RAID is relatively simple. The fundamental premise is to be able to recover data on-line in the event of a disk failure by using a form of redundancy called parity. In its simplest form, parity is the sum of the data on all the drives used in an array (in practice, this is typically computed as a bitwise exclusive-OR). Recovery from a drive failure is achieved by reading the remaining good data and combining it with the parity data stored by the array to recompute what was lost. Parity is used by RAID levels 2, 3, 4, and 5. RAID 1 does not use parity because all data is completely duplicated (mirrored). RAID 0, used only to increase performance, offers no data redundancy at all.
A + B + C + D = PARITY
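The parity relationship above can be tried out with a few lines of Python. Note one substitution: the guide describes parity as an addition, while this sketch uses bitwise XOR, which is how parity RAID is commonly implemented, because XOR-ing the surviving blocks with the parity block gives back the missing one. The function names `xor_parity` and `rebuild` and the sample data are hypothetical, and all blocks are assumed to be the same length.

```python
from functools import reduce


def xor_parity(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def rebuild(surviving_blocks, parity):
    """Recompute one missing block from the surviving blocks and the parity."""
    return xor_parity(list(surviving_blocks) + [parity])


# Four data blocks, standing in for drives A, B, C, and D.
a, b, c, d = b"RAID", b"DATA", b"DISK", b"SAFE"
parity = xor_parity([a, b, c, d])

# Simulate losing drive C and rebuilding it from the rest.
recovered = rebuild([a, b, d], parity)
print(recovered == c)  # True
```

The same call recovers any single missing block, but not two at once, which is why a single parity set only protects against one drive failure.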