
Performance Consequences of Parity Placement in Disk Arrays

Edward K. Lee and Randy H. Katz

Computer Science Division

Electrical Engineering and Computer Science

University of California at Berkeley

Berkeley, California 94720

Abstract

Due to recent advances in CPU and memory system performance, I/O systems are increasingly limiting the performance of modern computer systems. Patterson et al. [10] have proposed Redundant Arrays of Inexpensive Disks (RAID) to meet the impending I/O crisis. RAIDs substitute many small inexpensive disks for a few large expensive disks to provide higher performance (both transfer rate and I/O rate), smaller footprints and lower power consumption at a lower cost than the large expensive disks they replace. Unfortunately, with so many small disks, media availability becomes a serious problem. RAIDs provide high availability by using parity encoding of data to survive disk failures. As this paper shows, the way parity is distributed in a RAID has significant consequences for performance. In particular, we show that for relatively large request sizes of hundreds of kilobytes, the choice of parity placement significantly affects performance (by up to 20 to 30 percent for disk array configurations that are common today), and we propose properties that are generally desirable in parity placements.

1 Motivation

In recent years, improvements in CPU and memory system performance have greatly outpaced improvements in I/O performance. If the trend continues, future improvements in CPU and memory system performance will be wasted as computer systems become increasingly I/O bound. To overcome the impending I/O crisis, Patterson et al. have proposed Redundant Arrays of Inexpensive Disks (RAID) [3,9,10].

RAIDs substitute many small inexpensive disks for a few large expensive disks to provide higher performance (both transfer rate and I/O rate), smaller footprints and lower power consumption at a lower cost than the large expensive disks they replace. Unfortunately, with so many small disks, media availability becomes a serious problem. RAIDs provide high availability by using parity encoding of data to survive disk failures. Patterson [10] and Chen [1] define six different RAID organizations:

RAID level 0 Non-redundant disk array. Only data striping is supported.

RAID level 1 Mirrored disk array. Data is duplicated for reliability.

RAID level 2 Hamming-coded disk array.

RAID level 3 Parity-protected disk array with byte-interleaved data. Reads access all disks except the parity disk and writes access all disks. Only one I/O request may be serviced per parity disk at a time.

RAID level 4 Parity-protected disk array with block-interleaved data. Small reads access a single data disk and small writes access a data disk and a parity disk. Several reads and a single write per parity disk may be serviced concurrently.

RAID level 5 Parity-protected disk array with block-interleaved data and distributed parity. Similar to RAID level 4 except that the parity is distributed across all disks. Several reads and writes per parity disk may be serviced concurrently.
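
The parity encoding used by the parity-protected organizations above can be illustrated with a minimal sketch. It assumes parity is the bytewise exclusive-or of the data blocks in a stripe, which is the usual choice for single-failure protection. The function names and the tiny two-byte blocks are hypothetical and chosen only for illustration. The sketch shows how a lost block is rebuilt from the survivors, and why a small write needs to touch only one data disk and one parity disk.

```python
from functools import reduce

def xor_blocks(blocks):
    """Return the bytewise XOR of a list of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def small_write_parity(old_data, new_data, old_parity):
    """New parity after overwriting one data block (read-modify-write).

    Only the old data block and the old parity block are read; the other
    data disks in the stripe are not accessed.
    """
    return xor_blocks([old_parity, old_data, new_data])

# Hypothetical stripe: three data blocks plus one parity block.
data = [b"\x0f\x10", b"\xf0\x01", b"\x33\x22"]
parity = xor_blocks(data)

# Simulate losing the disk that holds data[1] and rebuild its block
# from the surviving data blocks and the parity block.
recovered = xor_blocks([data[0], data[2], parity])

# Small write to the block on disk 0: update parity without a full-stripe read.
new_block = b"\xaa\x55"
new_parity = small_write_parity(data[0], new_block, parity)
```

Because XOR is its own inverse, the parity produced by the read-modify-write path matches what a full recomputation over the updated stripe would give.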

This paper will investigate the performance implications of different ways of distributing parity (parity placements) in RAID level 5 disk arrays.
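
To make the term "parity placement" concrete, the following sketch contrasts a dedicated parity disk, as in RAID level 4, with one simple rotated placement of the kind RAID level 5 permits. The rotation rule and the function names are assumptions made for illustration; they are not necessarily among the placements evaluated in this paper.

```python
def parity_disk_dedicated(stripe, num_disks):
    """RAID level 4 style: parity always lives on the last disk."""
    return num_disks - 1

def parity_disk_rotated(stripe, num_disks):
    """One possible RAID level 5 style rotation (illustrative assumption):
    parity moves one disk to the left on each successive stripe."""
    return (num_disks - 1 - stripe) % num_disks

def layout(num_stripes, num_disks, placement):
    """Return one row per stripe, marking parity blocks 'P' and data blocks 'D'."""
    rows = []
    for stripe in range(num_stripes):
        p = placement(stripe, num_disks)
        rows.append(["P" if disk == p else "D" for disk in range(num_disks)])
    return rows

if __name__ == "__main__":
    for row in layout(5, 5, parity_disk_rotated):
        print(" ".join(row))
```

With the rotated rule, every disk holds parity for the same number of stripes over one full rotation, whereas the dedicated rule concentrates all parity traffic on a single disk.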