An Evaluation of Redundant Arrays of Disks using an Amdahl 5890
Peter M. Chen, Garth A. Gibson, Randy H. Katz, David A. Patterson Computer Science Division, University of California, Berkeley
Abstract. Recently we presented several disk array architectures designed to increase the data rate and I/O rate of supercomputing applications, transaction processing, and file systems [Patterson 88]. In this paper we present a hardware performance measurement of two of these architectures, mirroring and rotated parity. We see how throughput for these two architectures is affected by response time requirements, request sizes, and read-to-write ratios. We find that for applications with large accesses, such as many supercomputing applications, a rotated parity disk array far outperforms a traditional mirroring architecture. For applications dominated by small accesses, such as transaction processing, mirroring architectures have higher performance per disk than rotated parity architectures.
1. The I/O Crisis
Over the past decade, processing speed, memory speed,
memory capacity, and disk capacity have all grown tremendously:
- Single chip processors have increased in speed at the rate of 40%-100% per year [Bell 84, Joy 85].
- Caches have increased in speed 40% to 100% per year.
- Main memory has quadrupled in capacity every two or three years [Moore 75, Myers 86].
In contrast, disk access times have undergone only modest performance
improvements. For example, seek time has improved only
about 7% per year [Harker 81]. This imbalanced system growth
has already led to many I/O bound supercomputer applications
[Kim 87]. If the imbalance is not remedied, Amdahl's Law tells
us that much of the astounding processor speedup and memory
growth will be wasted [Amdahl 67]. Continued improvement in
system performance depends in large part on I/O systems with
higher data and I/O rates.
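To see how severely an unimproved I/O system limits overall speedup, the following minimal sketch applies Amdahl's Law to an assumed workload split; the 90% compute fraction and the speedup factors are illustrative values, not measurements from this paper.

```python
# Amdahl's Law [Amdahl 67]: if a fraction f of a job's time is accelerated by a
# factor s while the remaining (1 - f) -- here, I/O -- is not improved, the
# overall speedup is 1 / ((1 - f) + f / s).

def amdahl_speedup(f: float, s: float) -> float:
    """Overall speedup when fraction f of the work runs s times faster."""
    return 1.0 / ((1.0 - f) + f / s)

# Assumed workload: 90% computation, 10% I/O (illustrative numbers only).
print(round(amdahl_speedup(0.90, 10), 2))    # 5.26 -- half of the 10x CPU gain is lost
print(round(amdahl_speedup(0.90, 1000), 2))  # 9.91 -- bounded near 10x by the I/O fraction
```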
One way to increase I/O performance is by using an array
of many disks [Kim 86, Salem 86]. By using many disks, both
throughput (MB per second) and I/O rate (I/O's per second) can
be increased. Throughput can be increased by having many disks
cooperate in transferring one block of information; the I/O rate
can be increased by having multiple independent disks service
multiple independent requests. With multiple disks, however,
comes lower reliability. According to the commonly used
exponential model for disk failures [Schulze 88], 100 disks have a
combined failure rate of 100 times the failure rate of a single disk.
If every disk failure caused data loss, a 100-disk array would lose
data every few hundred hours. This is intolerable for a supposedly
stable storage system. To protect against data loss in the
face of a single disk failure, some sort of data redundancy must
exist.
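The arithmetic behind this claim is simple: under the exponential model, independent failure rates add, so an N-disk array without redundancy has a mean time to failure (MTTF) of the single-disk MTTF divided by N. The sketch below works through the numbers with an assumed 30,000-hour single-disk MTTF, a round figure chosen here for illustration rather than taken from [Schulze 88].

```python
# Exponential failure model: failure rates of independent disks add, so the
# MTTF of an N-disk array with no redundancy is the single-disk MTTF over N.

def array_mttf(mttf_disk_hours: float, num_disks: int) -> float:
    """MTTF of a non-redundant array of identical, independent disks."""
    return mttf_disk_hours / num_disks

# Assumed single-disk MTTF of 30,000 hours (illustrative value only).
print(array_mttf(30_000, 1))    # 30000.0 hours for a single disk
print(array_mttf(30_000, 100))  # 300.0 hours -- data loss every few hundred hours
```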
This paper analyzes the performance of two disk array
redundancy schemes. The performance analysis is based on a set
of experiments carried out on an Amdahl mainframe and disk system.
In these experiments, we explore several issues:
- What are the basic differences in throughput and response time between the various redundancy schemes?
- How does changing the size of an I/O request affect performance of the redundancy schemes?
- How does changing the read/write ratio affect performance of the redundancy schemes?
We begin by describing two redundant array organizations and a
simple performance model comparing them. We then present the
hardware environment, the experiment workload design, the
metrics for our results, and the performance results.
We start by running the same workload as the RAID paper
[Patterson 88]: request sizes are full stripes or individual blocks;
requests are all reads or all writes. In Section 7.1, we analyze an
idle system to break down the response time for a basic I/O.
Next, in Section 7.2, we attempt to duplicate the assumptions
made in the RAID paper of unlimited response time by analyzing
a saturated system. In Section 7.3, we make the experiment more realistic
by controlling and equalizing the response times.
When we have finished analyzing the RAID paper workload,
we begin to use more varied workloads. We again make this
transition by changing one aspect of the workload at a time. In
Section 8.1, we remove the assumption of constant request size by using a
distribution of request sizes. As our last step in making the experiment
more realistic, we allow workloads with both reads and
writes.
2. Introduction to Redundant Arrays of Disks
In "A Case for Redundant Arrays of Inexpensive Disks
(RAID)", henceforth referred to as "The RAID paper" [Patterson
88], Patterson, Gibson, and Katz present five ways to introduce
redundancy into an array of disks: RAID Level 1 through RAID
Level 5. Using a simple performance model of these five organizations,
they conclude that RAID Level 1, mirrored disks, and
RAID Level 5, rotated parity, have the best performance potential.
This paper focuses on these two RAID Levels, plus the basic
non-redundant RAID Level 0, added to provide a basis for comparison
between RAID Levels 1 and 5. Figure 1 shows the data
layout in the three redundancy schemes. The rest of this section
summarizes the RAID Levels--see [Patterson 88] for more details.
In all organizations, data are interleaved across all disks
[Kim 86, Salem 86]. We define a stripe of data to be one unit of
interleaving from each disk. For example, the first stripe of data
in Figure 1 consists of logical blocks 0, 1, 2, and 3. The storage
efficiency, a measure of the capacity cost of redundancy, is defined
to be the effective (user) data capacity divided by the total disk
capacity. For RAID Level 0, the effective data capacity equals
the total disk capacity, so the storage efficiency is 100%.
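For concreteness, the following sketch (our own illustration, with a four-disk array matching Figure 1) shows how a logical block number maps to a disk and a stripe under this interleaving for the non-redundant RAID Level 0 layout.

```python
# Hypothetical sketch of RAID Level 0 interleaving: logical blocks are assigned
# round-robin across the disks, so each stripe holds one block from every disk.

NUM_DISKS = 4  # assumed array width, matching the layout of Figure 1

def raid0_location(logical_block: int):
    """Return (disk index, stripe index) for a logical block."""
    return logical_block % NUM_DISKS, logical_block // NUM_DISKS

# Blocks 0, 1, 2, 3 form the first stripe, one block on each of the four disks.
print([raid0_location(b) for b in range(4)])  # [(0, 0), (1, 0), (2, 0), (3, 0)]

# Storage efficiency = user capacity / total capacity; with no redundancy every
# block holds user data, so the efficiency is 4/4 = 100%.
```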
RAID Level 1, mirrored disks, is a traditional way to incorporate
redundancy in an array of disks [Bitton 88]. In RAID
Level 1, each datum is kept on two distinct disks: a data disk and
a shadow disk. Thus, for RAID Level 1, the effective storage
capacity is half the total disk capacity and the storage efficiency is
50%. Reads can be serviced by either the data disk or the shadow
disk, but, to maintain consistency, writes must be serviced by both
data and shadow disk.
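The read/write asymmetry of mirroring is captured by the short sketch below; the Disk class and the dispatch policy are our own simplified stand-ins for illustration, not the behavior of the measured Amdahl disk system.

```python
# Hypothetical sketch of RAID Level 1 (mirroring): every block is kept on a data
# disk and a shadow disk. A read may be serviced by either copy; a write must
# update both copies, which is why each logical write costs two disk accesses.

import random

class Disk:
    """Minimal in-memory stand-in for a disk, for illustration only."""
    def __init__(self):
        self.blocks = {}
    def write(self, block, data):
        self.blocks[block] = data
    def read(self, block):
        return self.blocks.get(block)

def mirrored_write(block, data, data_disk, shadow_disk):
    data_disk.write(block, data)    # both copies are written ...
    shadow_disk.write(block, data)  # ... to keep them consistent

def mirrored_read(block, data_disk, shadow_disk):
    # Either copy is valid; a real controller might pick the less busy disk,
    # here we simply choose one at random.
    return random.choice([data_disk, shadow_disk]).read(block)

# One logical write -> two physical writes; one logical read -> one physical read.
data, shadow = Disk(), Disk()
mirrored_write(0, b"block 0 payload", data, shadow)
print(mirrored_read(0, data, shadow))  # b'block 0 payload'
```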
RAID Level 5, rotated parity, incorporates redundancy by
maintaining parity across all disks. For example, P0 in Figure 1c
is the parity of logical blocks 0, 1, 2, and 3. Parity will have to be
updated whenever data is written. If all parity information were
kept on one disk, this disk would see many more requests than
any data disk. To avoid a bottleneck in accessing the parity information,
it is spread over all disks. There are many ways to spread
this parity information across disks, but this is not within the
scope of this paper. Instead, we have chosen one mapping of