An Evaluation of Redundant Arrays of Disks using an Amdahl 5890
Peter M. Chen, Garth A. Gibson, Randy H. Katz, David A. Patterson Computer Science Division, University of California, Berkeley
Abstract. Recently we presented several disk array architectures designed to increase the data rate and I/O rate of supercomputing applications, transaction processing, and file systems [Patterson 88]. In this paper we present a hardware performance measurement of two of these architectures, mirroring and rotated parity. We see how throughput for these two architectures is affected by response time requirements, request sizes, and read to write ratios. We find that for applications with large accesses, such as many supercomputing applications, a rotated parity disk array far outperforms the traditional mirroring architecture. For applications dominated by small accesses, such as transaction processing, mirroring architectures have higher performance per disk than rotated parity architectures.
1. The I/O Crisis
Over the past decade, processing speed, memory speed,
memory capacity, and disk capacity have all grown tremendously:
- Single chip processors have increased in speed at the rate of 40%-100% per year [Bell 84, Joy 85].
- Caches have increased in speed 40% to 100% per year.
- Main memory has quadrupled in capacity every two or three years [Moore 75, Myers 86].
In contrast, disk access times have undergone only modest performance improvements. For example, seek time has improved only about 7% per year [Harker 81]. This imbalanced system growth has already led to many I/O bound supercomputer applications [Kim 87]. If the imbalance is not remedied, Amdahl's Law tells us that much of the astounding processor speedup and memory growth will be wasted [Amdahl 67]. Continued improvement in system performance depends in a large part on I/O systems with higher data and I/O rates.
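Amdahl's Law makes the cost of a lagging I/O system concrete: speeding up only the CPU and memory leaves overall performance capped by the unimproved I/O fraction. A minimal sketch (the function name and the 90%/10x figures are illustrative, not taken from the paper):

```python
def amdahl_speedup(accelerated_fraction, factor):
    """Overall speedup when `accelerated_fraction` of the run time
    is sped up by `factor`; the remainder (e.g. I/O) is unchanged."""
    return 1.0 / ((1.0 - accelerated_fraction) + accelerated_fraction / factor)

# If 90% of run time is compute and the processor gets 10x faster,
# the overall speedup is only 1 / (0.1 + 0.09), about 5.3x, not 10x.
print(amdahl_speedup(0.9, 10))
# Even an infinitely fast processor is capped by the serial 10%:
# the speedup approaches 1 / 0.1 = 10x as `factor` grows.
print(amdahl_speedup(0.9, 1e12))
```

This is why faster I/O, not just faster processors, is needed for continued system-level gains.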
One way to increase I/O performance is by using an array of many disks [Kim 86, Salem 86]. By using many disks, both throughput (MB per second) and I/O rate (I/O's per second) can be increased. Throughput can be increased by having many disks cooperate in transferring one block of information; the I/O rate can be increased by having multiple independent disks service multiple independent requests. With multiple disks, however, comes lower reliability. According to the commonly used exponential model for disk failures [Schulze 88], 100 disks have a combined failure rate of 100 times the failure rate of a single disk. If every disk failure caused data loss, a 100 disk array would lose data every few hundred hours. This is intolerable for a supposedly stable storage system. To protect against data loss in the face of a single disk failure, some sort of data redundancy must exist.
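Under the exponential model, per-disk failure rates simply add, so the array's mean time to failure (MTTF) shrinks linearly with the number of disks. A rough sketch (the 30,000-hour single-disk MTTF is an assumed figure for illustration, not a number from the paper):

```python
def array_mttf(disk_mttf_hours, n_disks):
    # Exponential model: the array's failure rate is n_disks times the
    # per-disk rate, so its MTTF is the per-disk MTTF divided by n_disks.
    return disk_mttf_hours / n_disks

# A 100-disk array built from (assumed) 30,000-hour disks would,
# without redundancy, lose data every few hundred hours on average:
print(array_mttf(30_000, 100))  # -> 300.0 hours
```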
This paper analyzes the performance of two disk array redundancy schemes. The performance analysis is based on a set of experiments carried out on an Amdahl mainframe and disk system. In these experiments, we explore several issues:
- What are the basic differences in throughput and response time between the various redundancy schemes?
- How does changing the size of an I/O request affect performance of the redundancy schemes?
- How does changing the read/write ratio affect performance of the redundancy schemes?
We begin by describing two redundant array organizations and a simple performance model comparing them. We then present the hardware environment, the experiment workload design, the metrics for our results, and the performance results.
We start by running the same workload as the RAID paper [Patterson 88]: request sizes are full stripe or individual blocks; requests are all reads or all writes. In Section 7.1, we analyze an idle system to break down the response time for a basic I/O. Next, in Section 7.2, we attempt to duplicate the RAID paper's assumption of unlimited response time by analyzing a saturated system. In Section 7.3, we make the experiment more realistic by controlling and equalizing response times.
When we have finished analyzing the RAID paper workload, we begin to use more varied workloads. We again make this transition by changing one aspect of the workload at a time. In Section 8.1, we remove the assumption of constant request size by using a distribution of request sizes. As our last step in making the experiment more realistic, we allow workloads with both reads and writes.
2. Introduction to Redundant Arrays of Disks
In "A Case for Redundant Arrays of Inexpensive Disks (RAID)", henceforth referred to as "the RAID paper" [Patterson 88], Patterson, Gibson, and Katz present five ways to introduce redundancy into an array of disks: RAID Level 1 through RAID Level 5. Using a simple performance model of these five organizations, they conclude that RAID Level 1, mirrored disks, and RAID Level 5, rotated parity, have the best performance potential. This paper focuses on these two RAID Levels, plus the basic non-redundant RAID Level 0, added to provide a basis of comparison between RAID Levels 1 and 5. Figure 1 shows the data layout in the three redundancy schemes. The rest of this section summarizes the RAID Levels--see [Patterson 88] for more details.

In all organizations, data are interleaved across all disks [Kim 86, Salem 86]. We define a stripe of data to be one unit of interleaving from each disk. For example, the first stripe of data in Figure 1 consists of logical blocks 0, 1, 2, and 3. The storage efficiency, a measure of the capacity cost of redundancy, is defined to be the effective (user) data capacity divided by the total disk capacity. For RAID Level 0, the effective data capacity equals the total disk capacity, so the storage efficiency is 100%.

RAID Level 1, mirrored disks, is a traditional way to incorporate redundancy in an array of disks [Bitton 88]. In RAID Level 1, each datum is kept on two distinct disks: a data disk and a shadow disk. Thus, for RAID Level 1, the effective storage capacity is half the total disk capacity and the storage efficiency is 50%. Reads can be serviced by either the data disk or the shadow disk, but, to maintain consistency, writes must be serviced by both data and shadow disk.
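The storage-efficiency definition above can be written as a small formula. A sketch (the function name is ours; the RAID Level 5 case follows from dedicating one parity unit per stripe of N disks, giving (N-1)/N, which is standard for rotated parity though the exact figure is not stated until later):

```python
def storage_efficiency(raid_level, n_disks):
    """Effective (user) data capacity divided by total disk capacity."""
    if raid_level == 0:    # no redundancy: all capacity holds user data
        return 1.0
    if raid_level == 1:    # mirroring: every datum is stored twice
        return 0.5
    if raid_level == 5:    # one parity unit per stripe of n_disks units
        return (n_disks - 1) / n_disks
    raise ValueError("unsupported RAID level")

print(storage_efficiency(0, 5))  # -> 1.0
print(storage_efficiency(1, 4))  # -> 0.5
print(storage_efficiency(5, 5))  # -> 0.8
```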
RAID Level 5, rotated parity, incorporates redundancy by maintaining parity across all disks. For example, P0 in Figure 1c is the parity of logical blocks 0, 1, 2, and 3. Parity must be updated whenever data are written. If all parity information were kept on one disk, that disk would see many more requests than any data disk. To avoid a bottleneck in accessing the parity information, it is spread over all disks. There are many ways to spread this parity information across disks, but the choice of mapping is not within the scope of this paper. Instead, we have chosen one mapping of