David A. Patterson led the team at the University of California, Berkeley, that developed the idea of RAID storage. In an interview with Frank Hayes, Patterson recalled the beginnings of his RAID project in 1987.
"We had just been working on RISC processors, and we consciously said, 'Processors are going to start getting fast, improving faster than they have in the past. So what are we going to do about I/O?' That was one motivation.
"The other one was that Randy Katz (one of Patterson's colleagues at Berkeley) got a Macintosh, and it had a hard disk in a separate box next to it. And he said, 'That's kind of interesting; here's a much smaller disk than I'm used to. What could we do with that as a building block?'
"So we held a graduate course where we started off with some rough ideas, and then we and the graduate students -- Garth Gibson, Pete Chen, Ed Lee, Ann Chervenak, Ethan Miller -- met and talked and read papers, and the ideas evolved from there.
"But when we tried to tell people our ideas, they couldn't understand. They'd say, 'Oh yeah, that's the same thing that IBM's been doing forever in terms of mirroring.' Or, 'Oh yeah, Thinking Machines, they've got a product in this area.' And so when we tried to explain things, they assumed what we'd done had already been subsumed by other work.
"That motivated us to write a paper ('A Case for Redundant Arrays of Inexpensive Disks'). It advocated that we should be replacing these big disks with lots of small disks. Basically, a big, relatively thick disk that has to spin fast is much less efficient than lots of small disks, and we get all these benefits in terms of volume and footprint and power. We submitted the paper to the database conference SIGMOD, and Garth Gibson (the lead graduate student on the project) and I went to a short course that was given at Santa Clara University by Al Hoagland, who was kind of the godfather of the disk industry. We came with 20 or 30 copies of our report and handed them out at that meeting, and that was a good thing to do. The paper just clicked. It was a good time, I guess, for that set of arguments.
"We built the RAID I (in 1989) to try the ideas in software. For RAID II (in 1993), we said, 'Let's try to build a high-performance I/O system that connects over a network.' Then at the end of the project, we had a little demo where we pulled the disk out and the thing kept working.
"We were still performance-oriented, thinking RAID was for performance, so we were shocked to see somebody write this up in Byte magazine. The PC community was obviously not so performance-oriented as it was dependability-oriented, and they thought, 'Hey, less-expensive dependable computing.'
"It really just took off after that. EMC (Corp.) decided to build mainframe storage out of PC disks. Compaq (Computer Corp.) had RAID early, and Data General (Corp.). And of course IBM had its own. We didn't know IBM had its own RAID 5 set of ideas in the AS/400 line. IBM had completely independently done the same RAID part of the ideas but used large disks.
"One of the surprises about RAID was it was so expensive. The I in the name when we coined the term was for inexpensive disks. But the system was so expensive, that was kind of awkward for marketing people. So Randy blessed the change to independent for I. Since the RAID boxes weren't cheap, that was probably a better name.
"The current project I'm working on is ROC, Recovery-Oriented Computing. With the RAID stuff, we were always thinking performance, but obviously, dependability is the reason people are doing it. People get mad if their program crashes, but they just go berserk if they lose data. The ROC philosophy is recovering fast when outages happen. That's a different engineering ethic. Hardware will break, software has bugs, people will make mistakes. And if you believe that, then it makes sense to recover fast, rather than just try to make things that never break."