Customers are replacing disk drives at rates far higher than those suggested by the estimated mean time between failures (MTBF) supplied by drive vendors, according to a study of about 100,000 drives conducted by Carnegie Mellon University.
The study, presented last month at the 5th USENIX Conference on File and Storage Technologies in San Jose, also shows no evidence that Fibre Channel (FC) drives are any more reliable than less expensive but slower performing Serial ATA (SATA) drives.
That surprising comparison of FC and SATA reliability could speed the trend away from FC to SATA drives for applications such as near-line storage and backup, where storage capacity and cost are more important than sheer performance, analysts said.
At the same conference, another study of more than 100,000 drives in data centres run by Google indicated that temperature seems to have little effect on drive reliability, even as vendors and customers struggle to keep temperature down in their tightly packed data centres. Together, the results show how little information customers have to predict the reliability of disk drives in actual operating conditions and how to choose among various drive types.
Real world vs. data sheets
The Carnegie Mellon study examined large production systems, including high-performance computing sites and Internet services sites running SCSI, FC and SATA drives. The data sheets for those drives listed MTBFs of between 1 million and 1.5 million hours, which the study said should mean annual failure rates "of at most 0.88%." However, the study showed typical annual replacement rates of between 2% and 4%, "and up to 13% observed on some systems."
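The arithmetic behind the 0.88% figure can be reproduced with a short sketch. It assumes the common simplification that a drive runs all 8,760 hours of a year and that the annual failure rate is roughly hours per year divided by MTBF; the function name and constant are illustrative and are not taken from the study.

```python
# Rough conversion from a data-sheet MTBF to an annual failure rate (AFR).
# Assumption: drives are powered on 24 hours a day, 365 days a year, and
# AFR is approximated as hours-per-year / MTBF.
HOURS_PER_YEAR = 24 * 365  # 8,760


def annual_failure_rate(mtbf_hours: float) -> float:
    """Return the approximate fraction of drives expected to fail per year."""
    return HOURS_PER_YEAR / mtbf_hours


# The two MTBF values quoted from the data sheets.
for mtbf in (1_000_000, 1_500_000):
    print(f"MTBF {mtbf:,} hours -> AFR {annual_failure_rate(mtbf):.2%}")
```

Under that approximation, a 1-million-hour MTBF works out to about 0.88% a year and a 1.5-million-hour MTBF to about 0.58%, so the reported replacement rates of 2% to 4% are several times what the data sheets imply.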
Garth Gibson, associate professor of computer science at Carnegie Mellon and co-author of the study, was careful to point out that the study didn't necessarily track actual drive failures, but cases in which a customer decided a drive had failed and needed replacement. He also said he has no vendor-specific failure information, and that his goal is not "choosing the best and the worst vendors" but to help them to improve drive design and testing.
He echoed storage vendors and analysts in pointing out that as many as half of the drives returned to vendors actually work fine, and that drives that do fail may have failed for any number of reasons, such as a harsh environment at the customer site or intensive, random read/write operations that cause premature wear to the mechanical components in the drive.
Several drive vendors declined to be interviewed. "The conditions that surround true drive failures are complicated and require a detailed failure analysis to determine what the failure mechanisms were," said a spokesperson for Seagate Technology in Scotts Valley, Calif., in an e-mail. "It is important to not only understand the kind of drive being used, but the system or environment in which it was placed and its workload."
"Regarding various reliability rate questions, it's difficult to provide generalities," said a spokesperson for Hitachi Global Storage Technologies in San Jose, in an e-mail. "We work with each of our customers on an individual basis within their specific environments, and the resulting data is confidential."
Ashish Nadkarni, a principal consultant at GlassHouse Technologies, a storage services provider in Massachusetts, U.S., said he isn't surprised by the comparatively high replacement rates because of the difference between the "clean room" environment in which vendors test and the heat, dust, noise or vibrations in an actual data centre.
He also said he has seen overall drive quality falling over time as the result of price competition in the industry. He urged customers to begin tracking disk drive records "and to make a big noise with the vendor" to force them to review their testing processes.