How to better use new technologies is one of the most debated topics in the storage world, which is not surprising because often a new technology opens doors that were previously locked and forces potential users to walk away from the familiar beaten path.
Take, for example, desktop-grade SATA (serial ATA) drives. Many people are understandably confused by these drives that seem to have grown too big for their breeches and dare to invade the sacred sphere of enterprise deployment.
RAID solutions that deploy low-cost SATA drives are becoming quite common, although it's fair to say that similar configurations based on parallel ATA drives have never been that popular.
However, SATA deployments in that context have not always been smooth, according to Hubbert Smith, director of marketing for enterprise products at Western Digital.
Customers were noticing some erratic behavior when deploying Western Digital's 250GB Caviar drives in demanding RAID environments, Smith explains. That erratic behavior included drives that, when under heavy load, stopped responding, or drives that spent too much time on error recovery.
Such incidents are bad enough when they affect single drives, but in RAID configurations those errors can spell disaster. For example, if a drive remains uncommunicative for a long time, its RAID controller can misinterpret that busy status as an unrecoverable disk failure and (assuming that a spare drive is available) may start rebuilding the array.
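To make that failure mode concrete, here is a minimal sketch in Python of the kind of decision a controller might make. It is not any vendor's actual firmware logic: the function name, the 8-second timeout, and the state labels are all hypothetical, chosen only to illustrate how a healthy but busy drive can end up treated as dead.

```python
def controller_decision(seconds_unresponsive, timeout_s=8.0, spare_available=True):
    """Illustrative only: classify a drive by how long it has been silent.

    timeout_s is a made-up threshold; real controllers use their own values.
    """
    if seconds_unresponsive <= timeout_s:
        # The drive answered in time, so it stays in the array.
        return "online"
    # The controller cannot tell a busy drive from a dead one,
    # so a long silence is treated as an unrecoverable failure.
    if spare_available:
        return "rebuild"   # drop the drive and rebuild onto the spare
    return "degraded"      # drop the drive and run without redundancy


# A drive deep in error recovery for 30 seconds is dropped
# even though nothing may be physically wrong with it.
print(controller_decision(2.0))
print(controller_decision(30.0))
print(controller_decision(30.0, spare_available=False))
```

The point of the sketch is that the only input is elapsed time: nothing in the decision distinguishes a drive that is slow from a drive that has failed.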
If this doesn't sound too bad, let me remind you that rebuilding a 250GB drive in RAID 5 can take several days (yes, several 24-hour cycles). Concurrent data access is still possible during rebuild, but if another drive fails or becomes unresponsive -- a strong possibility given that the remaining drives are working overtime -- you can kiss your data goodbye for good.
Considering that quite possibly nothing was actually wrong to start with, calling this behavior a catastrophe is perhaps an understatement.
Should we take this as a lesson to never deploy those drives in a RAID array? Not necessarily, according to Smith. Western Digital engineers came up with 26 improvements that the company consolidated into a new product line named Caviar RE (RAID Edition).
The most apparent improvement of the new line is that the mean time between failures is now a healthy 1 million hours, which should drastically reduce the possibility of errors, and therefore the time spent on error recovery.
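For a sense of scale, a mean time between failures (MTBF) figure can be turned into a rough annual failure probability. The sketch below is my own back-of-the-envelope arithmetic, not a Western Digital calculation: it assumes a constant failure rate (the usual exponential model), drives powered on around the clock, and independent failures, and the function names are made up for illustration.

```python
import math

HOURS_PER_YEAR = 24 * 365  # 8,760 hours, assuming 24x7 operation


def annual_failure_probability(mtbf_hours, drives=1):
    """Chance that at least one of `drives` fails within a year.

    Assumes a constant failure rate and independent drives.
    """
    expected_failures = drives * HOURS_PER_YEAR / mtbf_hours
    return 1.0 - math.exp(-expected_failures)


single = annual_failure_probability(1_000_000)
array_of_eight = annual_failure_probability(1_000_000, drives=8)
print(f"one drive:    {single:.2%}")
print(f"eight drives: {array_of_eight:.2%}")
```

Under those assumptions a single drive rated at 1 million hours has a bit less than a 1 percent chance of failing in a year, and the odds that some drive in an array fails grow roughly in proportion to the number of drives. Real-world failure rates depend on workload and environment, so treat the numbers as an order-of-magnitude guide.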
In addition, a busy Caviar RE drive won't just fall off the RAID controller screen, but will maintain communication, which should eliminate the main cause for "apparent" drive failure and unnecessary rebuilds.
Western Digital suggests that the Caviar RE drives, which cost only a few dollars more and are available in several capacities with both SATA and EIDE (Enhanced IDE) interfaces, are a good compromise between cost and reliability for applications that are not business-critical, such as video surveillance, backups, and file and e-mail servers.
In today's sophisticated storage world, those names may mean different things to different customers, but one thing is for sure: SATA drives have now acquired another possible classification layer that could confuse the unsuspecting.