16/03/2008
It’s official: Western Digital hates me and I hate them too
About a month ago one of the hard disks in my PC started showing DMA errors in syslog. It was a Western Digital WD1200JB with manufacture date: 13 MAR 2002. Luckily I only kept temporary data on that disk, like downloads, some music and videos, and some pretty old backups. As soon as I saw the DMA errors in syslog I put a spare 200GB drive in the box and tried to rsync all the data to it. I saved most of the data I needed but I lost some of my old backups. The thing is that I didn’t really know what was inside them; there were some directories with names like “/Backups/OLD/foobar/backup_older/random_crap”. I guess it was crap after all. I hadn’t needed anything from inside those directories for at least the last couple of years.
2 weeks ago I returned from a trip to Athens. I checked my mail, where I get reports from ossec on the various servers I manage. One of these mails reported that a RAID5 array with 6x200GB disks was degraded due to a hard disk failure. Yes, it was a Western Digital, again. Model number: WD2000JB, manufacture date: 26 AUG 2004. I had another 200GB drive at home where I keep my backups. Since I couldn’t afford the risk of not having a spare disk for my home backups, I bought a Seagate ST3500320AS. Since the new disk was 500GB, I copied all my data to it from the “spare” 200GB disk and also made a full backup of my boot disk, which is 120GB. I then replaced the faulty 200GB disk on the server with the “spare” 200GB drive I had at home.
On Thursday I came back from a one-week trip, this time to my hometown. All was fine until Friday noon. Then I tried to open a text file where I keep some random notes, inside my home dir (which is a separate partition on my boot disk), and the machine started crawling. I couldn’t open the file. I tried to copy the file to another disk, without success. I only got some beautiful I/O errors on the terminal and DMA errors in syslog. Guess what! The disk was a Western Digital 1200JB with manufacture date: 14 DEC 2001. Under different circumstances I would cry at my bad luck…but the only thing I could do was laugh. I couldn’t stop laughing about this mess. I placed the 500GB Seagate in an external USB case and started to rsync the root dir on top of the copy I had made 2 weeks earlier. A couple of files couldn’t be read from the boot disk but they were already in the “backup”, so I saved everything. Since I had no spare disk left at home I went out and bought another hard disk. I couldn’t find any 250 or 320GB Seagate drives so I bought another 500GB Seagate ST3500320AS. What was funny was that the salesman at the local store tried to convince me to buy a Western Digital 320GB, without success of course. I wonder why…
I placed the new 500GB disk in my box, booted iloog, partitioned the disk and rsync-ed my data from the “old” 500GB disk to the new one.
YES, I am using smartctl/smartd on all of my boxes, even at home. Smartctl was not showing ANY errors at all before the first DMA errors appeared in syslog. I regularly test all my disks with smartctl’s self-tests: short, long and conveyance (where it’s supported).
The first disk is completely unusable right now. I tried partitioning and formatting it but it moans painfully whenever it is accessed. It currently shows more than 100 S.M.A.R.T. errors. It’s dead.
The second one has about 4-5 S.M.A.R.T. errors logged. It doesn’t make any strange noises when operating but I haven’t extensively tested it yet. It surely cannot be trusted…
The third disk has bad sectors and about 20 S.M.A.R.T. errors. Most of them were “created” during the bad blocks check, and every time a bad area is accessed more errors are added to the log. During operation it makes an annoying sound, like metal parts scratching against each other.
The funny thing is what smartctl reports for all the disks, even for the first one:
SMART overall-health self-assessment test result: PASSED
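For anyone who wants to catch this kind of mismatch automatically, here is a minimal sketch in Python. It is not something used in this post; it only assumes you already have the text that `smartctl -H -l error` prints for a disk, and that the error log section contains a line of the form “ATA Error Count: N” when errors are logged. The function names `smart_summary` and `is_suspicious` are made up for illustration.

```python
import re


def smart_summary(report: str) -> dict:
    """Pull the overall-health verdict and the logged ATA error count
    out of smartctl's text output (assumed format, see above)."""
    health = re.search(
        r"SMART overall-health self-assessment test result:\s*(\S+)", report
    )
    errors = re.search(r"ATA Error Count:\s*(\d+)", report)
    return {
        "health": health.group(1) if health else None,
        # Assumption: a missing "ATA Error Count" line means no logged errors.
        "error_count": int(errors.group(1)) if errors else 0,
    }


def is_suspicious(report: str, max_errors: int = 0) -> bool:
    """Flag a disk if the verdict is not PASSED, or if it is PASSED
    but the error log holds more than max_errors entries."""
    summary = smart_summary(report)
    return summary["health"] != "PASSED" or summary["error_count"] > max_errors
```

The point of the second check is exactly the situation above: the overall verdict alone says PASSED, so the error count has to be looked at separately before trusting a disk.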
I am well aware that all the disks were past their guarantee (3 years). That’s why I was keeping backups (of the important stuff) on separate disks, but I don’t think I’ll be buying any Western Digital drives in the near future…I need some time to get over this month of crashes…
Any other Western Digital haters out there?
Filed by kargig at 18:02 under General,Linux