16/03/2008
It’s official: Western Digital hates me and I hate them too
About a month ago one of the hard disks in my PC started showing DMA errors in syslog. It was a Western Digital WD1200JB with manufacture date: 13 MAR 2002. Luckily, on that disk I only kept temporary data like downloads, some music and videos, and some pretty old backups. As soon as I saw the DMA errors in syslog I put a spare 200GB drive in the box and tried to rsync all the data to it. I saved most of the data I needed but I lost some of my old backups. The thing is, I didn’t really know what was inside them; there were some directories with names like “/Backups/OLD/foobar/backup_older/random_crap”. I guess it was crap after all. I hadn’t needed anything from inside those directories for at least the last couple of years.
Two weeks ago I returned from a trip to Athens. I checked my mail, where I get reports from ossec on the various servers I manage. One of these mails reported that a RAID5 array with 6x200GB disks was degraded due to a hard disk failure. Yes, it was a Western Digital, again. Model number: WD2000JB, manufacture date: 26 AUG 2004. I had another 200GB drive at home where I keep my backups. Since I couldn’t afford the risk of not having a spare disk for my home backups, I bought a Seagate ST3500320AS. Since the new disk was 500GB, I copied all my data from the “spare” 200GB disk to it and also made a full backup of my boot disk, which is 120GB. I then replaced the faulty 200GB disk on the server with the “spare” 200GB drive I had at home.
On Thursday I came back from a one-week trip, this time to my hometown. All was fine until Friday noon. Then I tried to open a text file in my home dir (which is a separate partition on my boot disk) where I keep some random notes, and the machine started crawling. I couldn’t open the file. I tried to copy the file to another disk without success. I only got some beautiful I/O errors on the terminal and DMA errors in syslog. Guess what! The disk was a Western Digital 1200JB with manufacture date: 14 DEC 2001. Under different circumstances I would cry at my bad luck…but the only thing I could do was laugh. I couldn’t stop laughing about this mess. I placed the 500GB Seagate in an external USB case and started to rsync the root dir on top of the copy I had made two weeks earlier. A couple of files couldn’t be read from the boot disk but they were already in the “backup”, so I saved everything. Since I had no spare disk left at home I went out and bought another hard disk. I couldn’t find any 250 or 320GB Seagate drives so I bought another 500GB Seagate ST3500320AS. What was funny was that the salesman at the local store tried to convince me to buy a Western Digital 320GB, without success of course. I wonder why…
I placed the new 500GB disk in my box, booted iloog, partitioned the disk and rsync-ed my data from the “old” 500GB disk to the new one.
YES, I am using smartctl/smartd on all of my boxes, even at home. smartctl was not showing ANY errors at all before the first DMA errors appeared in syslog. I regularly test all my disks with smartctl’s tests: short, long and conveyance (where it’s supported).
The first disk is completely unusable right now. I tried partitioning and formatting it but it moans painfully whenever it is accessed. It currently shows more than 100 S.M.A.R.T. errors. It’s dead.
The second one has about 4-5 S.M.A.R.T. errors logged. It doesn’t make any strange noises when operating but I haven’t extensively tested it yet. It surely cannot be trusted…
The third disk has bad sectors and about 20 S.M.A.R.T. errors. Most of them were “created” during the bad-block check, and every time a bad area is accessed more errors are added to the log. During operation it makes an annoying sound, like metal parts scratching against each other.
The funny thing is what smartctl reports for all disks, even for the first one:
SMART overall-health self-assessment test result: PASSED
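That mismatch between a PASSED verdict and a growing error log is easy to check for automatically. What follows is only a minimal sketch, assuming the text of `smartctl -H -l error` for an ATA disk has already been captured into a string. The function name `summarize_smart` and the sample report are made up for illustration; the two patterns match the overall-health line quoted above and the “ATA Error Count” line that smartctl prints when the error log is not empty.

```python
import re


def summarize_smart(report: str) -> dict:
    """Extract the health verdict and logged error count from smartctl text output."""
    # Verdict line, e.g. "SMART overall-health self-assessment test result: PASSED"
    health = re.search(r"overall-health self-assessment test result:\s*(\S+)", report)
    # Error log header, e.g. "ATA Error Count: 104 (device log contains ...)"
    errors = re.search(r"ATA Error Count:\s*(\d+)", report)

    verdict = health.group(1) if health else None
    count = int(errors.group(1)) if errors else 0  # no such line: treat as zero errors

    # Flag the case described in this post: PASSED, yet errors are logged.
    return {
        "health": verdict,
        "error_count": count,
        "suspicious": verdict == "PASSED" and count > 0,
    }


# Hypothetical sample text, not real output from any of the three disks.
sample = """\
SMART overall-health self-assessment test result: PASSED

SMART Error Log Version: 1
ATA Error Count: 104 (device log contains only the most recent five errors)
"""
print(summarize_smart(sample))
```

A check like this treats the error log, not the overall verdict, as the signal worth alerting on.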
I am well aware that all the disks were past their guarantee (3 years), that’s why I was keeping backups (of important stuff) on separate disks, but I don’t think I’ll be buying any Western Digital drives in the near future…I need some time to get over this month of crashes…
Any other Western Digital haters out there?
Filed by kargig at 18:02 under General,Linux
9 Comments | 6,272 views
Well, these disks weren’t only past their guarantee but also beyond their economic life expectancy (which is 5 yrs). I work with lots of hard disks and I must say that both Seagate and Samsung have far more problems than WD.
Have had nothing but problems in the past with WD. I was recently building a new box and was reading up on drives, and at the time everyone seemed to be saying WD was ahead of Seagate. There were also a lot of people getting DOA Seagates. I said, against my best judgment, I’ll give WD another try. Six months after I got my drive it died on me. Luckily SMART caught it for me and I was able to back up the drive to a newly bought Seagate. I will never try WD again at this point.
I fortunately have to cope with far fewer disks than you, but I have had disks die on me, made by all manufacturers. I have accepted the fact that sooner or later all disks will die, so I mostly buy those with the longest guarantee (Seagate – 5 years, or the RAID Edition WDs) so that if the failure happens, I at least have a better chance of getting a replacement.
I was particularly delighted earlier this month when I received a 300GB Seagate for the malfunctioning 200GB model I sent them 🙂
I have dead disks of all vendors and all kinds (IDE, ATA, various SCSI, …). Regardless of the vendor that you choose to hate, you should read this musing from Rik Farrow and the accompanying paper.
Thanks for your comments.
@aniruddha: I am aware of the normal “life expectancy” of disks, I just found it odd that 3 WD drives that I own which had different manufacture dates, crashed within 1 month. I have heard many many complaints from people having Seagate and Maxtor disks in the past.
@gregf: I think that having a DOA disk is much much better than having it crash after 6 months of use…
@stavrosg: I have made peace with myself by acknowledging the fact that disks died, die, and will die when you least expect it. That’s why I always keep spare disks at home, not connected to any boxes, on which I regularly place backups. That’s what actually saved me during this month.
@adamo: I have read the paper in the past. That’s why when people tell me that they keep their files on a RAID5 for backup I try to explain to them why this is a bad move. It gets even worse if you consider that disks produced at the same time, with not so different serial numbers, are prone to crash at about the same time. So that failure percentage (20%) gets even bigger…When I build RAID5 arrays I tend to buy the same disks from different shops so there’s a “slightly” smaller chance of buying disks produced on the same date, and as a result a “slightly” smaller chance of 2-3 disks dying at the same time. Another great paper to read on disk reliability was published by Google Labs: Failure Trends in a Large Disk Drive Population.
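To put a rough number on the RAID5 worry: once one disk is gone, the array only survives the rebuild if every remaining disk holds out. Here is a back-of-the-envelope sketch, assuming each surviving disk fails independently with the same probability `p` during the rebuild window. Both `p` and the function name are made up for illustration, not measured values, and the 6-disk figure is simply the size of the array from the post. Disks from the same batch break the independence assumption, so the real figure would be higher.

```python
def second_failure_probability(disks: int, p: float) -> float:
    """Chance that at least one surviving disk fails while a degraded RAID5 rebuilds.

    Assumes independent failures with the same per-disk probability p,
    which is the optimistic case for same-batch disks.
    """
    survivors = disks - 1  # one disk has already failed
    return 1.0 - (1.0 - p) ** survivors


# Hypothetical per-disk probabilities for a 6-disk array.
for p in (0.01, 0.05, 0.10):
    print(f"p={p:.2f}: {second_failure_probability(6, p):.3f}")
```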
I beg to differ. I have been using WD disks almost exclusively for decades… all the way back to the early 1980’s, involving dozens of drives. Sure, I had an occasional Seagate or Maxtor, but in all that time I have had only two actual hard disk failures. Maybe there was some weakness in the 1200JD’s. Maybe it was just random bad luck, since the manufacture dates were all in different years. Keep in mind: there are reasons why disks fail that are EX-trinsic. I once had two failures that were due to a bad PSU. Once that was replaced I had no more problems. I now have 6 each of WD’s 1TB and 2TB green drives, along with a few smaller ones, and I’m happy with them.
WD is cheap … mine is broken again
I have used everything from HDD Regenerator to trying to build my own program to fix it, and repairing it physically (as a last resort), and whatever I try, it still doesn’t work. I searched for a solution for about a month with no results. It’s like it has a mind of its own and they built it to fail.
It’s the Windows ME of Hard Drives
I hate Western Digital……..it is number one on my HATE LIST.
I have a Seagate and 2 WD HDDs. The Seagate has worked for 7 years without a complaint, but both WD HDDs are not even 1 year old and one WD went for replacement twice and the other once.
I loove Seagate and I hate WD
I HATE WESTERN DIGITAL
I HATE WESTERN DIGITAL
I HATE WESTERN DIGITAL
The WD HDD I sent for replacement twice is 250GB, and you can imagine the data loss I incurred.
I HATE WESTERN DIGITAL