OT: HDD having problems spinning up

Silencing hard drives, optical drives and other storage devices

Moderators: NeilBlanchard, Ralf Hutter, sthayashi, Lawrence Lee

Ralf Hutter
SPCR Reviewer
Posts: 8636
Joined: Sat Nov 23, 2002 6:33 am
Location: Sunny SoCal

Post by Ralf Hutter » Fri Sep 17, 2004 1:08 pm

1) BACK UP ANY AND ALL DATA THAT YOU CONSIDER IMPORTANT!!!

2) Download and run Seagate's HDD diagnostic utility from a floppy disk. This will tell you if your drive is getting ready to die. IMHO, things don't sound good.

3) PSU voltages look OK. They're within the ±5% ATX spec. If the disk diagnostics show a healthy drive, try swapping out the PSU.
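
For anyone who wants to check their own readings against that ±5% figure, here is a minimal Python sketch. The nominal rail values are the standard +3.3V, +5V and +12V; the sample readings are made up for illustration and are not the numbers from this thread.

```python
# Nominal ATX rail voltages. The 5% tolerance is the figure quoted in the post.
NOMINAL_RAILS = {"+3.3V": 3.3, "+5V": 5.0, "+12V": 12.0}
TOLERANCE = 0.05


def rail_limits(nominal, tolerance=TOLERANCE):
    """Return the (low, high) acceptable voltage for one rail."""
    return nominal * (1 - tolerance), nominal * (1 + tolerance)


def check_readings(readings):
    """Map each rail name to True if its reading is inside the tolerance band."""
    results = {}
    for rail, measured in readings.items():
        low, high = rail_limits(NOMINAL_RAILS[rail])
        results[rail] = low <= measured <= high
    return results


if __name__ == "__main__":
    # Hypothetical readings, e.g. copied from a BIOS hardware monitor screen.
    sample = {"+3.3V": 3.28, "+5V": 4.92, "+12V": 11.70}
    for rail, ok in check_readings(sample).items():
        low, high = rail_limits(NOMINAL_RAILS[rail])
        status = "OK" if ok else "OUT OF SPEC"
        print(f"{rail}: {sample[rail]:.2f} V (allowed {low:.2f} to {high:.2f} V) {status}")
```

A reading inside the band only says the rail was fine at the moment it was sampled; it does not rule out a brief dip during drive spin-up.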

lenny
Patron of SPCR
Posts: 1642
Joined: Wed May 28, 2003 10:50 am
Location: Somewhere out there

Post by lenny » Fri Sep 17, 2004 2:18 pm

Ralf Hutter wrote: 1) BACK UP ANY AND ALL DATA THAT YOU CONSIDER IMPORTANT!!!
Cannot agree more.

Incidentally, don't hard drives use 12V for the motor?

luminous
Patron of SPCR
Posts: 717
Joined: Sat Oct 04, 2003 6:31 am
Location: UK

Post by luminous » Fri Sep 17, 2004 2:25 pm

If the fault is consistent, in that the drive only fails to power up when all the HDDs are connected, then it does sound like a PSU issue.

It's probably the drive from what you say, or a dodgy molex connection (or whatever the SATA equivalent is). You say that you swapped the data cables; I would also swap the power cables.

I have recently had HDD problems where one of my drives seemed to strangely power down, and then instantly try to power back up again. This was even though there were no power saving options selected. Turns out that it was the ATA controller on the mobo.

I think it may be a case of trial and error (plus the suggestions from the others) until you find out exactly what's going on :(

luminous
Patron of SPCR
Posts: 717
Joined: Sat Oct 04, 2003 6:31 am
Location: UK

Post by luminous » Sun Sep 19, 2004 3:58 am

You may not like me for saying this, but losing the RAID array will help diagnosis. Each drive can then be attached by itself, or with partners to the system and can then be individually analysed.

The fact that you have NTFS structure problems, but no physical problems means that for one reason or another the data that is being written to the drive is corrupt in some way. It does sound like the physical surface of the drives is OK.

I would re-format all the drives, set them up without using the RAID array, then check each disk by itself. That way you will be sure that SeaTools, or whatever else you use, is looking at your drives correctly. There should be no errors of any description at this point.

What I think may be happening is that the RAID controller may be corrupting the data flow to the drives. Eliminating this may make all the problems go away.

If there are errors, re-format and re-test the structure of the drives again. If you keep seeing errors, you will have to do this step many, many times. Find a tool that is fast and easy to use: Partition Magic, Acronis Disk Director, etc.

What you are looking for is consistency. If you are getting errors, but they are not always on the same drive, then it is definitely NOT a HDD issue. If you are getting NTFS file structure errors, but they occur on different drives depending on how many drives you have connected, which is master/slave, etc., then it is almost certainly the controller on the mobo.

If you only ever get errors when all the drives are connected, then you may have a PSU issue.

You will need to be as logical as possible when you do this, and record the results of every step. You will need to record which drives were in the machine, which cables you were using, which was the master/slave, etc.
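
The record-keeping and the consistency rules above can be sketched in a few lines of Python. This is only an illustration: the `TestRun` record, the `diagnose` function and the drive names are invented here, and the "always the same drive" branch is the implied converse of the rules in this post rather than something stated outright.

```python
from collections import namedtuple

# One record per test run: which drives were connected, and which showed errors.
TestRun = namedtuple("TestRun", ["connected", "errored"])


def diagnose(runs):
    """Apply the consistency rules from the post to a list of TestRun records."""
    failing = [r for r in runs if r.errored]
    if not failing:
        return "no errors recorded"

    all_drives = frozenset().union(*(r.connected for r in runs))
    partial_runs = [r for r in runs if r.connected != all_drives]

    # Errors appear only when every drive is plugged in: points at the PSU.
    if partial_runs and all(r.connected == all_drives for r in failing):
        return "errors only with every drive connected: possible PSU issue"

    errored = frozenset().union(*(r.errored for r in failing))
    # Assumption: errors that stay on one drive implicate that drive or its cables.
    if len(errored) == 1:
        return f"errors always on {next(iter(errored))}: suspect that drive or its cables"

    # Errors that move between drives are not a HDD issue.
    return "errors move between drives: suspect the controller, not the drives"


if __name__ == "__main__":
    # Hypothetical log with two drives called A and B.
    log = [
        TestRun(frozenset({"A"}), frozenset()),
        TestRun(frozenset({"B"}), frozenset()),
        TestRun(frozenset({"A", "B"}), frozenset({"A"})),
    ]
    print(diagnose(log))
```

In practice you would add columns for the cables, the power run and the master/slave setting, since those are the other variables being swapped.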

There are many possible causes for HDD corruption - I have just been through the same process as you may have to over the last few weeks. Here was my list of possible causes.

Dodgy drive
Dodgy drive firmware
Power cable to the drive
Data cable to the drive
Power cable run from the PSU - ie there will be several different runs of cables, with say 4 molex connectors on each. Try a different run, as there could be a wire fault near the beginning of it
Faulty RAID controller chip
Dodgy RAID firmware
Dodgy RAID driver for windows
SATA controller chip
SATA firmware - by switching from RAID to native SATA you will eliminate some of the above.
Faulty BIOS chip
Faulty BIOS firmware - my BIOS was finding drives of different sizes each time I booted up my machine
Try flashing the BIOS with the latest version (back up the old one)
Make sure you remove the CMOS battery and reset the mobo to defaults as well
PSU issues
Faulty memory - use memtest to verify

That is a nasty list of possibilities. It will take you a long time to work your way through them all. With the components you have, the only things you will not be able to eliminate will be a faulty PSU or a hardware issue with the mobo.

When all of the above has been done, if you are still getting errors (like I did), then you will have to swap out the PSU. If this fixes it then bin the old PSU. If it does not, it's the mobo.

Hopefully someone else will look through the above and make suggestions / corrections before you embark on the above.

I think I identified that I had both a faulty memory module and that my ATA controller was flaky. Both have been RMA'ed, so in time I will know if I was correct.

EDIT - I would do the memory testing sooner rather than later. Get memtest, select the relevant option for a continuous extended test and run it overnight.
Last edited by luminous on Sun Sep 19, 2004 4:07 am, edited 1 time in total.

luminous
Patron of SPCR
Posts: 717
Joined: Sat Oct 04, 2003 6:31 am
Location: UK

Post by luminous » Sun Sep 19, 2004 4:02 am

On a positive note, at least you do not have an Athlon 64. They have an integrated memory controller, so if a memory fault came up, it could be the memory modules, CPU, mobo or PSU. Not many ppl have a spare CPU to swap out :)

luminous
Patron of SPCR
Posts: 717
Joined: Sat Oct 04, 2003 6:31 am
Location: UK

Post by luminous » Sun Sep 19, 2004 5:22 am

Your problems have an uncanny similarity to the ones I experienced. My drive would power down randomly (I have a thread in the off topic area). Whenever this happened the machine crashed.

I was certain it was a Windows issue, and that is something else you need to look into. I used Acronis off the recovery disk, therefore not running under Windows. Sometimes I could restore an image, other times it would just crash. I, too, got very little data corruption but lots of file structure errors. Sometimes my machine would also tell me that I did not have a HDD detected.

If you are a Linux user (which I am not), boot your machine into Linux and use it a lot. I expect that the same sort of problem will happen. This, of course, rules out the OS. You can also achieve the same by booting Acronis off the recovery CD - this uses Linux. Do countless pointless tasks with the drives, just shifting lots of data around. I bet it hangs at certain random intervals (like mine).
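
One way to automate the "shifting lots of data around" part is a small write-and-verify loop. The Python sketch below is an assumption about how such a test could look, not a tool anyone in this thread used; the function name, file names and sizes are invented. It writes random files, reads them back and compares SHA-256 checksums.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def write_and_verify(target_dir, file_count=4, size_bytes=1024 * 1024):
    """Write random files, read them back, return names whose checksum changed."""
    target = Path(target_dir)
    expected = {}
    for i in range(file_count):
        data = os.urandom(size_bytes)
        path = target / f"stress_{i:04d}.bin"
        with open(path, "wb") as fh:
            fh.write(data)
            fh.flush()
            os.fsync(fh.fileno())  # ask the OS to push the data out to the device
        expected[path] = hashlib.sha256(data).hexdigest()

    mismatches = []
    for path, digest in expected.items():
        actual = hashlib.sha256(path.read_bytes()).hexdigest()
        if actual != digest:
            mismatches.append(path.name)
        path.unlink()  # remove the scratch file once it has been checked
    return mismatches


if __name__ == "__main__":
    # Uses a temporary directory here; point it at a folder on the suspect drive instead.
    with tempfile.TemporaryDirectory() as scratch:
        bad = write_and_verify(scratch)
        print("mismatched files:", bad or "none")
```

Bear in mind that the read-back may be served from the operating system's cache rather than the disk, so a clean result on small files proves little. Larger totals and repeated runs make it more likely the data really travels through the controller and the drive.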

I eliminated just about everything I could think of. I used both of my HDDs in my spare machine for ages - no issues. I also used a spare drive in the new machine, and it corrupted.

I think you may be having a mobo issue here; it could also be PSU related. I think it's more likely to be your mobo. I'm not saying it's a hardware mobo fault, I just reckon it's in there somewhere. The first thing is to get rid of that RAID complexity; that has got to help the fault finding process. A lot of RAID controllers have been known to cause odd problems like you are describing.

I wish you luck with the route ahead, as I really understand just what a pain in the arse it is. Hopefully we will both have healthy machines in the near future.

luminous
Patron of SPCR
Posts: 717
Joined: Sat Oct 04, 2003 6:31 am
Location: UK

Post by luminous » Sun Sep 19, 2004 6:32 am

My data corruption was evident in several ways. The most notable being that data written to one partition turned up partly on another :D Partition Magic would have nothing to do with the drives as it said everything was corrupt.

Acronis Disk Director said the drive was corrupt and something to the effect of calling a recovery specialist (or was that PM8). Chkdsk also corrected about 1000 files that had been incorrectly indexed. The problems were very random.

Maybe we both are suffering from dodgy PSUs. Mine is a 9 month old Chieftec 420W, but it's been modded with a very low airflow 80mm fan. It runs much hotter now than it ever did.

The thing that confuses me a little is that when the machine first powers on the load going through it is not that high. The CPU is idle, the GPU is idle, and they hog power. The machine never crashed using benchmarks. It normally crashed out when it was just sitting there doing nothing.

It could be PSU related, but the problems did not go away when all the non-essential components were unplugged taking load off the PSU. I did not have a PSU to swap out with. I have an old machine, but I must have one working machine. I have a golden rule, you never fiddle with both machines at the same time, especially when you know one is buggered. I looked for my old generic 250W PSU, but I'd thrown it in the bin, best place for it really :)

Hopefully I guessed right and it is the mobo, otherwise I'll be charged for my RMA and will then have to buy another PSU.

luminous
Patron of SPCR
Posts: 717
Joined: Sat Oct 04, 2003 6:31 am
Location: UK

Post by luminous » Sun Sep 19, 2004 7:14 am

Go for that new PSU, you know you want to. Get something really nice, and it may just solve your problems in one fell swoop. Those Antec units are just too damn noisy imo.

Multimeters are OK, but they are nothing special. They will just deliver another bunch of confusing numbers. They can only measure what is happening at that instant, so if you are not measuring just the right thing when the error occurs then you will not see what you need to. Also, you will not be sure what measurement to take. Just because the HDDs are connected to the +5V and +12V lines does not mean that is where the problem has to be. If there is an issue with the 3.3V line that causes the mobo to "hiccup" you will not be able to measure it.

Only sure way to test a PSU is to swap it for another :(

Putz
Posts: 368
Joined: Thu Aug 21, 2003 1:25 am
Location: Ottawa, Canada

Post by Putz » Sun Sep 19, 2004 9:43 am

Just a few days ago, I also had the problem of my hard drive not starting up reliably (or even power cycling while the computer was on). Also, high power draw (games, stress test, etc.) would crash the computer. It was a bad Antec power supply.

luminous
Patron of SPCR
Posts: 717
Joined: Sat Oct 04, 2003 6:31 am
Location: UK

Post by luminous » Sun Sep 19, 2004 10:02 am

I think part of the problem we are having is that our machines do not crash under high power draw situations......which is just plain weird. Glad you found your fault and fixed it....I take it you have a nicer PSU now ?? :P :D

luminous
Patron of SPCR
Posts: 717
Joined: Sat Oct 04, 2003 6:31 am
Location: UK

Post by luminous » Tue Sep 21, 2004 9:36 am

Would it help me to say that I have a smile on my face right now?? :D :D :D If you do manage to get your hands on said stethoscope let us all know what a drive motor sounds like :)

I am glad that you are getting consistent problems, that makes it so much easier to sort out. My suggestion would be once you have transferred your RAID onto the new disk, unplug the problem drive. Use your machine for several days noting if you have any problems. Hopefully you won't, and that will be it, problem solved.

luminous
Patron of SPCR
Posts: 717
Joined: Sat Oct 04, 2003 6:31 am
Location: UK

Post by luminous » Thu Sep 23, 2004 9:15 am

As you know, I may have to get a new PSU; if I do, this will be it.

http://www.kustompcs.co.uk/acatalog/info_1765.html

It will be a lot quieter than your old Antec (unless you have done a fan swap)

luminous
Patron of SPCR
Posts: 717
Joined: Sat Oct 04, 2003 6:31 am
Location: UK

Post by luminous » Thu Sep 23, 2004 10:09 am

Some cheeky bugger must have bought them out !! I'm sure there was stock earlier.... :)

There are a few other places that sell them over here, all is not lost for me yet :P

burcakb
Posts: 1443
Joined: Tue Mar 09, 2004 9:05 am
Location: Turkey

Post by burcakb » Thu Sep 23, 2004 2:36 pm

Just to let you know:

I recently put together an Athlon64 system. I first tried it out with my old 'cuda IV. I was using the stock 3700AMB PSU at the time. No problems. Then I took out the 'cuda IV, plugged in two SATA Seagate 120s. Lo and behold, the drives did not power up on power-on. A touch to the reset brought them up.

I was suspicious about the Abit mobo SATA controller, but then just on an instinct, I pulled out the Antec PSU, plugged in an ancient Fortron. Drives power up nicely, no problems.

Hardware involved:
Seagate 120GB SATA drives with 8MB cache (two of them)
Antec SmartPower 350 PSU, unmodded, brand new (4 days use)
Fortron 300W PSU (FSP300-ATV), 3 year old PSU with modded fan.

On spec, the Antec has 16A on the 12V line and 35A on the 5V line. The Fortron has 15A on the 12V line and 30A on the 5V line. Seagate specs say the drive uses only the 12V line.
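
To put those label ratings side by side as watts (volts times amps), here is a small Python sketch. The amp figures are the ones quoted above; the dictionary layout and function name are just for illustration.

```python
# Rail ratings as quoted in the post: {rail voltage: rated amps}.
RATINGS = {
    "Antec SmartPower 350": {12.0: 16, 5.0: 35},
    "Fortron FSP300-ATV": {12.0: 15, 5.0: 30},
}


def rated_watts(psu_name):
    """Return {rail voltage: rated watts} for one PSU, using P = V * I."""
    return {volts: volts * amps for volts, amps in RATINGS[psu_name].items()}


if __name__ == "__main__":
    for name in RATINGS:
        for volts, watts in rated_watts(name).items():
            print(f"{name}: {volts:g}V rail rated at {watts:g} W")
```

On paper the Antec is rated slightly higher on both rails, so the label figures alone do not explain why the drives spin up on the Fortron and not on the Antec. Label ratings describe sustained capacity and say nothing about how a unit behaves during the brief spin-up surge.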

I guess it's more an issue of Seagate SATA drives not liking Antec PSUs.

Putz
Posts: 368
Joined: Thu Aug 21, 2003 1:25 am
Location: Ottawa, Canada

Post by Putz » Thu Sep 23, 2004 7:20 pm

luminous wrote: I think part of the problem we are having is that our machines do not crash under high power draw situations......which is just plain weird. Glad you found your fault and fixed it....I take it you have a nicer PSU now ?? :P :D
Hoping it gets here tomorrow; thanks for asking :?

At the moment, I have an old Sparkle 300W (with "Noise Killer" -- what a joke) hooked up, and it's working a'ight.

Project
Posts: 152
Joined: Mon Dec 01, 2003 9:24 pm

Post by Project » Thu Sep 23, 2004 8:16 pm

You can rule out the file system error from SeaTools. I get that on nearly all my Seagate HDs, so it has become normal. I even RMA'd all of them one time. So unless I'm doing something universally wrong with my HDs (doubt it), you can ignore the file system error portion of your problem =)

luminous
Patron of SPCR
Posts: 717
Joined: Sat Oct 04, 2003 6:31 am
Location: UK

Post by luminous » Fri Sep 24, 2004 1:32 am

Yeah, I'm waiting on my RMA'ed mobo - going to take 6-8 weeks more. That's got to be an amazingly slow turnaround, given that they have already had 2.

It's just tempting to go out and buy a new one and be done with it.
