Hit by AMD lockup?

Posted: Wed Jan 28, 2004 8:10 am
by Zyzzyx
Looks like my server was, last night. It's an AMD 1.4 Palomino. Can't be having that lock up and hang on me. It's my web server and mail server.

So, instead of taking it out of production completely, I've just removed the SSE flag for now. Need to double check, but I think it was the forced SSE causing the AMD lockups.

Durnit, and it was my best producer too.

Posted: Wed Jan 28, 2004 8:23 am
by mas92264
Yeah, the Palominos aren't supposed to lock up when using sse. I only have one folding (for a couple of months now) with sse and I don't think it ever has locked up. Some of my Tbreds and Bartons lock up once or twice a week, others rarely.

Weird deal. AMD is supposed to be working on this specific issue now. No recent news, tho.

M

Posted: Wed Jan 28, 2004 9:01 am
by Zyzzyx
This is really about the third or fourth time it's locked up. This is only the first time I've noticed that FAH started on a new WU, though. When I went to bed it had 14 hours left. Rebooted it this morning, and it downloaded a new WU.

I'll double check, but I'm durn sure that it's a Palomino core.

Posted: Wed Jan 28, 2004 9:38 am
by mas92264
Did you lose a partially completed wu? I never have lost a wu due to a lock up.

Palomino is model 6.

M
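
For anyone wanting to check the core on a Linux box, here is a rough sketch that reads `/proc/cpuinfo`-style text and tests for "model 6" plus the `sse` flag. The helper names and the sample text are made up for illustration, and the "cpu family 6" check is an assumption not stated in this thread.

```python
def parse_cpuinfo(text):
    """Return the fields of the first processor block as a dict of strings."""
    info = {}
    for line in text.splitlines():
        if not line.strip():
            if info:
                break  # a blank line ends the first processor block
            continue
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def is_palomino(info):
    # "model 6" comes from the thread; vendor and family 6 are assumptions.
    return (
        info.get("vendor_id") == "AuthenticAMD"
        and info.get("cpu family") == "6"
        and info.get("model") == "6"
    )


def has_sse(info):
    # Flags are space-separated; split so "sse" does not match "sse2".
    return "sse" in info.get("flags", "").split()


# Hypothetical sample, not taken from any poster's machine.
SAMPLE = """processor : 0
vendor_id : AuthenticAMD
cpu family : 6
model : 6
model name : AMD Athlon(tm) XP
flags : fpu vme de pse tsc msr mmx fxsr sse syscall mmxext 3dnowext 3dnow
"""

if __name__ == "__main__":
    info = parse_cpuinfo(SAMPLE)
    print("Palomino:", is_palomino(info), "SSE:", has_sse(info))
```

On a real machine you would pass the contents of `/proc/cpuinfo` instead of `SAMPLE`.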

Posted: Wed Jan 28, 2004 10:00 am
by Zyzzyx
Yeah, looks like I did.

The fun part is I could have other problems instead. :( Dunno for sure.

Posted: Wed Jan 28, 2004 10:29 am
by mas92264
If it was me, I would look somewhere other than sse. Probably a good idea to leave it off for now, just to see what the deal is.

Ain't troubleshooting a blast? One of the office file servers was locking up nearly every day, and over a period of a few weeks I replaced every part in the box. Never did figure out exactly what the problem was, although the cat 5 wall socket had some crossed wires :oops: .

Posted: Wed Jan 28, 2004 7:31 pm
by ColdFlame
My Palomino locked up on a particular WU consistently, so I don't know who came up with the notion that Palominos don't lock up. It would also seem that they reused the same SSE implementation in all CPUs since Palomino, so I'd assume they all should lock up the same way.

Now, when I had the lock up, I'd just remove that particular WU and make sure I get another one from the assignment server. No problem ever since.

Additionally, I hope you know (I know you do :) ) that SSE boosts your PPW by something like x2.

Posted: Wed Jan 28, 2004 7:34 pm
by haysdb
My Linux server locked up two or three times during a period of a couple of days, so I removed -forcesse and it hasn't locked up since. Whether this was because of SSE directly, or a side-effect (perhaps thermal related), I don't know, but one thing is certain: when the server goes down, that's A BAD THING.


[I THOUGHT my server had a T'bred in it, so last night I took it apart to put in an 1800+ Palomino, but it turned out the cpu it had in it already WAS a Palomino. I replaced the cpu anyway since, if I have to turn SSE off, it might as well be on my slowest cpu. I also discovered two other things: 1) the "grill" surrounding the heatpipes was somewhat "caked" with dust. 2) I had failed to remove the plastic coating from the copper shim surrounding the cpu die. These two factors certainly weren't helping anything.]

David

Posted: Wed Jan 28, 2004 10:17 pm
by Zyzzyx
My 1700 TBred just locked up too. It seems to do so about once/month. Since it's mostly just my media server, I don't really mind too much.