ForceSSE causing me problems

A forum just for SPCR's folding team... by request.

Moderators: NeilBlanchard, Ralf Hutter, sthayashi, Lawrence Lee

wgragg
Posts: 246
Joined: Thu Nov 13, 2003 7:34 am

ForceSSE causing me problems

Post by wgragg » Fri Mar 26, 2004 6:15 am

I had been running forceasm on my Barton 2500 (overclocked to 2.2 GHz), but when I picked up a p859 that was showing 1 hour per frame, I changed to forcesse. Well, that WU terminated, so I slowed down my OC a bit and it picked up a smaller protein and all was fine until it reached the very end. It finished processing at about 4 minutes per frame, but after it finished and was trying to get the WU ready to send, it terminated with a file I/O error. Now this is getting downright annoying! Why in the world wait till after the processing is done to have a problem?

I backed off the overclock even more (only running 100 MHz over stock) and will see what happens. Do these Bartons have problems with forcesse? Oh, I am using the new fah78core, btw.

dukla2000
*Lifetime Patron*
Posts: 1465
Joined: Sun Mar 09, 2003 12:27 pm
Location: Reading.England.EU

Post by dukla2000 » Fri Mar 26, 2004 8:33 am

Not in general. My XP2500 is currently running at spec and has been OK for about 3 weeks with forceSSE no problems.

It also ran at 189*11 for about 1 week with no problems with forceSSE.

Except I found it did not o/c that easily - no way could I get mine stable 11*200 at 1.85VCore (didn't try higher). David Hays has a couple of XP2500 running 200*11 at stock VCore IIRC, presumably forceSSE and is clocking up a few points.
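For anyone decoding the "189*11" and "200*11" shorthand: it is FSB clock times multiplier. A minimal sketch of that arithmetic (the function name is just for illustration):

```python
# Core clock = front-side bus clock (MHz) x CPU multiplier.
def core_clock_mhz(fsb_mhz: float, multiplier: float) -> float:
    return fsb_mhz * multiplier

# The two settings mentioned in this post.
settings = {"189*11": (189, 11), "200*11": (200, 11)}
for label, (fsb, mult) in settings.items():
    print(f"{label}: {core_clock_mhz(fsb, mult):.0f} MHz")
```

So 189*11 is roughly 2.08 GHz, and 200*11 is the 2.2 GHz mentioned at the top of the thread.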

aston
Posts: 139
Joined: Fri Jan 16, 2004 2:47 pm
Location: Victoria, BC

Post by aston » Fri Mar 26, 2004 10:01 am

Did you try the new core that fixes Athlon lockup problems? I don't know if it'll help, but it can't hurt. OTOH, it doesn't sound like your computer is hanging..

wgragg
Posts: 246
Joined: Thu Nov 13, 2003 7:34 am

Post by wgragg » Fri Mar 26, 2004 10:57 am

I loaded that core right after it first came out.

I went home at lunch and checked again. The second wu it worked on did exactly the same thing. It failed after all the processing was done and it was trying to write the data out.

Something I noticed when I rebooted was that when EMIII came up, it tried to start another instance of fah, but based off of my wife's machine. I had set EMIII to check her stats, but I had no idea it would do this. I had to kill that instance, but I kind of wonder if by doing this, it messed up my permissions for the work folder or something.

I think what I may do is delete the work folder and start over fresh....after I delete EMIII and make sure there are no stray registry entries. I doubt it is my overclock as that has been rock stable at 200x10.5 for a long time. If anyone has any other ideas, I am open to any suggestions.

Thanks,
Wendell

Douglas Bailey
Friend of SPCR
Posts: 96
Joined: Fri Dec 05, 2003 11:03 am
Location: Seattle, Washington, United States

Post by Douglas Bailey » Fri Mar 26, 2004 11:12 am

aston wrote:Did you try the new core that fixes Athlon lockup problems?
Getting ready to bring 2 AMD folders on line next week, and I don't know anything about a special core. If you know where it is could I impose and ask you to point me in a good direction to learn more and get it? Thanks for the tip.

Never had an AMD machine in my life so this is all new stuff to me. Funny what this folding bug will do to you. :)

haysdb
Patron of SPCR
Posts: 2425
Joined: Fri Aug 29, 2003 11:09 pm
Location: Earth

Post by haysdb » Fri Mar 26, 2004 11:23 am

Wendell,

This is quite distressing, but I suggest that the problem is not Barton related, nor, most likely, does it have anything to do with your overclock. The reason I say this is because I am running 4 Bartons, all running overclocked (3@2.2GHz), and all with -forceSSE, and I have had only one EARLY_UNIT_END on any of these in several weeks. During that same period I also had one EARLY_UNIT_END on one of my non-overclocked T'breds.

OK, so what other things could cause these EARLY_UNIT_ENDs? Unfortunately, the only thing that jumps to mind is the work getting corrupted by the second instance of F@H. Do change EMIII to NOT start F@H. Running two instances against the same work folder is a "bad thing". The new client is supposed to recognize that an instance is already running and immediately shut down, but it doesn't sound like it did in your case.
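To illustrate the general idea of "recognize that an instance is already running and shut down", here is a minimal lock-file sketch. This is NOT how the F@H client actually does it (the thread doesn't say); the lock file name and both function names are hypothetical.

```python
import os
import tempfile

def acquire_lock(work_dir: str, name: str = "client.lock") -> bool:
    """Return True if we created the lock file, False if one already exists."""
    path = os.path.join(work_dir, name)
    try:
        # O_EXCL makes creation fail if the file is already there,
        # so only one process can win.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as f:
        f.write(str(os.getpid()))  # record the owner for debugging
    return True

def release_lock(work_dir: str, name: str = "client.lock") -> None:
    os.remove(os.path.join(work_dir, name))

# Demo in a throwaway directory standing in for the work folder.
work_dir = tempfile.mkdtemp()
first = acquire_lock(work_dir)    # first instance gets the lock
second = acquire_lock(work_dir)   # second instance is refused
print(first, second)
release_lock(work_dir)
```

The weak point of a scheme like this is a stale lock left behind after a crash, which may be one way a second instance could slip through or a legitimate one could be refused.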

Of course, it definitely COULD BE the overclock. In fact, normally this would be the first thing I would suggest, but it doesn't make sense in your case, since you reduced the OC and are still getting the errors.

If I think of anything else, I'll be back. In the meantime, it looks like removing -forceSSE is called for, and you might as well bump the frequency back up.

Frustrating.

David

dukla2000
*Lifetime Patron*
Posts: 1465
Joined: Sun Mar 09, 2003 12:27 pm
Location: Reading.England.EU

Post by dukla2000 » Fri Mar 26, 2004 1:50 pm

Douglas Bailey wrote:Getting ready to bring 2 AMD folders on line next week, and I don't know anything about a special core. If you know where it is could I impose and ask you to point me in a good direction to learn more and get it? Thanks for the tip.
Don't panic: if you are setting up new systems then you will (no doubt) install the v4 client and that will pull the v1.56 core. The v1.56 core was new in February and resolved isolated lockups some WU had on some AMD.

I have never had an SSE problem on any of my WU/CPUs (all AMD), and AFAIK the v1.56 resolved all known/reproducible AMD problems.

zuperdee
Posts: 310
Joined: Tue Jan 20, 2004 8:24 pm

Post by zuperdee » Fri Mar 26, 2004 2:32 pm

I am curious--why would anyone want to do --forceSSE in the first place? Isn't 3DNow! supposed to be better?

bcassell
Patron of SPCR
Posts: 70
Joined: Tue Mar 23, 2004 1:01 pm
Location: San Jose, CA, USA

Post by bcassell » Fri Mar 26, 2004 2:39 pm

zuperdee wrote:I am curious--why would anyone want to do --forceSSE in the first place? Isn't 3DNow! supposed to be better?
Look here: http://forums.silentpcreview.com/viewto ... 4&start=60 and check out my experiences. On those big 160 pointers, I got almost a threefold increase in speed by going from 3DNow to SSE =)

Bryan

wgragg
Posts: 246
Joined: Thu Nov 13, 2003 7:34 am

Post by wgragg » Fri Mar 26, 2004 2:42 pm

Yeah, doing it lowered my time per frame from 1 hour 5 minutes to about 25 minutes.
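Putting a number on that, as a quick sketch using only the two frame times quoted above (the helper name is made up):

```python
def minutes(hours: int, mins: int) -> int:
    """Convert an hours/minutes frame time to minutes."""
    return hours * 60 + mins

before = minutes(1, 5)    # 1 hour 5 minutes per frame without -forcesse
after = minutes(0, 25)    # about 25 minutes per frame with -forcesse
speedup = before / after
print(f"{speedup:.1f}x faster")
```

That works out to about 2.6x, in the same ballpark as the "almost threefold" figure Bryan reported.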

zuperdee
Posts: 310
Joined: Tue Jan 20, 2004 8:24 pm

Post by zuperdee » Fri Mar 26, 2004 2:48 pm

Interesting... So on my Thoroughbred core, I should probably use -forceSSE? Man, and all this time, I've been happily chugging away at them with 3DNow!!!! :lol: I wonder how much higher my score would be if I had used -forceSSE all this time. Suffice it to say I'm a bit surprised by this. I thought 3DNow! was supposed to be better than SSE.

So Athlon64 people should NOT use it?

bcassell
Patron of SPCR
Posts: 70
Joined: Tue Mar 23, 2004 1:01 pm
Location: San Jose, CA, USA

Post by bcassell » Fri Mar 26, 2004 2:55 pm

zuperdee wrote:So Athlon64 people should NOT use it?
I don't know how to answer this with a yes or no, because I don't know what you mean by "it" =P. I'll just make this really clear:

Adding the -forcesse option to my Athlon64 nearly tripled the speed at which it was folding.

People seem to have similar experiences with AthlonXP's (though maybe not quite as dramatic).

Bryan

zuperdee
Posts: 310
Joined: Tue Jan 20, 2004 8:24 pm

Post by zuperdee » Fri Mar 26, 2004 3:00 pm

bcassell wrote:
zuperdee wrote:So Athlon64 people should NOT use it?
I don't know how to answer this with a yes or no, because I don't know what you mean by "it" =P. I'll just make this really clear:

Adding the -forcesse option to my Athlon64 nearly tripled the speed at which it was folding.

People seem to have similar experiences with AthlonXP's (though maybe not quite as dramatic).

Bryan
Oops--yeah, by "it," I meant -forceSSE. My guess would be that the difference wouldn't be as great on AthlonXP's, #1 because they are 32-bit, and #2 because they only have SSE implemented, while the Athlon64's have both SSE and SSE2.

bcassell
Patron of SPCR
Posts: 70
Joined: Tue Mar 23, 2004 1:01 pm
Location: San Jose, CA, USA

Post by bcassell » Fri Mar 26, 2004 3:15 pm

zuperdee wrote: Oops--yeah, by "it," I meant -forceSSE. My guess would be that the difference wouldn't be as great on AthlonXP's, #1 because they are 32-bit, and #2 because they only have SSE implemented, while the Athlon64's have both SSE and SSE2.
Not trying to give you a hard time here, but I thought I'd clear up some technical things:

1) Unless you're running XP 64-bit beta or a version of Linux compiled for AMD64, AND you somehow have a version of F@H recompiled for AMD64 (which I'm pretty sure doesn't exist yet) then it makes no difference that the A64 is 64-bit, because it's not being used. Also note that the fact that it's 64-bit would not really help F@H. 64-bit refers to the integer units and memory addressing; F@H is completely floating-point based. Floating point can be done with the x87 floating point instructions, 3DNow!, SSE, etc., all of which are unchanged in the Athlon64. What MIGHT help is that the number of general purpose registers in AMD64 (or IA-32e or whatever the hell Intel called it =p) has been doubled to 16. I don't know enough about the F@H code to know how much that would help though.

2) I think (though I'm not completely sure, somebody want to back me up on this?) that the only thing SSE2 offered over SSE was the ability to work on data in 64 bit chunks, rather than just 32. I believe the data block size was kept at 128 bits. What this means is that with SSE you can process 4 32 bit data chunks at once, while SSE2 allows you to process 2 64 bit chunks at once. While I'm sure this will make a difference with the new double core (see this thread), it shouldn't make a difference with anything else. Again, my knowledge of SSE/SSE2 is pretty vague, somebody correct me if I'm wrong =)
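The "4 x 32 bit vs 2 x 64 bit" point is just how many elements fit in a 128-bit block. A rough sketch of that bookkeeping, using Python's struct module to stand in for a packed register (this does no actual SIMD; the names are for illustration only):

```python
import struct

REGISTER_BITS = 128  # block size described above

def lanes(element_bits: int) -> int:
    """How many elements of the given width fit in one 128-bit block."""
    return REGISTER_BITS // element_bits

# Four single-precision floats or two double-precision floats
# occupy the same 16 bytes.
singles = struct.pack("<4f", 1.0, 2.0, 3.0, 4.0)
doubles = struct.pack("<2d", 1.0, 2.0)

print(lanes(32), len(singles) * 8)  # lanes and total bits for 32-bit floats
print(lanes(64), len(doubles) * 8)  # lanes and total bits for 64-bit doubles
```

Either way the block is 128 bits; the double-precision case just gets half as many values through per operation.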

Bryan

eneuman
Posts: 3
Joined: Fri Mar 26, 2004 11:18 pm
Location: Overland Park, KS

Post by eneuman » Sat Mar 27, 2004 12:10 am

the p859 work units are using FAHcore_79. it is the same as the 78 core, but uses double precision floating point operations. it has been giving everyone a headache. performance results are similar for everyone

flags you should add to increase performance
-advmethods -forceasm -forcesse -verbosity 9

-advmethods will get you gromacs
-forceasm will force assembly optimizations
-forcesse will force sse optimized code
-verbosity 9 gives most detailed log data
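If you launch the client from a script, the flags above can be assembled like this. A sketch only: the executable name is a placeholder (use whatever your client binary is actually called), and nothing is launched here, the command line is just printed.

```python
import shlex

# Hypothetical client path; substitute your own binary name.
CLIENT = "./fah-client"

# Flags exactly as listed above.
FLAGS = ["-advmethods", "-forceasm", "-forcesse", "-verbosity", "9"]

command = [CLIENT, *FLAGS]
print(shlex.join(command))
```

Keeping the flags in a list makes it easy to drop -forcesse again if it turns out to cause trouble on a given machine.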

opterons and athlon 64's use x86-64. meaning they are a hybrid where 64 bit integer and register operations were added to the x86 instruction set. they run faster because they have more efficient architectures. of course a code designed specifically for an architecture will perform better, however, the parts of the architecture that folding uses (floating point) have not been significantly altered from the previous architectures. sse/sse2 are multimedia optimized instructions. since multimedia is mainly fpu, programs that use sse/sse2 instructions get a performance boost.

wgragg
Posts: 246
Joined: Thu Nov 13, 2003 7:34 am

Post by wgragg » Sat Mar 27, 2004 5:49 am

I am crunching a p859 right now and looked in my fah folder. The only 2 cores that I see are the fah65 and 78. Where is the 79 core supposed to be?

mas92264
Patron of SPCR
Posts: 659
Joined: Fri Sep 26, 2003 5:26 pm
Location: Palm Springs, CA, USA

Post by mas92264 » Sat Mar 27, 2004 6:39 am

p859 uses core 78. Core 79 is used for p934 and p935. Those are the only 2 that I've seen that use the new core. Guess there could be others.

M

Equiquay
Posts: 11
Joined: Sun Mar 28, 2004 3:34 pm
Location: Denver, CO

SSE on XP1700+

Post by Equiquay » Sun Mar 28, 2004 3:50 pm

Hey Everyone. This is my first post. Happy to be here, you guys are great! :)

I'm having a rough time getting the SSE optimization to work on an XP1700. I’m working on one of the 160 pointers right now and the XP1700’s taking close to 1:20 per frame. I tried the -forceSSE and -forceasm parameters, which work on other computers here, but this one doesn’t take. When I look at my CPU in SiSoftware Sandra it says that my CPU doesn’t even have SSE. Further research suggested a reinstall of WinXP, but even after that my FAH log still says “Extra 3DNow boost OK” and it still takes 1:20 per frame.

Anyone have suggestions? Also, I read that the -forceSSE and -forceasm parameters can be used simultaneously with AMD but only one or the other can be used with Intel. Is this true?

Keep up the great work!

Travis

haysdb
Patron of SPCR
Posts: 2425
Joined: Fri Aug 29, 2003 11:09 pm
Location: Earth

Post by haysdb » Sun Mar 28, 2004 4:14 pm

Travis, what "core" does your cpu have? Palomino was, AFAIK, the first to support SSE. Thunderbird did not. cpu-z will tell you.

Both -forceSSE and -forceasm can be used with Intel processors, but since they will default to SSE, -forceSSE is ignored (redundant, superfluous).

David

Equiquay
Posts: 11
Joined: Sun Mar 28, 2004 3:34 pm
Location: Denver, CO

Post by Equiquay » Sun Mar 28, 2004 4:36 pm

Hey David,

CPU-Z says it's a Palomino, but it only lists MMX(+) and 3DNow!(+) for instructions. I've never had problems with this chip before, is there some other way to enable SSE instructions? Isn't WinXP supposed to do this?
Travis

haysdb
Patron of SPCR
Posts: 2425
Joined: Fri Aug 29, 2003 11:09 pm
Location: Earth

Post by haysdb » Sun Mar 28, 2004 8:10 pm

Equiquay wrote:Hey David,

CPU-Z says it's a Palomino, but it only lists MMX(+) and 3DNow!(+) for instructions. I've never had problems with this chip before, is there some other way to enable SSE instructions? Isn't WinXP supposed to do this?
Travis
-forceSSE is the only way to tell F@H to use SSE on an AMD processor.

Check your log and make sure F@H recognized all the command line switches. Perhaps you have one misspelled and it's ignoring the others?

Specify '-verbosity 9' to enable the most detailed logging.
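Checking the log by eye works, but here is a small sketch of doing it with a script. Big caveat: the "arguments" line in the sample below is a made-up stand-in, not the client's real log format; only the "Extra 3DNow boost OK" line comes from Travis's post. Point the function at the text of your own log instead.

```python
def flags_seen(log_text: str, expected: list) -> dict:
    """Report, case-insensitively, which expected switches appear in the log."""
    lowered = log_text.lower()
    return {flag: flag.lower() in lowered for flag in expected}

# Hypothetical sample; the first line is NOT a real client log line.
sample_log = (
    "(made-up arguments line) -forceasm -forceSSE -verbosity 9\n"
    "Extra 3DNow boost OK\n"
)

result = flags_seen(sample_log, ["-forceasm", "-forceSSE", "-verbosity"])
still_3dnow = "3dnow boost" in sample_log.lower()
print(result)
print("still reporting 3DNow:", still_3dnow)
```

If every switch shows up but the log still reports the 3DNow boost, the switches are being read and the problem lies elsewhere, which is the situation Travis describes.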

David

Macaholic
Posts: 128
Joined: Sat Mar 06, 2004 4:24 pm
Location: Dark Side of the Moon

Post by Macaholic » Sun Mar 28, 2004 8:24 pm

Is this processor an upgrade from a previous processor that did not have SSE instruction code? Was the Windows XP reinstall a reformat as well? Or just a reinstall over an existing system folder? I'm led down the path to believe that if you upgraded the processor from a non-SSE processor to an SSE processor, the register was set with the initial Windows XP system install that SSE is not available. Thus the question about a fresh reformat and reinstallation. Fold on!

mas92264
Patron of SPCR
Posts: 659
Joined: Fri Sep 26, 2003 5:26 pm
Location: Palm Springs, CA, USA

Post by mas92264 » Sun Mar 28, 2004 8:26 pm

Just ran cpu-z on my Barton and it lists SSE. If yours doesn't list SSE and it doesn't work, well... draw your own conclusion :( (although I thought all Palominos had SSE.)

Sounds like the prescription is a new 2400+ cpu - $68 at newegg. :)

M

Edit: You may want to run "cpuid" also. It should tell you yay or nay if it's on your processor. Get it at the amd site - just search for cpuid. Streaming SIMD Extension.
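For the curious, what tools like cpuid report comes from feature bits the processor returns in the EDX register for CPUID function 1: bit 23 is MMX, bit 25 is SSE, bit 26 is SSE2. A sketch of decoding such a value follows. The example value is made up for illustration, not read from any real chip, and Python can't execute CPUID itself, so you would paste in the raw value a tool shows you.

```python
# Bit positions in EDX for CPUID function 1.
FEATURE_BITS = {"MMX": 23, "SSE": 25, "SSE2": 26}

def decode_features(edx: int) -> dict:
    """Map each feature name to whether its bit is set in the EDX value."""
    return {name: bool((edx >> bit) & 1) for name, bit in FEATURE_BITS.items()}

# Made-up example: MMX and SSE set, SSE2 clear.
example_edx = (1 << 23) | (1 << 25)
features = decode_features(example_edx)
print(features)
```

If the SSE bit is set in the raw value but a utility still doesn't list SSE, that points at the tool or the OS rather than the silicon.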

Equiquay
Posts: 11
Joined: Sun Mar 28, 2004 3:34 pm
Location: Denver, CO

Post by Equiquay » Sun Mar 28, 2004 10:55 pm

Hey Folks,

Thanks for your responses. I've tried everything I could think of and have just come to the conclusion that somehow the SSE on my chip is broken. All Athlon chips that are model 6 and above are supposed to have SSE and mine's a model 6, but it's definitely not working. I found a couple places that suggest the program WCPUID to enable SSE and strangely enough, WCPUID shows SSE on my chip as already enabled, but it still isn't listed as an instruction to CPU-Z. I did try a reformat of the hard drive before the full WinXP install, and the only processor I’ve had on this board is the XP1700. I flashed the BIOS and double checked that the -forceSSE and -forceasm parameters are making their way to the FAH log file. If anyone else has this problem down the line I found some useful information on the Adobe website here and in an old SPCR post here, but for me it's still no go.

It does give me an excuse to get a new one though. :D In the meantime, y'all had better watch out 'cuz in 130 hours this bad boy's going to cash in, and then I’m going to be rollin' in the points.

lol. ;)

mas92264
Patron of SPCR
Posts: 659
Joined: Fri Sep 26, 2003 5:26 pm
Location: Palm Springs, CA, USA

Post by mas92264 » Mon Mar 29, 2004 6:25 am

You may want to email AMD. Be sure to give them all your info, mobo, bios and written data off the top of your chip. Also, there's some pretty smart guys that hang out at the AMD forum.

M
