ForceSSE causing me problems
I had been running forceasm on my Barton 2500 (overclocked to 2.2 GHz), but when I picked up a p859 that was showing 1 hour per frame, I changed to forcesse. Well, that WU terminated, so I slowed down my OC a bit and it picked up a smaller protein and all was fine until it reached the very end. It finished processing at about 4 minutes per frame, but after it finished and was trying to get the WU ready to send, it terminated with a file I/O error. Now this is getting downright annoying! Why in the world wait till after the processing is done to have a problem?
I backed off the overclock even more (only running 100 MHz over stock) and will see what happens. Do these Bartons have problems with forcesse? Oh, I am using the new fah78core, btw.
Not in general. My XP2500 is currently running at spec and has been OK for about 3 weeks with forceSSE no problems.
It also ran at 189*11 for about 1 week with no problems with forceSSE.
Except I found it did not o/c that easily - no way could I get mine stable 11*200 at 1.85VCore (didn't try higher). David Hays has a couple of XP2500 running 200*11 at stock VCore IIRC, presumably forceSSE and is clocking up a few points.
I loaded that core right after it first came out.
I went home at lunch and checked again. The second WU it worked on did exactly the same thing. It failed after all the processing was done and it was trying to write the data out.
Something I noticed when I rebooted was that when EMIII came up, it tried to start another instance of fah, but based off of my wife's machine. I had set EMIII to check her stats, but I had no idea it would do this. I had to kill that instance, but I kind of wonder if by doing this, it messed up my permissions for the work folder or something.
I think what I may do is delete the work folder and start over fresh... after I delete EMIII and make sure there are no stray registry entries. I doubt it is my overclock, as that has been rock stable at 200x10.5 for a long time. If anyone has any other ideas, I am open to any suggestions.
Thanks,
Wendell
aston wrote: Did you try the new core that fixes Athlon lockup problems?
Getting ready to bring 2 AMD folders on line next week, and I don't know anything about a special core. If you know where it is could I impose and ask you to point me in a good direction to learn more and get it? Thanks for the tip.
Never had an AMD machine in my life so this is all new stuff to me. Funny what this folding bug will do to you.
Wendell,
This is quite distressing, but I suggest that the problem is not Barton related, nor, most likely, does it have anything to do with your overclock. The reason I say this is because I am running 4 Bartons, all running overclocked ([email protected]), and all with -forceSSE, and I have had only one EARLY_UNIT_END on any of these in several weeks. During that same period I also had one EARLY_UNIT_END on one of my non-overclocked T'breds.
OK, so what other things could cause these EARLY_UNIT_ENDs? Unfortunately, the only thing that jumps to mind is the work getting corrupted by the second instance of F@H. Do change EMIII to NOT start F@H. Running two instances against the same work folder is a "bad thing". The new client is supposed to recognize that an instance is already running and immediately shut down, but it doesn't sound like it did in your case.
Of course, it definitely COULD BE the overclock. In fact, normally this would be the first thing I would suggest, but it doesn't make sense in your case, since you reduced the OC and are still getting the errors.
If I think of anything else, I'll be back. In the meantime, it looks like removing -forceSSE is called for, and you might as well bump the frequency back up.
Frustrating.
David
Douglas Bailey wrote: Getting ready to bring 2 AMD folders on line next week, and I don't know anything about a special core. If you know where it is could I impose and ask you to point me in a good direction to learn more and get it? Thanks for the tip.
Don't panic: if you are setting up new systems then you will (no doubt) install the v4 client and that will pull the v1.56 core. The v1.56 core was new in February and resolved isolated lockups that some WUs had on some AMD CPUs.
I have never had an SSE problem on any of my WU/CPUs (all AMD), and AFAIK the v1.56 core resolved all known/reproducible AMD problems.
zuperdee wrote: I am curious--why would anyone want to do --forceSSE in the first place? Isn't 3DNow! supposed to be better?
Look here: http://forums.silentpcreview.com/viewto ... 4&start=60 and check out my experiences. On those big 160 pointers, I got almost a threefold increase in speed by going from 3DNow to SSE =)
Bryan
Interesting... So on my Thoroughbred core, I should probably use -forceSSE? Man, and all this time, I've been happily chugging away at them with 3DNow!!!! I wonder how much higher my score would be if I had used -forceSSE all this time. Suffice it to say I'm a bit surprised by this. I thought 3DNow! was supposed to be better than SSE.
So Athlon64 people should NOT use it?
zuperdee wrote: So Athlon64 people should NOT use it?
I don't know how to answer this with a yes or no, because I don't know what you mean by "it" =P. I'll just make this really clear:
Adding the -forcesse option to my Athlon64 nearly tripled the speed at which it was folding.
People seem to have similar experiences with AthlonXP's (though maybe not quite as dramatic).
Bryan
zuperdee wrote: So Athlon64 people should NOT use it?
bcassell wrote: I don't know how to answer this with a yes or no, because I don't know what you mean by "it" =P. I'll just make this really clear: Adding the -forcesse option to my Athlon64 nearly tripled the speed at which it was folding. People seem to have similar experiences with AthlonXP's (though maybe not quite as dramatic).
Oops--yeah, by "it," I meant -forceSSE. My guess would be that the difference wouldn't be as great on AthlonXP's, #1 because they are 32-bit, and #2 because they only have SSE implemented, while the Athlon64's have both SSE and SSE2.
zuperdee wrote: Oops--yeah, by "it," I meant -forceSSE. My guess would be that the difference wouldn't be as great on AthlonXP's, #1 because they are 32-bit, and #2 because they only have SSE implemented, while the Athlon64's have both SSE and SSE2.
Not trying to give you a hard time here, but I thought I'd clear up some technical things:
1) Unless you're running XP 64-bit beta or a version of Linux compiled for AMD64, AND you somehow have a version of F@H recompiled for AMD64 (which I'm pretty sure doesn't exist yet), it makes no difference that the A64 is 64-bit, because that isn't being used. Also note that being 64-bit would not really help F@H. 64-bit refers to the integer units and memory addressing; F@H is completely floating-point based. Floating point can be done with the x87 floating point instructions, 3DNow!, SSE, etc., all of which are unchanged in the Athlon64. What MIGHT help is that the number of general purpose registers in AMD64 (or IA-32e or whatever the hell Intel called it =p) has been doubled to 16. I don't know enough about the F@H code to know how much that would help though.
2) I think (though I'm not completely sure, somebody want to back me up on this?) that the main thing SSE2 offered over SSE for floating-point work was the ability to work on data in 64-bit chunks, rather than just 32. I believe the register size was kept at 128 bits. What this means is that with SSE you can process four 32-bit data chunks at once, while SSE2 allows you to process two 64-bit chunks at once. While I'm sure this will make a difference with the new double core (see this thread), it shouldn't make a difference with anything else. Again, my knowledge of SSE/SSE2 is pretty vague, somebody correct me if I'm wrong =)
Bryan
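To make the register arithmetic in point 2 concrete, here is a small Python sketch. It is not F@H code and does not execute any SSE instructions; it only emulates packing values into 16 bytes, the size of one 128-bit register. The function names `lanes` and `packed_add` are made up for illustration.

```python
import struct

XMM_BITS = 128  # width of one SSE/SSE2 register


def lanes(element_bits):
    """How many elements of the given width fit in one 128-bit register."""
    return XMM_BITS // element_bits


def packed_add(a, b, fmt):
    """Emulate a packed add: pack both operands into 16 bytes, add lane by lane.

    fmt is a struct format string: '<4f' for four single-precision floats
    (the SSE case), '<2d' for two double-precision floats (the SSE2 case).
    """
    raw_a = struct.pack(fmt, *a)
    raw_b = struct.pack(fmt, *b)
    # Both layouts occupy exactly one 128-bit register.
    assert len(raw_a) == len(raw_b) == XMM_BITS // 8
    return [x + y for x, y in zip(struct.unpack(fmt, raw_a),
                                  struct.unpack(fmt, raw_b))]


singles = packed_add([1.0, 2.0, 3.0, 4.0], [0.5, 0.5, 0.5, 0.5], "<4f")
doubles = packed_add([1.0, 2.0], [0.5, 0.5], "<2d")
print(lanes(32), singles)
print(lanes(64), doubles)
```

The point is only the lane count: the same 16 bytes hold four singles or two doubles, so a double-precision core gets at most half as many values per instruction.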
the p859 work units are using FAHcore_79. it is the same as the 78 core, but uses double precision floating point operations. it has been giving everyone a headache. performance results are similar for everyone
flags you should add to increase performance
-advmethods -forceasm -forcesse -verbosity 9
-advmethods will get you gromacs
-forceasm will force assembly optimizations
-forcesse will force sse optimized code
-verbosity 9 gives most detailed log data
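As a rough illustration of how those four flags end up on one command line, here is a Python sketch that assembles the argument list. The executable name `fah_client.exe` is a placeholder I made up, not the real client file name; the flag descriptions are the ones given in the list above.

```python
# Flag descriptions as given in the post above.
FLAGS = {
    "-advmethods": "get Gromacs work units",
    "-forceasm": "force assembly optimizations",
    "-forcesse": "force SSE-optimized code",
    "-verbosity 9": "most detailed log data",
}


def build_command(client_path):
    """Return the argument list for launching the client with all four flags."""
    args = [client_path]
    for flag in FLAGS:
        # "-verbosity 9" is a flag plus its value, so it becomes two arguments.
        args.extend(flag.split())
    return args


# Hypothetical executable name, used only for illustration.
cmd = build_command("fah_client.exe")
print(" ".join(cmd))
```

The same list could be passed to `subprocess.run(cmd)`, or the printed string pasted into a shortcut target; substitute the actual path of your client.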
opterons and athlon 64's use x86-64, meaning they are a hybrid where 64 bit integer and register operations were added to the x86 instruction set. they run faster because they have more efficient architectures. of course code designed specifically for an architecture will perform better, however, the parts of the architecture that folding uses (floating point) have not been significantly altered from the previous architectures. sse/sse2 are multimedia optimized instructions. since multimedia is mainly fpu, programs that use sse/sse2 instructions get a performance boost.
SSE on XP1700+
Hey Everyone. This is my first post. Happy to be here, you guys are great!
I'm having a rough time getting the SSE optimization to work on an XP1700. I’m working on one of the 160 pointers right now and the XP1700’s taking close to 1:20 per frame. I tried the -forceSSE and -forceasm parameters, which work on other computers here, but this one doesn’t take. When I look at my CPU in SiSoftware Sandra it says that my CPU doesn’t even have SSE. Further research suggested a reinstall of WinXP, but even after that my FAH log still says “Extra 3DNow boost OK” and it still takes 1:20 per frame.
Anyone have suggestions? Also, I read that the -forceSSE and -forceasm parameters can be used simultaneously with AMD but only one or the other can be used with Intel. Is this true?
Keep up the great work!
Travis
Travis, what "core" does your cpu have? Palomino was, AFAIK, the first to support SSE. Thunderbird did not. cpu-z will tell you.
Both -forceSSE and -forceasm can be used with Intel processors, but since they will default to SSE, -forceSSE is ignored (redundant, superfluous).
David
Equiquay wrote: Hey David,
CPU-Z says it's a Palomino, but it only lists MMX(+) and 3DNow!(+) for instructions. I've never had problems with this chip before, is there some other way to enable SSE instructions? Isn't WinXP supposed to do this?
Travis
-forceSSE is the only way to tell F@H to use SSE on an AMD processor.
Check your log and make sure F@H recognized all the command line switches. Perhaps you have one misspelled and it's ignoring the others?
Specify '-verbosity 9' to enable the most detailed logging.
David
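The log check suggested above can be sketched in a few lines of Python. This assumes you have already read the client log into a string. The sample text below is made up, apart from the "Extra 3DNow boost OK" line quoted earlier in the thread, and a real log will look different. Both function names are hypothetical.

```python
def find_flags_in_log(log_text, flags):
    """For each flag, report whether it appears anywhere in the log (case-insensitive)."""
    lowered = log_text.lower()
    return {flag: flag.lower() in lowered for flag in flags}


def optimization_lines(log_text):
    """Return the log lines that mention 3DNow or SSE."""
    return [line.strip() for line in log_text.splitlines()
            if "3dnow" in line.lower() or "sse" in line.lower()]


# Made-up sample: the "Arguments:" line is invented for illustration only.
sample = """Arguments: -forceasm -forceSSE -verbosity 9
Extra 3DNow boost OK
"""

found = find_flags_in_log(sample, ["-forceSSE", "-forceasm", "-advmethods"])
print(found)
print(optimization_lines(sample))
```

If a switch you typed comes back as missing, a misspelling on the command line is the first thing to rule out. If the switches are all there but the log still reports a 3DNow boost, the client is not picking up SSE for some other reason.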
Is this processor an upgrade from a previous processor that did not have SSE instruction code? Was the Windows XP reinstall a reformat as well? Or just a reinstall over an existing system folder? I'm led down the path to believe that if you upgraded the processor from a non-SSE processor to an SSE processor, the register was set with the initial Windows XP system install that SSE is not available. Thus the question about a fresh reformat and reinstallation. Fold on!
Just ran cpu-z on my Barton and it lists SSE. If yours doesn't list SSE and it doesn't work, well... draw your own conclusion (although I thought all Palominos had SSE.)
Sounds like the prescription is a new 2400+ cpu - $68 at newegg.
M
Edit: You may want to run "cpuid" also. It should tell you yay or nay if it's on your processor. Get it at the amd site - just search for cpuid. Streaming SIMD Extension.
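For anyone who can boot the machine into Linux, the same yes/no answer is available without a separate tool: on x86 the kernel lists CPU feature bits on the "flags" line of /proc/cpuinfo. The Python sketch below parses that kind of text. It is an alternative to the cpuid utility mentioned above, not a description of how that utility works, and the sample text is invented.

```python
def cpu_flags(cpuinfo_text):
    """Extract the feature flags from /proc/cpuinfo-style text (first 'flags' line)."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_sse(cpuinfo_text):
    """True if the plain 'sse' flag is present (set lookup, so 'sse2' alone does not count)."""
    return "sse" in cpu_flags(cpuinfo_text)


# Invented sample resembling a CPU that reports no SSE.
sample = (
    "model name\t: AMD Athlon(tm) XP 1700+\n"
    "flags\t\t: fpu vme mmx mmxext 3dnow 3dnowext\n"
)
print(has_sse(sample))
```

On a real system, pass in the contents of /proc/cpuinfo, for example `has_sse(open("/proc/cpuinfo").read())`.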
Hey Folks,
Thanks for your responses. I've tried everything I could think of and have just come to the conclusion that somehow the SSE on my chip is broken. All Athlon chips that are model 6 and above are supposed to have SSE and mine's a model 6, but it's definitely not working. I found a couple of places that suggest the program WCPUID to enable SSE and, strangely enough, WCPUID shows SSE on my chip as already enabled, but it still isn't listed as an instruction in CPU-Z. I did try a reformat of the hard drive before the full WinXP install, and the only processor I’ve had on this board is the XP1700. I flashed the BIOS and double checked that the -forceSSE and -forceasm parameters are making their way to the FAH log file. If anyone else has this problem down the line I found some useful information on the Adobe website here and in an old SPCR post here, but for me it's still no go.
It does give me an excuse to get a new one though. In the meantime, y'all had better watch out 'cuz in 130 hours this bad boy's going to cash in, and then I’m going to be rollin' in the points.
lol.