A DGromacs optimization - run another instance of FAH

A forum just for SPCR's folding team... by request.

Moderators: NeilBlanchard, Ralf Hutter, sthayashi, Lawrence Lee

haysdb
Patron of SPCR
Posts: 2425
Joined: Fri Aug 29, 2003 11:09 pm
Location: Earth

A DGromacs optimization - run another instance of FAH

Post by haysdb » Wed Apr 07, 2004 11:02 pm

May 30 Edit: This technique is essentially "obsolete" since the DGromacs core has been "fixed".


Disclaimer: This is an "advanced method," targeted to "competitive folders". :D

The DGromacs (double-precision Gromacs) work units are not using 100% cpu. It varies, but at worst they use only about 50% cpu, and at best maybe 85%. When you see that the Idle Process has accumulated 26 bloody HOURS of cpu time, as it has on my desktop machine, and 64 hours on my Intel "blade", both of which fold 24/7, you know it's time to do something.

What I have done on the three PCs I have easy access to is add an ADDITIONAL FAH directory.
Remember to change the Machine ID on the extra instance by running the client with -config or -configonly. Machine ID is under "Advanced Options".
On my HyperThreaded machines, this means a THIRD directory; on the non-HT machine, a second. I do not run this extra instance as a service, just from a shortcut on the desktop. I add the -oneunit flag along with -local, -advmethods, etc., so that when the WU ends, the client does not automatically pick up another one. That way, if the "permanent" instances happen to pick up a regular Gromacs or, God forbid, a Tinker, the "extra" instance won't run beyond the completion of its current WU. Running on wouldn't be horrible, since my machines are plenty fast enough to return WUs before the deadlines even with multiple instances of FAH running. But I also feel it's important to return work as quickly as possible, and so long as FAH is using 100% cpu, there is no "points advantage" in running more than one (or two, in the case of an HT cpu) instances of FAH.
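As a rough sketch of the shortcut target described above, this Python snippet assembles the command line for the extra instance. The executable name and flags are the ones mentioned in this post; the folder `C:\FAH3` and the function name are hypothetical, chosen only for illustration.

```python
import subprocess

def extra_client_command(fah_dir, configure=False):
    """Build the argument list for the extra, one-shot FAH instance."""
    # FAH4Console.exe and the flags are the ones named in the post;
    # fah_dir is a hypothetical folder used for illustration.
    exe = fah_dir.rstrip("\\/") + "\\FAH4Console.exe"
    args = [exe, "-oneunit", "-local", "-advmethods"]
    if configure:
        # First run only: set a unique Machine ID, then drop this flag.
        args.append("-configonly")
    return args

cmd = extra_client_command(r"C:\FAH3")
# Join the arguments the way a shortcut's Target box would show them.
print(subprocess.list2cmdline(cmd))
```

Passing `configure=True` gives the one-time configuration run; the default gives the command for everyday use.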

It's when FAH is NOT using 100% cpu that I run the "extra" client, and even then, only after checking the Performance tab in the Task Manager to verify that a significant amount of cpu is going to the Idle Process.

How much difference does this make?

On my non-HT machine, LogStats said I was getting 386 PPW with the single DGromacs running. With a second DGromacs, 770. Virtually a 100% increase.

On one of my HT machines, two instances were producing 909 PPW, which ain't shabby, but the cpu was not being used anywhere near 100%, so I started a third instance. With 3, LogStats says 1024, roughly a 13% increase.
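For anyone who wants to run the same comparison on their own LogStats numbers, here is a minimal sketch of the arithmetic. The function name is made up for illustration; the PPW figures are the ones quoted above.

```python
def percent_gain(before_ppw, after_ppw):
    """Percentage increase in points per week (PPW)."""
    return (after_ppw - before_ppw) / before_ppw * 100.0

# Figures quoted from LogStats in this post.
non_ht = percent_gain(386, 770)   # one client -> two clients
ht = percent_gain(909, 1024)      # two clients -> three clients
print(f"non-HT: {non_ht:.1f}%  HT: {ht:.1f}%")
```

The two cases work out to about 99.5% and 12.7% respectively.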

Caveats:
  • This technique can result in your "performance fraction" dropping below .95, which could mean you don't get some of the largest, most complex work units, like the 160 pointers. If it would cause your performance fraction to drop below .8, I do not recommend it at all.
  • This can sort of backfire. My second HT machine picked up a Tinker for the 3rd instance. :(
  • This only makes sense on machines that are convenient to get to. This is a "manual" technique, so it requires a "hands-on" approach to folding.
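The two performance fraction thresholds in the caveats above can be written down as a small check. This is only a sketch: the function name and the wording of the results are made up, and the value passed in is assumed to be the performance fraction you expect with the extra client running.

```python
def extra_client_advice(performance_fraction):
    """Map an expected performance fraction onto the caveats above."""
    if performance_fraction < 0.80:
        # Below .8 the post does not recommend the technique at all.
        return "not recommended"
    if performance_fraction < 0.95:
        # Below .95 you may stop receiving the largest work units.
        return "may miss the largest work units"
    return "ok"

# Hypothetical values, for illustration only.
for pf in (0.97, 0.90, 0.75):
    print(pf, "->", extra_client_advice(pf))
```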
Answers to anticipated questions:
  • How do I know I am working on a DGromacs?

    The DGromacs core is FahCore_79.exe, which you can see in the Task Manager in the 'Processes' tab.

    Better yet, install Electron Microscope III. It shows the DGromacs cores as a dark blue color, so you can see at a glance which core each of your clients is running.
  • How do I create a copy of FAH?
    • Right-click-and-drag an existing FAH folder and select 'Copy Here' from the option menu which appears when you release the right mouse button.
    • Rename the folder to whatever you want.
    • Descend into the copy and delete the
      • work folder
      • queue.dat
      • both log files
    • Right-click FAH4Console.exe and choose Create Shortcut
    • Right-click the shortcut, choose 'Properties' and add -oneunit and -configonly (or -config) in the 'Target:' box.
    • Drag the shortcut to your desktop for easy access
    • Double-click the shortcut to configure FAH.
    • Change the Machine ID to one not being used by another client on that PC.
    • Re-edit the shortcut to remove the -configonly flag
    • Double-click the shortcut to start the extra instance of FAH.
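The copy-and-clean-up part of the steps above could also be scripted. The following is a minimal sketch, not a tested installer: the function name and the folder paths in the usage note are hypothetical, and the two log file names are an assumption about what the client writes, so check them against your own folder first.

```python
import shutil
from pathlib import Path

# Assumption: these are the client's two log file names; adjust if yours differ.
ASSUMED_LOG_FILES = ("FAHlog.txt", "FAHlog-Prev.txt")

def clone_fah_folder(source, dest, log_files=ASSUMED_LOG_FILES):
    """Copy an FAH folder and strip the per-instance state from the copy."""
    source, dest = Path(source), Path(dest)
    shutil.copytree(source, dest)  # raises an error if dest already exists
    # Remove the work folder so the copy starts without a work unit.
    shutil.rmtree(dest / "work", ignore_errors=True)
    # Remove the queue and the log files, if present.
    for name in ("queue.dat", *log_files):
        target = dest / name
        if target.exists():
            target.unlink()
    return dest
```

For example, `clone_fah_folder(r"C:\FAH1", r"C:\FAH3")` would produce the extra folder. Creating the shortcut and changing the Machine ID still have to be done by hand as described in the steps.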
David
Last edited by haysdb on Sun May 30, 2004 10:52 am, edited 1 time in total.

ColdFlame
Posts: 451
Joined: Wed May 21, 2003 9:39 pm
Location: Somewhere in Time

Post by ColdFlame » Wed Apr 07, 2004 11:50 pm

All of my AMD machines are getting non-SSE2 WUs. My only p4-like machine (Celeron 2.7) gets these SSE2 thingies exclusively.

What is interesting is that I followed David's advice and created another folder with a copy of F@H and got a non-SSE2 WU. I wonder if they are looking at machineId or queue.dat history or what when scheduling?

AZBrandon
Friend of SPCR
Posts: 867
Joined: Sun Mar 21, 2004 5:47 pm
Location: Phoenix, AZ

Post by AZBrandon » Wed Apr 07, 2004 11:58 pm

What is the Performance Fraction, and how do you find out what it is for your own systems?

haysdb
Patron of SPCR
Posts: 2425
Joined: Fri Aug 29, 2003 11:09 pm
Location: Earth

Post by haysdb » Thu Apr 08, 2004 12:03 am

ColdFlame wrote: All of my AMD machines are getting non-SSE2 WUs. My only p4-like machine (Celeron 2.7) gets these SSE2 thingies exclusively.

All of my P4/Windows clients are working DGromacs, with only the very occasional exception.
All of my Athlon/Linux clients are NOT getting ANY DGromacs.

Makes sense, since the 32-bit Athlons do not have SSE2, right? But it could also be a Windows/Linux thing. :shrug:

ColdFlame wrote: What is interesting is that I followed David's advice and created another folder with a copy of F@H and got a non-SSE2 WU. I wonder if they are looking at machineId or queue.dat history or what when scheduling?

Sometimes you will, it's just "the luck of the draw." Like I said, I got a Tinker on one. But mostly on systems where I have had exclusively DGromacs, I have picked up another DGromacs with the additional client. I am all but certain they are NOT looking at the Machine ID or in the queue.dat for what other WUs are currently in progress when scheduling work. They may be looking at the performance fraction, depending on the project/protein.

David

haysdb
Patron of SPCR
Posts: 2425
Joined: Fri Aug 29, 2003 11:09 pm
Location: Earth

Post by haysdb » Thu Apr 08, 2004 12:07 am

AZBrandon wrote: What is the Performance Fraction, and how do you find out what it is for your own systems?

If you have specified -verbosity 9, it will be shown in the log file at the end of each WU. Each client has its own performance fraction.

If you are running EMIII (Electron Microscope III), you can click just to the left of the little computer icon and it will pop open a window showing the contents of the queue.dat file, which contains the performance fraction. Just one more reason to love EMIII. :)

David

mas92264
Patron of SPCR
Posts: 659
Joined: Fri Sep 26, 2003 5:26 pm
Location: Palm Springs, CA, USA

Post by mas92264 » Fri Apr 09, 2004 7:10 am

Re: Running multiple instances while crunching DoubleGromacs.

This works! On my non-HT P4 2.4, processing a DGromacs and using maybe 50% of the cpu, I added another instance of folding, got a core78 Gromacs, and my production jumped from about 375 PPW (for one 23-point core79) to about 550 PPW (adding an 18-point core78). :)

M

dasman
*Lifetime Patron*
Posts: 485
Joined: Thu Jan 08, 2004 10:59 am
Location: Erie, PA USA

Post by dasman » Fri Apr 09, 2004 11:14 am

I don't want to worry about the manual stuff with my office machines -- how much of a PPW hit would I take running an extra instance on them when I don't have any DGromacs?

IOW, can I just run two instances and forget about it? Or would I be killing myself on the regular gro side?

Dave

Zyzzyx
Friend of SPCR
Posts: 1063
Joined: Mon Dec 23, 2002 12:55 pm
Location: Richland, WA

Post by Zyzzyx » Fri Apr 09, 2004 12:22 pm

I'm just letting the office machines chug through them. I don't want the hassle of installing a second client. On some slower systems it might cut your performance fraction back enough that you don't get the WUs you want. I figure that the DGromacs stuff will likely pass, just as Tinker influxes have in the past.

mas92264
Patron of SPCR
Posts: 659
Joined: Fri Sep 26, 2003 5:26 pm
Location: Palm Springs, CA, USA

Post by mas92264 » Fri Apr 09, 2004 1:26 pm

Yep, you would have to monitor what's going on and, if you had two regular Gromacs at a time, perhaps stop one, let the other finish, and then turn the stopped one back on. However, my Intel folders have been pretty much only getting DGromacs since they started back whenever.

Not much to it, really, but it does require some extra attention. I think David tried running two core78s at a time on either an AMD or a non-HT processor and said that the performance penalty was about 2-4%? Doesn't seem life threatening. :)

M

haysdb
Patron of SPCR
Posts: 2425
Joined: Fri Aug 29, 2003 11:09 pm
Location: Earth

Post by haysdb » Fri Apr 09, 2004 4:43 pm

I agree, this is probably not something you want to be doing unless you are totally committed to squeezing every possible point out of your systems. But if you ARE, AND you are getting a STEADY DIET of DGromacs, then it's a technique that can yield a not insignificant number of additional points.
  • Don't bother doing this with Athlon XP processors, since these processors do not have the SSE2 instructions needed by the DGromacs work units. (Is anyone getting any DGromacs on Athlon processors?)
  • Don't do this if it would cause your PF (Performance Fraction) to fall below 0.8.
  • Be aware that a new FahCore_79.exe is coming (I don't know when) that will probably "obsolete" this technique.
  • The advantage could also diminish with future work units, even if a new core is not released.
  • Running two clients, each running a regular Gromacs, results in a minimal PPW penalty (a few percent), but since the WUs will take twice as long to complete, it will affect your performance fraction, so this is only recommended with reasonably fast processors. But so long as you are getting at least one DGromacs, running two clients will not result in any PPW penalty, and could even be beneficial (as MAS showed above).
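To get a feel for the "twice as long" point in the last bullet, here is a back-of-the-envelope sketch. It assumes the cpu is split evenly between clients and that the few-percent PPW penalty shows up as a matching slowdown; the function names, the 3% default, and the hours in the example are all hypothetical.

```python
def shared_completion_hours(solo_hours, clients=2, penalty=0.03):
    """Estimate wall-clock hours per WU when clients share one cpu evenly."""
    # Assumption: an even split, plus a small slowdown from running together.
    return solo_hours * clients * (1.0 + penalty)

def meets_deadline(solo_hours, deadline_hours, clients=2, penalty=0.03):
    """True if the estimated shared completion time fits within the deadline."""
    return shared_completion_hours(solo_hours, clients, penalty) <= deadline_hours

# Hypothetical WU: 40 hours when run alone, 120-hour deadline.
print(shared_completion_hours(40))   # estimated hours with two clients
print(meets_deadline(40, 120))
```

This only checks the deadline. How much the longer completion time lowers the performance fraction itself depends on how the servers compute it, which this sketch does not model.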
David

lightning
Posts: 26
Joined: Wed May 05, 2004 8:18 am

Post by lightning » Thu May 27, 2004 8:02 am

Why was this limited to only DGromacs??

dasman
*Lifetime Patron*
Posts: 485
Joined: Thu Jan 08, 2004 10:59 am
Location: Erie, PA USA

Post by dasman » Thu May 27, 2004 8:12 am

It had to do with how the Dgros utilized SSE2 when they were first introduced -- they were very inefficient. It's since been fixed and this thread doesn't really apply anymore.

Note that this thread is completely different than running 2 instances on an HT box - that you always want to do.

Dave

lightning
Posts: 26
Joined: Wed May 05, 2004 8:18 am

Post by lightning » Thu May 27, 2004 8:36 pm

Cheers dasman, thanks for the reply!
