Help with unknown memory instability problem


dshao1
*Lifetime Patron*
Posts: 22
Joined: Fri Feb 18, 2005 5:44 pm
Location: Beijing

Help with unknown memory instability problem

Post by dshao1 » Tue May 20, 2008 5:58 pm

Hello All,

I have a system based on some new and old components:

Asus K8N-DL motherboard
2 x Opteron 285 (4 cores) running at stock 2.6 GHz, undervolted to 1.2 volts
4 sticks of Kingston 1GB DDR-400 ECC registered memory
3 x 2.5 inch Hitachi 7K200 SATA drives
AMD 3870 video card
Enermax Modu82+ 625W PSU
Antec P182 case

The system has an instability problem that I'm having a really difficult time working out.

The system can simultaneously run 4 instances of Prime95 v25.6 and rthdribl together for over 24 hours without any errors. It can also run memtest for over 24 hours as well. All the temps and voltages seem fine (CPUs at 55C under load, with 25C ambient)

But every time I try to run the memory-specific benchmarks in PCMark04, PCMark05, or SANDRA, the system crashes and reboots. Running the PCMark04 and PCMark05 overall benchmark scores is not a problem. I am using Windows XP SP2. I know Windows XP can't use all 4GB of RAM, but it's what I had available. Could this be part of the problem?

Any idea where I should look to find the root cause of the memory instability? I haven't crashed yet when running applications or games, but am concerned that something is not quite right and may cause problems at the worst possible moment.

Any insight would be greatly appreciated. Thanks in advance!

Best Regards, Dan

NeilBlanchard
Moderator
Posts: 7681
Joined: Mon Dec 09, 2002 7:11 pm
Location: Maynard, MA, Eaarth
Contact:

Post by NeilBlanchard » Tue May 20, 2008 6:50 pm

Hello,

I would try bumping up the CPU voltage a smidge -- the memory controller is on there, remember.

Plissken
Friend of SPCR
Posts: 235
Joined: Tue Dec 19, 2006 6:22 pm
Location: Seattle

Re: Help with unknown memory instability problem

Post by Plissken » Tue May 20, 2008 7:01 pm

dshao1 wrote:But every time I try to run the memory specific benchmarks in PCMark04, PCMark05, or SANDRA - the system crashes and reboots. Running PCMark04 and PCMark05 overall benchmark scores is not a problem.
Wouldn't the overall benchmark also run the memory benchmarks? If the memory-specific benchmark includes extra tests, research those to find the problem. Since you are stable with Memtest and Prime95, your memory is probably fine. I'm guessing it's a software problem (like PCMark is confused with 4 cores, or something). You could change to default CPU voltage, make sure RAM has enough juice, and try the tests again. If it were me, I wouldn't worry.

dshao1
*Lifetime Patron*
Posts: 22
Joined: Fri Feb 18, 2005 5:44 pm
Location: Beijing

Steps taken so far, as related to the suggestions made

Post by dshao1 » Tue May 20, 2008 8:25 pm

Hello Neil and Plissken, thanks for replying so quickly!

Neil, the same problem occurs when I don't undervolt and leave it at its default of 1.4 volts. I've also run with the default CPU voltage and even increased the memory voltage from its default of 2.6V to 2.7V, but the same problem still persists.

Plissken, PCMark04 and 05 don't run all the memory tests when coming up with benchmark scores. I go into the test options and select all tests within the "Memory Test Suite". The test that it fails on is not always the same (Most of the time it fails at "Memory Read - 8KB", other times it goes farther into testing and fails at "Memory Write - 16KB")

I'm hoping it is a software problem with PCMark having an incompatibility with my system. But when I get crashes running SANDRA's memory bandwidth testing as well, I tend to look at it first as a problem in my setup. (Prime95 V25.3 used to crash with a Windows "Application Error" related to a memory location being unreadable, but by updating to Prime95 V25.6 and changing nothing else - things run smoothly).

Hopefully I'll never run into a problem besides benchmarking.

Best Regards, Dan

Plissken
Friend of SPCR
Posts: 235
Joined: Tue Dec 19, 2006 6:22 pm
Location: Seattle

Post by Plissken » Tue May 20, 2008 8:53 pm

Have you tried the benchmarks with 1 stick of RAM?
I took a look at the manual for the K8N-DL. There are a lot of memory-related BIOS settings you can play with.
You are right - with Sandra in the mix it points more toward system settings or a hardware glitch.

dshao1
*Lifetime Patron*
Posts: 22
Joined: Fri Feb 18, 2005 5:44 pm
Location: Beijing

Thanks Plissken, I'll try 1 stick when I get home

Post by dshao1 » Tue May 20, 2008 11:06 pm

Thanks Plissken, I'll try 1 stick when I get home.

I believe your approach is correct. I have to admit I didn't want to do it because everything is already installed in the case, with 2 Ninja HSFs, making a couple of the RAM sticks difficult to get to. But one has to do what one has to do.

I'll post the results afterwards. Thanks for taking the time to look through the K8N-DL manual (I understand it is not that common of a MB) and for your suggestion.

BR, Dan

Arvo
Posts: 294
Joined: Sat Jun 10, 2006 1:30 pm
Location: Estonia, EU :)
Contact:

Post by Arvo » Wed May 21, 2008 12:13 pm

I'm almost sure that you can't solve your problem.

I cannot give links, but about a year ago my friend had similar problems; after exhaustive research on the internet and on the PC-Mark forums it was clear that this is a PC-Mark specific problem with nForce4 chipsets. There may be a similar problem with other benchmarks too - they tend to program the memory controller in a way that crashes an nForce4 based system while performing some kind of long memory block transfers.

All this doesn't create any problems for everyday PC usage.

dshao1
*Lifetime Patron*
Posts: 22
Joined: Fri Feb 18, 2005 5:44 pm
Location: Beijing

Post by dshao1 » Wed May 21, 2008 5:25 pm

Arvo - thanks for the info. I tried googling to see if others had this type of problem but didn't find any. I'm unhappy to know that this problem exists for others as well.

I also think this may be a MB problem, since I have an almost identical system that runs everything fine (everything in this other system is exactly the same, except it uses 2 Opty 275 CPUs instead of 2 Opty 285 CPUs)

I figured out at least part of the problem, though I don't know which solution will be more satisfying.

I started testing with 1 stick of RAM at a time and all 8 sticks run fine when they are plugged in by themselves (I've been using 4 sticks of 1GB RAM, but also have 4 sticks of 512MB RAM on hand).

After testing all the various configurations I could think of (8 sticks of RAM, 3 channels, and 6 slots), I discovered that I get these instabilities when I use 2 sticks of RAM in one channel.

The best option I have in order to run all testing/benchmarks successfully is to use 3 sticks of 1GB RAM, one in each channel. I still get NUMA, but now have 2GB for CPU1 and 1GB RAM for CPU2. I don't know how or if this configuration will affect actual performance at this time (I think my MB's 3 channels of memory are a bit strange). My other option is to go back to my original configuration and have 2GB for each CPU, using 2 channels, but I don't know if that really matters since Windows XP 32-bit can only recognize 3.25GB RAM anyway.
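
As a rough sketch of the arithmetic behind these two options (an illustration only): the 3.25GB ceiling is simply the figure quoted above for this machine under 32-bit XP and will differ on other hardware, and the name `summarize` and the per-CPU lists are made up for the example.

```python
# Rough sketch: how much installed RAM 32-bit Windows XP would expose.
# ASSUMPTION: the 3.25 GB ceiling is the figure quoted in the post for this
# machine; the real ceiling depends on how much address space devices reserve.
XP32_VISIBLE_CEILING_GB = 3.25


def summarize(gb_per_node):
    """Return (installed, visible, hidden) in GB for a list of per-CPU totals."""
    installed = sum(gb_per_node)
    visible = min(installed, XP32_VISIBLE_CEILING_GB)
    return installed, visible, installed - visible


if __name__ == "__main__":
    # Option 1: three 1 GB sticks, 2 GB on CPU1 and 1 GB on CPU2.
    # Option 2: four 1 GB sticks, 2 GB on each CPU.
    for name, layout in [("3 x 1GB", [2, 1]), ("4 x 1GB", [2, 2])]:
        installed, visible, hidden = summarize(layout)
        print(f"{name}: installed {installed} GB, "
              f"visible {visible} GB, hidden {hidden} GB")
```

Under that assumption the fourth stick adds only about a quarter of a gigabyte of usable memory, so the choice mostly comes down to whether an even 2GB per CPU matters.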

If any of the more knowledgeable forum members are familiar with dual CPU systems and can provide comments, it would be appreciated. If not, at least I have solutions and can narrow my options down to 2.

Thanks everybody for the help!

BR, Dan

widowmaker
Posts: 239
Joined: Sat Mar 29, 2008 7:05 pm
Location: Toronto Ontario

Post by widowmaker » Wed May 21, 2008 6:30 pm

I believe XP only allows a maximum of 1GB of memory per service, so your benchmarks may be doing something out of the ordinary to utilize more than 1GB. I'm not sure how the benchmarks work, but I do know that you can run a boot-time memory test to test all your RAM. There are several tools out there you can use, but I haven't tested enough to say which is best. Just a thought.

dshao1
*Lifetime Patron*
Posts: 22
Joined: Fri Feb 18, 2005 5:44 pm
Location: Beijing

Post by dshao1 » Wed May 21, 2008 8:10 pm

Hi Widowmaker,

I started out using Memtest and it ran without any errors for over 24 hours when I had the original 4 x 1GB configuration. Through the web, I got the impression that Memtest was the "standard" that most people used for boot up memory testing.

I'll look into some of the others. Thanks for the suggestion.

BR, Dan

dshao1
*Lifetime Patron*
Posts: 22
Joined: Fri Feb 18, 2005 5:44 pm
Location: Beijing

Post by dshao1 » Thu May 22, 2008 9:26 pm

I cleared the CMOS on both computers and now they behave consistently: PCMark still crashes, while SANDRA now runs without a hitch when 4 x 1GB sticks of RAM are installed.

Now that the only issue is back to PCMark, I'll take it as a SW incompatibility, as Arvo and Plissken also guessed it may be. I'm going to stay with the 4GB RAM configuration. I've also added the /3GB switch to boot.ini, which I learned about through Tzupy's message in the Ultimate Silent PC thread (also listed under the Off Topic forum category).
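
For reference, what the /3GB switch changes can be sketched as below. This is only an illustration of the per-process virtual address split on 32-bit Windows (2GB user / 2GB kernel by default, 3GB user / 1GB kernel with the switch, and only for programs built as large-address-aware); the function name is made up for the example, and it says nothing about how much physical RAM XP can see.

```python
# Sketch of the 32-bit Windows per-process virtual address split.
# ASSUMPTION: user_kernel_split is a hypothetical helper for illustration;
# the larger user space only applies to large-address-aware programs.
GB = 1024 ** 3
ADDRESS_SPACE = 2 ** 32  # 4 GB of virtual addresses per 32-bit process


def user_kernel_split(three_gb_switch):
    """Return (user_bytes, kernel_bytes) for a 32-bit process."""
    user = 3 * GB if three_gb_switch else 2 * GB
    return user, ADDRESS_SPACE - user


if __name__ == "__main__":
    for enabled in (False, True):
        user, kernel = user_kernel_split(enabled)
        print(f"/3GB={enabled}: user {user // GB} GB, kernel {kernel // GB} GB")
```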

The only thing I don't understand is why the system with 2 x Opty 285s has a bandwidth score of 9945 MB/s and the system with 2 x Opty 275s has a bandwidth score of 6436 MB/s. I would not expect this much of a difference from a slower set of CPUs (2.2 GHz vs. 2.6 GHz). Everything else in the 2 systems is identical. NUMA is also working on both systems.
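
A quick check of those numbers, using only the figures quoted above, shows why the gap looks too large to be explained by clock speed alone:

```python
# Compare the SANDRA bandwidth gap with the CPU clock gap (figures from the post).
bw_285, bw_275 = 9945, 6436      # MB/s, as reported above
clk_285, clk_275 = 2.6, 2.2      # GHz, stock clocks

bandwidth_ratio = bw_285 / bw_275
clock_ratio = clk_285 / clk_275

print(f"bandwidth ratio: {bandwidth_ratio:.2f}")
print(f"clock ratio:     {clock_ratio:.2f}")
```

The bandwidth score differs by roughly 1.55x while the clocks differ by only about 1.18x, so something other than CPU speed is probably contributing.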

But that is a question for another day, I'm not going to bother with it now as long as both systems are running smoothly.

Thanks again for the help provided. BR, Dan

sjoukew
Posts: 401
Joined: Mon Nov 27, 2006 6:51 am
Location: The Netherlands (NL)
Contact:

Post by sjoukew » Sat May 24, 2008 3:01 am

Maybe the problem is in the NUMA part of your system. Good drivers for your chipset, CPU, etc. may help to solve the problem. Try different drivers, update your software, etc.
Maybe those benchmarks can't handle NUMA-aware systems...
Can your operating system handle NUMA correctly?
Try another one, such as Linux from a live CD, and see if the tests run fine in Linux.
I don't know the answers, but I have these questions, so I am posting them in the hope that it helps solve your terrible problem.

dshao1
*Lifetime Patron*
Posts: 22
Joined: Fri Feb 18, 2005 5:44 pm
Location: Beijing

I have ubuntu and can try to run some benchmarks using that

Post by dshao1 » Sat May 24, 2008 4:02 am

Hi sjoukew, thanks for the suggestions. I am using 32-bit XP SP2 and SANDRA says NUMA is working properly, 2 nodes. I didn't even think that there might be other drivers for NUMA under XP. I'll do some searching.

I have Ubuntu 8.04 and can install it on both systems, then do some benchmarking. I'm a relative newbie when it comes to Linux, so I'll have to find out what type of memory benchmarking and testing programs there are for Linux.
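
Until a proper tool turns up, a very crude cross-check that runs the same way on both operating systems could look something like the sketch below. It only times a large in-memory copy from Python, so it is not comparable to SANDRA or PCMark scores; the function name, buffer size, and repeat count are arbitrary choices for illustration.

```python
# Crude memory-copy bandwidth check; runs anywhere Python 3 does.
# ASSUMPTION: this is an illustrative sketch, not a substitute for a
# dedicated benchmark; buffer size and repeat count are arbitrary.
import time


def copy_bandwidth_mb_s(size_mb=64, repeats=5):
    """Time copying a buffer of size_mb megabytes; return the best MB/s seen."""
    src = bytearray(size_mb * 1024 * 1024)
    best = 0.0
    for _ in range(repeats):
        start = time.perf_counter()
        dst = bytes(src)  # one full read of src plus one full write
        elapsed = time.perf_counter() - start
        assert len(dst) == len(src)
        best = max(best, size_mb / elapsed)
    return best


if __name__ == "__main__":
    print(f"best copy rate: {copy_bandwidth_mb_s():.0f} MB/s")
```

Running the same script on both machines would at least show whether the gap between them is similar outside of SANDRA.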

It's not an urgent priority anymore as I believe both systems will run reliably for now while I focus on other things (work, earthquake relief support, etc.).

But I'll be back to modifying, tweaking, and adjusting these systems later on and will be happy to post results when I get them. I can't really rest until I know everything runs as well as it can. Thanks again for your help and support!

BR, Dan
