WTF?!?

Posted: Mon Oct 11, 2004 10:15 pm
by Edward Ng
Okay, I got my K8 board in and am getting seriously wonky readings, here.

First of all, the readings are clearly way off.

Second of all, the behavior of the reading is just plain wonky.

I see 37-40C idle. Load is 49C, period. The weirder thing? It fluctuates from 37-40C during idle, and then once I put a load on it, it blasts up to 49C in literally 2-3 seconds and just sticks there for good (until I turn off the loading software, CPUburn).

CnQ is not on, Thermal throttling IS on, because with the stock BIOS on this board, disabling thermal throttling renders a no-POST condition. It's set to throttle down to 50% at most. Fan was at 12Volts during all testing so far.

This doesn't make any sense at all. How can it blast up to 49C in the first few seconds of load and stick there for whatever amount of time I like, be it 5 minutes or an hour, without moving? I stop CPUBurn and the temp drops like a rock within 3 seconds back to 42C or so, and then within another minute is down below 40C again, but it is almost always from 37-39C during idle.

The cooling is done, during these tests, with an XP-90 and a 12volt 92mm M1B. Fan was doing ~2400-2450rpm the whole time.

I need to speak to DFI about this.

Anyone else with a K8 care to chime in?

-Ed

EDIT: I want to add that these are the temperatures as reported by the ITE hardware monitor software that came with the board. I played around with the fan-speed-by-temp setting and the fan ramping behavior was much more consistent with how temperatures change in reality.

My guess at this point is that the ITE software is what's screwy, but its report seems to concur with the numbers I see inside the BIOS (it was reporting an idle temp. of ~39C while I was in the health monitor page). Does anyone have any ideas? Motherboard monitor does not support my board, but my guess is that it, as well as Fan Speed, will report the same as the BIOS and as the ITE monitor.

This is not good--it likely calls for a BIOS fix.

Posted: Tue Oct 12, 2004 1:32 am
by Jan Kivar
Stop the fan while running CPUBurn - temps should go up. If not, the BIOS fix is your best bet.

Cheers,

Jan

Posted: Tue Oct 12, 2004 8:50 am
by GlassMan
Your temps are not as wonky as you might think, Ed. The very fast changes are to be expected with the in-die diode. Try SpeedFan graphing; it will probably show much more fluctuation. (MBM has 1 sec resolution; it is no longer supported, but it may work.) The DFI software was (my guess) written to hide it (i.e., slow updates). The BIOS was written to show "realistic" temps with the DH7-CG.
As far as the temp value, if CPUBurn is hotter than Prime95 test 1 then your correction will be quite low. Your temps should rise a degree or 2 on a long run.
When you get a chance, run a system utility to verify that your CPU speed is as expected at idle. An 800MHz CPU clock indicates C'nQ is working. The newest BIOS for my board enables it without a BIOS entry to control it. If it's running let me know, as it means the SP2 processor driver has the functions of the AMD C'nQ driver.
Have fun with your new board!!

Posted: Tue Oct 12, 2004 9:05 am
by Edward Ng
Fast changes, fine; stabilizing at max peak temp in under 10 seconds, no.

If chip thermals worked like this, my life as a heatsink reviewer would be 1000 times easier.

The ITE software DFI includes with the board updates quicker than once/second; more like every 0.5 seconds. This is why the behavior seems incredibly strange to me; what it does, makes no sense. No part of the die of any CPU can go from idle to maximum peak load temp in under 10 seconds. I'm sure you agree. If it takes up to 30 minutes to stabilize even the slowest portion of a Pentium 4's die, there is absolutely, positively no way that the quickest to stabilize section will do it in under 10 seconds. I tested it last night; it rose from the idle of 38C to 49C in ~5 seconds flat. I then let it sit to see what it would do; I let it sit for a full hour and it was still at 49C. I kept an eye on it the whole time and it never changed; no 48C, no 50C. 49C for an hour straight. And as I said, the updates are at least every half second. I'll shoot a quick video of it later today when I get home, using my digital camera. You'll see what I see. I'll then shoot another video 30 minutes later, and another 60 minutes later.

Also, you know as well as I do that there's no way a chip will drop from full load temp back to just 3C over ambient in the same ridiculously short 5 second time frame right after removing the load on the chip. I'll shoot a video of this as well and share it with you guys. None of this behavior makes any sense whatsoever.

Btw; I'm not the only one getting this problem. Apparently, the beta BIOSes by Oskar Wu solve the problem (supposedly), so I will give his BIOS a try. The thing that concerns me is how could DFI release a board to the general public with the thermal sensor acting this way, particularly one that is supposedly designed to be the ultimate overclocking board--thermals are critical in overclocking.

-Ed

Posted: Tue Oct 12, 2004 10:35 am
by GlassMan
Hi again, Ed. I never doubted the accuracy of your report. My temps rise and drop 8C between SpeedFan's 3-second reports when pausing and unpausing folding. Very fast.
Either your BIOS or the DFI monitor is limiting your temp reading to 49C. I'm pretty sure your temperature is greater than 50C and this is why you don't see the fluctuations before the temps stabilize. The temps don't fluctuate until it starts to stabilize, as the temp rise is too rapid when you get those gates working. Try SpeedFan, as I guarantee it is not limited (I've seen temps of 89C), to determine if it's the BIOS or the program.
I posted earlier about the different temp readings of different AMD64 revisions.

Posted: Tue Oct 12, 2004 10:44 am
by Edward Ng
I'll check out SpeedFan later before trying the beta BIOS...

Thanks, pal.

-Ed

EDIT: But what about the fact that it returns to the idle reading after less than 2 minutes? That still isn't explained by a capped top-end reading. Especially considering it fluctuates between 37 and 39C constantly once it gets there, which makes perfect sense. That means, at least, that it's not capped on the low-end reading as well.

In the end I hope DFI has a BIOS fix that works right.

Posted: Thu Oct 14, 2004 3:02 am
by burcakb
Glassman,

I used my case temp report of 36C as ambient. This temp is measured - as far as I can tell - by something between the southbridge and the back edge of the graphics card. It stays rock solid and I find it a better estimation of air temperature that goes into the Zalman. My calculations took into account the slight differences in ambient air going in, slight changes in voltage etc. I've reason to believe that a +3C overreporting is the reality.

There's nothing wrong with the temps. I did the tests with VCore at 1.5V. Abit overclocks slightly too. But I run my computer at VCore 1.25V with the fanmate turned all the way down PLUS an inline resistor for extremely low rpms (and this was the case during the test too). With this setup, I average about 54C folding and never exceeded 58C.

Chang,

You're right about the extra degree of freedom. Actually I wouldn't even do such a calculation, but Abit has a reputation of overreporting by 10C. And I've been playing with this board for some time and it FEELS like it's overreporting by 10C, even though +3C is probably closer to the truth. I'd actually need a calculation at 500 MHz or so to get a better idea of curvature, but the motherboard doesn't allow such low speeds.

As I noted above, I used case temps as intake temps. 36C is not high for the very quiet system I'm running. The case has a push-pull system of two Nexus 120mm fans running at 850 rpm max under load (normally they run at 1100 rpm at 12V). At idle, they turn at around 600 rpm. That's very little airflow. So considering that I've got two hot seagates and a hot Radeon 9700 in there, I'd say I have very good airflow.


Ed, in my case, I noted that the A64 approaches max temps extremely quickly and that in the 30 minutes it takes to stabilize, the last 25 are for the last 2-3C. I got a 10-12C rise in 2 seconds, and a 30C rise in less than one minute. That doesn't explain your case though.
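
One way to picture a fast initial jump followed by a long, slow tail is a rise made of two exponential terms: a quick one for the small thermal mass near the die and a slow one for the heatsink. This is only an illustrative sketch; the amplitudes and time constants below are hypothetical values picked to resemble the figures quoted above, not measurements from anyone's system.

```python
import math

# Hypothetical parameters, chosen only to resemble the behaviour described above.
FAST_RISE_C = 30.0   # part of the rise tied to the small thermal mass near the die
FAST_TAU_S = 5.0     # assumed time constant of that fast part, in seconds
SLOW_RISE_C = 3.0    # part of the rise tied to the heatsink slowly warming up
SLOW_TAU_S = 600.0   # assumed time constant of the slow part, in seconds


def temp_rise(t_seconds):
    """Temperature rise over idle after t_seconds of constant load."""
    fast = FAST_RISE_C * (1.0 - math.exp(-t_seconds / FAST_TAU_S))
    slow = SLOW_RISE_C * (1.0 - math.exp(-t_seconds / SLOW_TAU_S))
    return fast + slow


if __name__ == "__main__":
    for t in (2, 10, 60, 600, 1800):
        print(f"{t:5d} s: +{temp_rise(t):.1f} C")
```

With these assumed numbers most of the rise arrives within the first minute, and the remaining few degrees take tens of minutes. A reading that reaches its final value in seconds and then never moves at all does not fit this picture.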

Posted: Thu Oct 14, 2004 3:09 am
by Edward Ng
Well I flashed to the latest BIOS and the behavior looks much better; it does continue to rise a little more, slowly, after hitting a certain point now.

Once I get some time, I'll run the test battery on it. I'll keep the fans spinning the same rates, voltages the same, and just switch the FSB across the following settings:

255
245
235
225
215
205

at a multiplier setting of 9.5X. That should be enough to give me a good sample range. I've tested it at 255x9.5 and it works flawlessly like that.

-Ed

Posted: Thu Oct 14, 2004 12:03 pm
by burcakb
Ed,

On my system, changing the speed by changing the FSB resulted in some wonky test points. Maybe I was too sleepy and made mistakes then, but keeping the FSB fixed and changing the multiplier gave more consistent results. Just something to keep in mind.

Posted: Thu Oct 14, 2004 2:40 pm
by Edward Ng
Then I'll do that!

FSB 250:

9.5X
8.5X
7.5X
6.5X
5.5X
4.5X

That should do, no?

-Ed

Posted: Sat Oct 16, 2004 10:35 pm
by Edward Ng
Okay, ran through the battery of tests. This is how I ran them:

Voltage was stable at 2.0. LDT setting was 266MHz.

Here are the multipliers used, resultant core speed, and temperature rise over ambient I found. All testing was done at ambient of 27C.

9.0X, 2394MHz, 36C
8.0X, 2128MHz, 32C
7.0X, 1862MHz, 29C
6.0X, 1596MHz, 26C
5.0X, 1330MHz, 23C
4.0X, 1064MHz, 19C

Okay, here's the problem. If I throw out my results for 9.0X and 4.0X, every single computation comes out to an overreport of 8C. The problem is: How can that be, if at idle at 4.0X multiplier, the monitor reports a temperature of 34C?!? Take 8C from that and we're looking at 26C--this temperature is below ambient!!! Whenever you use my results from 4.0X and 9.0X, all the results are between 0 to ~-6, which seems more accurate, but what to go by?!? :cry:
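
For anyone who wants to reproduce the arithmetic, here is a minimal sketch. It assumes the calibration works by fitting a straight line to temperature rise versus core speed at constant voltage (so power is taken as proportional to clock) and reading the offset from the intercept at zero speed. That may not be exactly the computation used above, and the function and variable names are made up for illustration.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Data from the post above: core speed in MHz, rise over ambient in C.
core_mhz = [2394, 2128, 1862, 1596, 1330, 1064]
rise_c = [36, 32, 29, 26, 23, 19]

# Intercept = rise the monitor would show at zero clock, i.e. the reporting offset.
_, offset_all = linear_fit(core_mhz, rise_c)
_, offset_mid = linear_fit(core_mhz[1:-1], rise_c[1:-1])  # drop 9.0X and 4.0X

if __name__ == "__main__":
    print(f"offset, all six points:     {offset_all:.1f} C")
    print(f"offset, middle four points: {offset_mid:.1f} C")
```

Under that assumption the middle four points lie exactly on a line with an 8C intercept, and including the 9.0X and 4.0X points pulls the intercept down to about 6C.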

I'm befuddled; this didn't happen when I did it on the P4 testbed; it worked out brilliantly for that board.

I guess I'll have to retest again at a lower, more controlled ambient of 23 or 24C, rather than just leaving the window open to cool the room...

-Ed

Posted: Sun Oct 17, 2004 2:53 am
by GlassMan
Don't feel too bad, I had the same problem with my very different result. If it makes you feel better, your corrected CPU temp is higher than mine!!

More seriously, the modern HSF (heat pipes?) may have to be so powerful that idle temps approach ambient. Since we have the same CPU, running at the same speed, we should have the same corrected temps (55 vs 44) minus load difference and HSF efficiency (my load was P95 test 1). We would have been very close when I had the TT Silent Boost (me a couple of degrees higher).

IMO there has been some snake oil in motherboard temp readings from the beginning. At least SPCR always uses the same board and CPU.
I know my being very happy to have 70C temps sounds nuts, but it beats the heck out of the 80s.

Posted: Sun Oct 17, 2004 6:10 am
by Edward Ng
Not for the same speed unless we're running the same voltage (~1.98V). What voltage are you running?

Why are you seeing 70C after correction; er, how is your temp. so incredibly high? The most I saw before correction was 63C, and that was at a higher ambient of 27C, and after a correction of -8C, that figure would drop to 55C. You've got a much more powerful cooler; what fan and voltage are you running on it?

-Ed

Posted: Sun Oct 17, 2004 6:45 am
by burcakb
Err, are we placing too much confidence on temps never going below ambient?

I mean, unlike most other processors, we know that Athlon64s draw really very little power at idle. And we're employing some pretty powerful active cooling. The fan creates a low pressure area over the hottest part of the copper, effectively cooling it below ambient. This would naturally not be possible with a fast processor but when you're running it at a significantly lower speed, a heavy-duty heatsink could perhaps be able to get it slightly below ambient. Half a degree perhaps? How much do we trust the sensing accuracy of the diode anyway? So I'd consider 0.5 to 1C below ambient at very low power settings a possibility.

Posted: Sun Oct 17, 2004 6:49 am
by Edward Ng
I'm running 2.0volts here, and with an FSB of 266, even at 4.0X multiplier it was still doing 1064MHz.

That combination isn't very low power. :?

Posted: Sun Oct 17, 2004 7:28 am
by Rusty075
I think what we're really seeing with these A64 numbers is the effect of temperature compression.

The temperature reports are non-linear: Ed's °C/W varies from 0.1901 to 0.2257 (~15%), Burcakb's are closer, going from 0.5309 to 0.5903 (11%). When I did my calibration on my testbed the °C/W numbers ranged from 0.22 to 0.23 (4%), and the variation had a random distribution. Since both A64's °C/W trend downwards as the wattage increases, it looks more like a systematic pattern.

One other point that should be raised about this statistical analysis: Don't forget about the resolution of the temperature. It's only accurate to the integer, so you really can't run your °C/W numbers out to 4 significant digits. If you add the error bars to burcakb's graphs you get results that are closer to linear.
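
To put a rough size on those error bars, here is a small sketch. It assumes the only error is rounding of the reported temperature to the nearest whole degree (so ±0.5C on the rise) and ignores any error in the ambient reading or in the wattage estimate. The function names are made up for illustration.

```python
def cw_bounds(rise_c, watts, resolution_c=1.0):
    """Lowest and highest C/W consistent with a reading rounded to resolution_c."""
    half_step = resolution_c / 2.0
    return (rise_c - half_step) / watts, (rise_c + half_step) / watts


def relative_uncertainty(rise_c, resolution_c=1.0):
    """Fractional uncertainty in C/W caused by rounding alone."""
    return (resolution_c / 2.0) / rise_c


if __name__ == "__main__":
    # Smallest and largest temperature rises from Ed's test run above.
    for rise in (19, 36):
        print(f"rise {rise} C: C/W uncertain by +/-{relative_uncertainty(rise):.1%}")
```

The smaller the temperature rise, the larger the share of it that could be rounding, so the low-speed points carry the widest error bars.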

Posted: Sun Oct 17, 2004 1:48 pm
by GlassMan
That explains it; I thought the 2.0 was a typo. And the 70s are uncorrected, 44 corrected. I have a Chaintech; 70-80 is normal, that's why the correction is so big (-26).

Low pressure effects would be balanced by the high pressure side.

I doubt there's much compression; my idle to load difference is about 20C, the highest I've seen. Previous CPUs were closer to 10.
Ed's and my HS's are heat-pipes; that might contribute to the °C/W anomaly.

Posted: Sun Oct 17, 2004 1:58 pm
by Edward Ng
So I guess we try this again when the water system is going? :lol: :lol: :lol:

Posted: Sun Oct 17, 2004 2:20 pm
by GlassMan
Gonna try for 2.5v? :D :D

Posted: Sun Oct 17, 2004 2:26 pm
by Edward Ng
Unfortunately this board only goes up to something like 2.1...

But that's okay; what's most important is that the water cooling system is more consistent (albeit slower to stabilize), and allows for more precise results.

Just need Cathar's replacement midplate to actually get to me. I hope it doesn't end up in the same place as pHaestus' block!

Posted: Sun Oct 17, 2004 2:45 pm
by GlassMan
My experience is that the temperature from overvolting hurts more than the voltage helps the overclock. Of course water will help that, but I would worry about the life of the CPU at over 1.7V.
Let me know what happened to your block. Too thin jets?

Posted: Sun Oct 17, 2004 2:56 pm
by Edward Ng
Read this thread, beginning at post #1358. Read only my posts and Cathar's posts.

You'll have to pay attention primarily to the pictures of the midplate, the jets in particular; Stew (Cathar) explains the problem, and we discuss them and some other things. He has decided to send me a replacement and I send him my bad one for fixing. Actually, he already sent the replacement out, and I'm hoping to receive it sooner rather than later. Still, you'll have to read a bunch of that thread to get a clear understanding of exactly what's wrong.

-Ed

Posted: Wed Oct 20, 2004 5:06 am
by GlassMan
Just another bump in the correction problem. I have a thermistor on the side of my CPU that reads 43.4C, and the diode currently reads 70C; with a correction of -26 that would indicate my temps(d) = 46C. It was my understanding that a thermistor on the side of the heat spreader would read at least 20C lower than the die. The thermistor reads 55C at 80C(d) (temps stabilized).

Posted: Wed Oct 20, 2004 5:20 am
by Rusty075
Thermistors in contact with the IHS are usually much closer to the CPU temp than 20°. I'd say that your 3-4° offset with the sensor is pretty typical.

Posted: Wed Oct 20, 2004 12:42 pm
by GlassMan
I did some research on this to learn how to place the thermistor on the CPU. One of the sites I found useful (sorry I don't have the site bookmarked) explained it this way. (Any errors or distortion are mine.) Temperature is the average heat at the point of measurement. The further you get from the source the lower the temperature will be. The end closest to the heat source will be warmer than the other end, e.g. a bar of iron in a fire will only get red hot at the end in the fire. The thermal paste (on die) and heat spreader are distance. The center of the heat spreader (case) will be cooler than the die, and the center of the case will be hotter than the edge. From his experience he estimated that the drop was 10C each. This included drilling heatsinks down the middle per AMD guidelines. Remember the heat spreader is being cooled by the heatsink all the way to the edge. (His main point was that CPU temps would always be approximations.)
Whether or not his estimate is useful for current CPUs, AMD's current A64 processor power and thermal data sheet lists T(case)max as 70C and T(die)max as 95C for the otherwise identical desktop and mobile CPUs.

Posted: Thu Oct 21, 2004 7:45 am
by GlassMan
This may well be the source of the 20C temp difference figure
http://www.analog.com/library/analogDia ... peratures/

Ever wonder how hot a cpu can get? http://www.pantherproducts.co.uk/weblog ... ooling.zip

Posted: Fri Apr 01, 2005 4:39 pm
by ilh
What are A64 users using to compute CPU power (wattage) so you can compute ºC/W values? Or, is everyone just assuming linearity and just computing the offsets?

The two referenced power applications do not support A64.

Posted: Fri Apr 01, 2005 5:56 pm
by Tibors
CPU Power by Kostik gives you the possibility to add new processors. (Second button on the bottom.) I got some wattage numbers for the Winchester chips from an article at Xbit Labs. Those were measurements including the mobo voltage regulators. So I multiplied them by 0.9 (maybe 0.85 would have been better, but it doesn't really matter for this). And added the following section:

[AMD Athlon 64 Winchester]
Athlon 64 Winchester 3000+ (1.4); 1.4; 35.9; 1800
Athlon 64 Winchester 3200+ (1.4); 1.4; 38.7; 2000
Athlon 64 Winchester 3500+ (1.4); 1.4; 42.8; 2200

If you have an A64 with another core you have to hunt for some numbers yourself. But as Rusty pointed out in the article, the number you start from doesn't really matter. You are not looking for the exact ºC/W values, but for the linearity. So you can just take any proc from the list in CPU Power by Kostik and plug in your frequency numbers.
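
If you would rather do the same estimate by hand, here is a rough sketch. It uses the common approximation that CPU power scales with clock speed and with the square of core voltage; whether CPU Power by Kostik uses exactly this formula is an assumption. The reference values come from the Winchester 3000+ line above, read as voltage, watts and MHz in that order (also an assumption about the field layout). The function names are made up for illustration.

```python
# Reference point taken from the "Winchester 3000+" entry above
# (field order assumed to be: volts; watts; MHz).
REF_VOLTS = 1.4
REF_WATTS = 35.9
REF_MHZ = 1800.0


def scaled_power(mhz, volts, ref_watts=REF_WATTS, ref_mhz=REF_MHZ, ref_volts=REF_VOLTS):
    """Estimate power at another clock/voltage, assuming P ~ f * V^2."""
    return ref_watts * (mhz / ref_mhz) * (volts / ref_volts) ** 2


def c_per_watt(rise_c, watts):
    """Thermal resistance implied by a temperature rise at a given power."""
    return rise_c / watts


if __name__ == "__main__":
    for mhz in (1800, 1600, 1400, 1200, 1000):
        print(f"{mhz} MHz @ {REF_VOLTS} V: ~{scaled_power(mhz, REF_VOLTS):.1f} W")
```

Because only the linearity matters here, an error in the reference wattage scales every ºC/W value by the same factor and leaves the shape of the trend unchanged.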

Posted: Fri Apr 01, 2005 6:17 pm
by ilh
Thanks!

Posted: Thu Jun 08, 2006 7:23 am
by computergeek22
Pardon me for bringing up an old topic but I've recently noticed my overclocked opteron 165 temps on my a8r32 and they seem extremely high and unusual according to some of my friends that have an overclocked opteron 170.

After performing the tests mentioned in the article, it seems my mobo is misreporting temps by about 12.9C. For the load temps this seems right (57C load on mobo -> ~44C load actual). However, for the idle temps the actual difference cannot possibly be 12.9C: that would put my processor's idle temp, 40C according to my mobo, at 27C, which is less than my ambient case temperature of ~32C according to my Vantec NXP-101.
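
The check described above can be written out as a short sketch. It assumes a single constant offset applied to every reading, and treats any corrected temperature below ambient as implausible. The function names are made up for illustration.

```python
def corrected(reported_c, offset_c):
    """Apply a constant reporting offset to a motherboard reading."""
    return reported_c - offset_c


def plausible(reported_c, offset_c, ambient_c):
    """A corrected CPU temperature below ambient suggests the offset is not constant."""
    return corrected(reported_c, offset_c) >= ambient_c


if __name__ == "__main__":
    offset = 12.9
    case_ambient = 32
    for label, reading in (("load", 57), ("idle", 40)):
        temp = corrected(reading, offset)
        verdict = "ok" if plausible(reading, offset, case_ambient) else "below ambient"
        print(f"{label}: {reading} C reported -> {temp:.1f} C corrected ({verdict})")
```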

I assumed the correlation was linear as of the four temps I recorded the C/W went from 0.170 -> 0.173 -> 0.175 -> 0.183 (not sure if these numbers are exactly correct as I'm pulling this out of memory.)

Even in the odd case when temps in CA are quite high, my load temps for my processor hit 62C once in a blue moon while using prime95. However, even at these high temperatures my computer seems to be very stable. I've run that program that gives you the Tmax or Tcase or something like that and for my processor, the value was 49C.

Any help would be appreciated...

EDIT:

I checked my temps in the morning when the temps in my case and my processor were most likely at the ambient of my room. It turns out that at idle the mobo is misreporting my case temperature by approximately 3-4C (nxp101 says 27C while my mobo reports 31-32C). This is at ambient room temp of about 24C taken at the other end of the room.

So it seems at idle my mobo is misreporting by a lower number than at load. This seems extremely unlikely and wrong.