
Athlon 64 for Quiet Power

June 15, 2004 by Bryan Cassell

A detailed discussion of the AMD Athlon 64 Processor as it pertains to silent computing by Bryan Cassell, with rejoinders by Russ Kinder and yours truly. Bryan is a new contributor to SPCR who is no beginner to computing or PC silencing, and works as a software programmer. Enjoy!

— Mike Chin, Editor of SPCR


This article deals with the Athlon 64—probably the most talked about, yet least understood processor of the current generation. I’ll deal with the facts and myths surrounding the Athlon 64, and share some of my personal experiences with this exceptional CPU.

So what makes the Athlon 64 special? How is it different from any other x86 CPU?



The A64's copper heat spreader is a relief for those concerned about core damage (oft seen with the original Athlon and Athlon XP CPUs).

Architecturally, there are two main differences that set the Athlon 64 apart from its predecessors and its competition:

Integrated Memory Controller

The first major difference is the memory controller. For as long as x86 PCs have existed, the memory controller (the logic chip that actually communicates with your RAM) has been located on the motherboard, usually as a part of the chip that we now call the northbridge. In this configuration, when the CPU wants to access a piece of memory, it sends the memory read commands over a high-speed connection (known as the front side bus) to the memory controller. The memory controller then interprets those commands, fetches the data from your RAM, and sends the data back across the front side bus to your CPU.



copyright © www.cpuid.com

The Athlon 64 has completely bypassed this mechanism and placed the memory controller directly on the CPU. The CPU itself has a direct, high-speed connection to the RAM. When the Athlon 64 wants to access a piece of memory, it simply fetches the data itself directly from memory. Because of this direct path, the latency from when the data is requested to the time that it is received is significantly reduced.



copyright © www.cpuid.com

64-bit Extensions

The other, and most touted, difference with the Athlon 64 is the fact that it is a 64-bit processor (hence the name Athlon 64). What it actually means to be a “64-bit processor” is a subject of much debate, but in the case of the Athlon 64 it simply refers to the x86-64 extensions that the CPU possesses. These extensions provide, among other things, a set of sixteen 64-bit general purpose registers (rather than the eight 32-bit GPRs in the x86 instruction set), as well as a set of instructions to manipulate data in 64-bit chunks and address memory using 64-bit pointers. If you have no idea what I just said, don’t worry about it, just understand this:

The 64-bit capabilities of the Athlon 64 are merely extensions to the existing x86 instruction set. The Athlon 64 supports, 100% natively, all existing x86 code. The addition of x86-64 extensions is the same thing as the addition of MMX or SSE—it is simply a new feature that will probably provide some performance benefits as soon as software is written to take advantage of it.
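To make the register-width point a little more concrete, here is a minimal Python sketch of what "manipulating data in 64-bit chunks" saves. It only models the arithmetic and is not real machine code. The function names are made up for illustration. With 32-bit registers, adding two 64-bit values takes two steps (add the low halves, then add the high halves plus the carry), while a 64-bit register does it in one.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def add64_with_32bit_registers(a, b):
    """Add two 64-bit values using only 32-bit-wide pieces (hypothetical helper)."""
    a_lo, a_hi = a & MASK32, (a >> 32) & MASK32
    b_lo, b_hi = b & MASK32, (b >> 32) & MASK32
    lo_sum = a_lo + b_lo                    # step 1: add the low halves
    carry = lo_sum >> 32                    # carry out of the low half (0 or 1)
    hi = (a_hi + b_hi + carry) & MASK32     # step 2: add the high halves plus carry
    return (hi << 32) | (lo_sum & MASK32)

def add64_with_64bit_register(a, b):
    """The same addition when a single register holds all 64 bits (hypothetical helper)."""
    return (a + b) & MASK64

a, b = 0x00000001FFFFFFFF, 0x0000000000000001
assert add64_with_32bit_registers(a, b) == add64_with_64bit_register(a, b)
print(hex(add64_with_64bit_register(a, b)))
```

Both paths give the same answer. The difference is only in how many steps the hardware needs, which is why the benefit shows up only in software that actually works with 64-bit quantities.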

Many people have asked me something along the lines of, “Why buy a 64-bit CPU if there is no 64-bit version of Windows out yet?” Well, when Intel released the first SSE capable CPU did you also say “Why buy an SSE capable CPU if there is no software that supports SSE yet?” I would certainly hope not. In time the 64-bit extensions may turn out to provide great benefits (in much the same way that SSE has), but until then the Athlon 64 should be judged for what it is: A great 32-bit CPU which happens to have 64-bit extensions.

PERFORMANCE

The Athlon 64 is most assuredly a fast 32-bit CPU. Benchmarks can be found at every hardware site on the web, but just in case you’ve missed them, I’ll summarize: In all but a very few benchmarks the Athlon 64 stomps an equivalently priced Pentium 4. Not only that, but the Athlon 64 simply feels faster under normal use.

You’re probably asking, “What do you mean it feels faster?” Well, for comparison I have a Pentium 4 2.8 GHz machine at work, and an Athlon 64 3000+ machine at home. They both have 1GB of the same RAM and 80GB 7200rpm hard drives with 8MB of cache. Under light use (web surfing, writing documents, writing code, etc.), the Athlon 64 is just snappier. Windows, menus, animations, etc. respond quicker and feel faster on the Athlon 64 machine. I’m not alone in my feelings either. In a recent forum post our own Mike Chin writes:

“... my A64-3200 system [is] right next to my main P4-2.8C rig. Win XP Pro on both. No contest: The A64 runs faster & cooler. I don't mean benchmarks, I mean using the full range of apps I use daily -- Photoshop, Dreamweaver, Acrobat, Adobe InDesign, a bunch of web design tools... I've been gradually migrating to the A64 -- to turn it from backup machine to main machine.”

Most of this increase in speed and responsiveness is due to the integrated memory controller. Because of the Athlon 64’s memory controller, the delay from when an application first requests data to when it receives the first pieces of that data is much smaller than with a traditional northbridge-contained memory controller, thus increasing the perceived responsiveness of the application. To be fair, though, AMD is not the only one trying to increase the responsiveness of every day computer use. Intel’s HyperThreading helps to achieve just that.

HyperThreading is Intel’s name for their implementation of Simultaneous Multi-Threading (SMT). SMT is simply a way of allowing more portions of a CPU to be active at one time. To achieve this, an SMT-enabled CPU is actually exposed to the operating system as two CPUs. This way, the operating system can schedule two threads to run simultaneously. The CPU then sorts out which thread is actually running at any given time and in some cases can run portions of each thread in parallel.
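If you want to see how many processors the operating system thinks it has, a short Python check will do. This is only a sketch. It uses the standard library's `os.cpu_count()`, which reports logical CPUs, so on a HyperThreading Pentium 4 with the feature enabled one would expect it to report two even though there is a single physical core.

```python
import os

# Number of logical CPUs visible to the OS; may be None if it cannot be determined.
logical_cpus = os.cpu_count()

if logical_cpus is None:
    print("Could not determine the number of logical CPUs.")
else:
    print(f"The operating system reports {logical_cpus} logical CPU(s).")
```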

Many people think that, because of HyperThreading, multi-tasking should be much smoother on a Pentium 4 than on an Athlon 64. While it is true that HyperThreading can provide a large benefit in multitasking situations, in my experiences the Pentium 4 is still significantly less responsive than the Athlon 64, even in situations involving heavy multitasking. I do software development work and most of the multi-tasking I do is during a compile. My Athlon 64 system is noticeably more responsive during a compile than my Pentium 4 system at work. Add to that the fact that the same code that takes 15 minutes to compile at work only takes 8 minutes to compile on my home machine, and you have a much more pleasant computing experience.

HEAT & POWER

Most people assume that all this computing power comes at a price — heat. Probably the biggest misconceptions about the Athlon 64 are in regards to its power consumption. The most often quoted number is AMD’s listed 89W Thermal Design Power (TDP). Many see that number and assume that the Athlon 64 dissipates as much heat as the Pentium 4, which Intel lists as having similar TDP numbers. The reality is that AMD and Intel arrive at their TDP numbers in very different ways.

In the next section I will refer to the following documents. Anyone interested in the thermal details of these CPUs should download and read them in their entirety:

• AMD Athlon 64 Processor Power and Thermal Data Sheet (http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/30430.pdf)

• Intel® Pentium® 4 Processor with 512-KB L2 Cache on 0.13 Micron Process and Intel® Pentium® 4 Processor Extreme Edition Supporting Hyper-Threading Technology(1) Datasheet (http://www.intel.com/design/pentium4/datashts/29864312.pdf)

• Intel® Pentium® 4 Processor on 90 nm Process Datasheet (http://www.intel.com/design/Pentium4/datashts/30056102.pdf)

Here is a summary of TDP data from those documents:

TDP (W) for P4 and Athlon 64 processors

| Model | Northwood P4 | Prescott P4 | Athlon 64 |
|---|---|---|---|
| 2.8 / 2800+ | 70 | 89 | 89 |
| 3.0 / 3000+ | 82 | 89 | 89 |
| 3.2 / 3200+ | 82 | 89 / 103 | 89 |
| 3.2EE | 92 | - | - |
| 3.4 / 3400+ / 3500+ | 89 | 103 | 89 |
| 3.4EE | 103 | - | - |
| 3700+ / 3800+ | - | - | 89 |
| FX51 | - | - | 89 |
| FX53 | - | - | 89 |

All figures rounded off.

As you can see Intel lists a different TDP for each CPU, while AMD lists the same TDP for all their CPUs. Obviously the Athlon 64 3800+ must draw significantly more power than the 2800+, so the only logical conclusion is that the 2800+ must draw significantly less than 89W. Even so it would still appear that, according to these charts, the Athlon 64’s power consumption is fairly close to the Pentium 4, right? To find the answer to that question we have to dig a bit deeper into the referenced documents.

How does AMD define TDP?

“Thermal Design Power (TDP) is measured under the conditions of TCASE Max, IDD Max, and VDD=VID_VDD, and include all power dissipated on-die from VDD, VDDIO, VLDT, VTT, and VDDA.”

This means that TDP, as defined by AMD, is measured at the maximum current the CPU can draw, at the default voltage, under the worst-case temperature conditions. This is the maximum power that the CPU can possibly dissipate. Intel, however, has a different definition.

How does Intel define TDP?

From the Intel Datasheet for Northwood CPUs:

“The numbers in this column reflect Intel’s recommended design point and are not indicative of the maximum power the processor can dissipate under worst case conditions.”

And from Intel’s datasheet for Prescott CPUs:

“Thermal Design Power (TDP) should be used for processor thermal solution design targets. The TDP is not the maximum power that the processor can dissipate.”

And the most telling quote of all, contained in both documents:

“Analysis indicates that real applications are unlikely to cause the processor to consume maximum power dissipation for sustained periods of time. Intel recommends that complete thermal solution designs target the Thermal Design Power (TDP) indicated in Table 26 instead of the maximum processor power consumption. The Thermal Monitor feature is intended to help protect the processor in the unlikely event that an application exceeds the TDP recommendation for a sustained period of time.”

What this means is that Intel’s TDP is actually lower than the maximum power dissipation of the processor (and as you’ll see later, it can be significantly lower). This is in stark contrast to AMD’s TDP numbers, which are higher than the respective processor’s maximum power dissipation.

So what is the actual maximum power consumption of these CPUs?

Unfortunately, that’s a hard thing to determine. Fortunately, there have been some recent attempts to do just that.

This is a list of estimated maximum power consumption for Intel CPUs (calculated from Intel datasheets): http://www.cpuheat.wz.cz/html/IntelPowerConsumption.htm On average, these numbers are roughly 10~15% higher than Intel's TDP. Here's the power table again, this time with the Maximum Power from the CPU Heat website linked above added beside the TDP numbers for the P4s.

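TDP & Max Power (W) for P4

| Model | Northwood P4 TDP | Northwood P4 MAX | Prescott P4 TDP | Prescott P4 MAX |
|---|---|---|---|---|
| 2.8 | 70 | 79 | 89 | 100 |
| 3.0 | 82 | 92 | 89 | 100 |
| 3.2 | 82 | 96 | 89 / 103 | 100 / 115 |
| 3.2EE | 92 | 101 | - | - |
| 3.4 | 89 | 101 | 103 | 115 |
| 3.4EE | 103 | 113 | - | - |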
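The "roughly 10~15% higher" figure can be checked against the table with a few lines of Python. This is just a sketch that re-enters the TDP and estimated maximum values from the table above (the two Prescott 3.2 variants are listed separately). The variable names are mine.

```python
# (column, model, TDP in W, estimated max in W), copied from the table above
p4_power = [
    ("Northwood", "2.8", 70, 79),
    ("Northwood", "3.0", 82, 92),
    ("Northwood", "3.2", 82, 96),
    ("Northwood", "3.2EE", 92, 101),
    ("Northwood", "3.4", 89, 101),
    ("Northwood", "3.4EE", 103, 113),
    ("Prescott", "2.8", 89, 100),
    ("Prescott", "3.0", 89, 100),
    ("Prescott", "3.2 (89W part)", 89, 100),
    ("Prescott", "3.2 (103W part)", 103, 115),
    ("Prescott", "3.4", 103, 115),
]

# Percentage by which the estimated maximum exceeds the listed TDP
overshoot = [(max_w - tdp) / tdp * 100 for _, _, tdp, max_w in p4_power]

for (column, model, tdp, max_w), pct in zip(p4_power, overshoot):
    print(f"{column} {model}: TDP {tdp}W, max {max_w}W, +{pct:.1f}%")

mean_overshoot = sum(overshoot) / len(overshoot)
print(f"Mean overshoot: {mean_overshoot:.1f}%")
```

Every entry lands between about 10% and 17% above TDP, with the average inside the 10~15% range quoted above.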

This is an article (in German) comparing AC power draw of identically configured systems with different CPUs (thanks to jojo4u for the link): http://www.computerbase.de/artikel/hardware/prozessoren/energieverbrauch_prozessoren/

The components they used in all these comparisons were the same, except for the motherboard, which naturally could not be the same:

  • 400-Watt PSU
  • Asus GeForce FX 5900 Ultra VGA
  • IBM 40 GB Deskstar 120GXP HDD
  • TwinX1024(RE)-3200LL memory
  • Pentium 4: Asus P4C800-E Deluxe motherboard
  • AMD Athlon 64: MSI K8T Neo-FIS2R motherboard
  • Athlon 64 FX: Asus SK8V motherboard
  • Athlon XP: Asus nForce 2 400 Ultra motherboard

Some of the most telling charts from this excellent article are shown below. The first is the AC consumption of the various systems at idle. Note the extremely low power of the A64 systems running Cool 'n' Quiet (discussed in detail later).



© copyright
www.computerbase.de

Here is the maximum AC power consumption while running Folding@Home. Because the load of this program is almost entirely on the CPU, it best shows the power consumption differences between the various CPUs.



© copyright
www.computerbase.de

The data above paints a very interesting picture.

Intel is listing TDP numbers that are significantly lower than the actual maximum power draw of their CPUs. They are then relying on the fact that most applications barely use the CPU, assuming that it will remain idle most of the time. In the case that an application does max out the CPU for any period of time, Intel relies on their “Thermal Monitor” to automatically slow down the CPU when it becomes too hot to protect it from overheating.

AMD, on the other hand, lists TDP numbers that are significantly higher than the maximum power draw of their CPUs. They also have listed the SAME TDP for every desktop Athlon 64 so far, and I have little reason to believe that future Athlon 64s will have a higher listed TDP (at least for the near future). We still don't have definitive information about the exact power dissipation of each of the Athlon 64 processors, but it is clear that other than the fastest clock models, it is far below the 89W TDP cited by AMD.

It is impossible to deny that the Athlon 64 dissipates considerably less power than even a Northwood Pentium 4, while the Prescott Pentium 4’s power consumption makes it a challenge to cool effectively even for builders who are indifferent to noise.

My personal experiences have also shown that the Athlon 64 is a very easy CPU to cool. I’ve used two Athlon 64 3000+ CPUs, one from a very early batch (when the 3000+ was first released) and one from a newer batch (currently in my computer). The first CPU would undervolt to 1.35V (from a default of 1.5V) and my current CPU will undervolt to at least 1.3V (I haven’t tried to go lower). Both CPUs were 100% stable at these voltages.
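For a rough feel of what those undervolts are worth, here is a back-of-the-envelope sketch in Python. It assumes the usual first-order CMOS approximation that dynamic power scales with voltage squared times clock frequency, and it ignores leakage entirely, so treat the results as ballpark ratios and not as measurements. The function name is made up for illustration.

```python
def relative_dynamic_power(v_new, v_old, f_new=1.0, f_old=1.0):
    """Ratio of dynamic power after a change, assuming P ~ C * V^2 * f (leakage ignored)."""
    return (v_new / v_old) ** 2 * (f_new / f_old)

STOCK_VOLTAGE = 1.50  # default Vcore quoted above

for undervolt in (1.35, 1.30):
    ratio = relative_dynamic_power(undervolt, STOCK_VOLTAGE)
    print(f"{undervolt:.2f}V vs {STOCK_VOLTAGE:.2f}V at the same clock: "
          f"about {ratio:.0%} of stock dynamic power")
```

Under that assumption, 1.35V works out to roughly 81% of stock dynamic power and 1.3V to roughly 75%, at an unchanged clock speed.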

Currently I’m cooling my CPU with a Zalman 7000AlCu, modified with a 92mm Panaflo L1A. With the L1A running at 5V, my CPU at 1.3V, and a case temperature of 27°C, my CPU is running at 38°C under full load (Folding@Home). That’s an ice cold CPU load temperature, and all I have is a 5V Panaflo L1A fan which is virtually inaudible inside my case.

COOL 'N' QUIET

Another feature which may be beneficial to some is AMD’s Cool’n’Quiet (CnQ). This feature automatically underclocks and undervolts your CPU when it is not under heavy load. The exact frequencies and voltages that are used are contained in the previously referenced AMD Athlon 64 Processor Power and Thermal Data Sheet. It is essentially the same technology that has been used in both AMD and Intel mobile processors to maximize battery life in laptops.

When discussing CnQ I'm frequently asked, “So, the Athlon 64 runs cooler with Cool’n’Quiet enabled then?”

  • First of all, that’s not a question ;)
  • Secondly, that all depends on how you use your CPU.

The most important thing to remember with CnQ is that it makes absolutely no difference when your CPU is under heavy load. As soon as the CPU is taxed, it will revert to its stock speed/voltage settings. This means that CnQ is of no benefit to all the diehard members of SPCR’s Folding@Home team. If, however, your computer is rarely running heavy loads, CnQ can indeed bring significant power reductions.

Many Athlon 64 users have found that combining CnQ with their motherboard’s automatic CPU fan controller can bring good results. In this case, the motherboard will slow down your CPU fan when the CPU temperature is low, and speed it up if the CPU heats up. With CnQ, under light load, the CPU will remain very cool, and some motherboards may even turn the CPU fan completely off. This behavior is highly dependent on the specific motherboard, however, and results are somewhat varied. The fan/thermal control utility SpeedFan can also work in conjunction with CnQ on many motherboards; you will have to experiment to find out how the combination works for you.

Other Athlon 64 users have had great success when combining CnQ with custom under-volting. This SPCR Forum thread details a few of those experiences. One Athlon 64 user was running his CPU at only 0.85V at 1 GHz!

ATHLON 64 MOBILE

Recently, AMD released a series of Athlon 64 Mobile CPUs that run on the same socket 754 platform as their desktop counterparts, but at a lower default voltage. The details of these CPUs can be found in the previously referenced AMD Athlon 64 Processor Power and Thermal Data Sheet. There are basically two lines of Athlon 64 mobile CPUs:

  • the 62W mobile CPUs, which run at a default voltage of 1.4V
  • the 35W mobile CPUs, which run at a default voltage of 1.2V

The few reports from users of these CPUs have been positive. It would seem, though, that there are still a few kinks to work out with regard to motherboard support. Depending on your motherboard, these CPUs might not “just work”. There is a report in this forum by a user who is running his 35W 2800+ at 2.25 GHz and 1.25V. He can’t go higher at the moment due to the limitations of his motherboard. Still, 2.25 GHz at 1.25V is VERY impressive. I fully expect that as soon as BIOS updates work out any motherboard kinks, these CPUs will become the undisputed champions of powerful, quiet computing.

THE SOCKET ISSUE

One issue with the Athlon 64 right now is that AMD seems to be having a bit of an identity crisis with sockets. Currently, there are 3 different sockets for Athlon 64 CPUs:

• Socket 754: Until recently, this was the only socket for non-FX, non-Opteron Athlon 64 CPUs. It supports single channel unbuffered memory, and currently has the largest range of processors available for it.
• Socket 939: This is supposedly socket 754's replacement for higher end Athlon 64 CPUs. It supports dual channel unbuffered memory, and currently only has 3 processors available for it: the 3500+, 3800+, and FX-53.
• Socket 940: This is the only socket that Opteron CPUs are available on. It supports dual channel registered memory. The FX-51 and FX-53 were made for socket 940, but future FX CPUs will only be available for socket 939.

Since socket 754 is already slated to be discontinued, many are wondering whether it is better to get a system based on socket 939 or socket 754. The answer will vary on a case-by-case basis, but these are the major things to consider:

• Dual channel memory makes little performance difference with the Athlon 64. The only place you will really see a performance difference is in synthetic memory benchmarks. Most real applications just do not use memory in a way that can benefit greatly from dual channel memory.

• Socket 939 CPUs have half the L2 cache of their socket 754 counterparts (512KB vs. 1MB). The dual channel memory supposedly makes up for the performance difference and then some, giving socket 939 CPUs a slightly higher performance rating at the same clock speed.

• AMD’s top of the line Athlon 64 will probably not be available in socket 754 a year from now (though nobody knows for sure). For those of you who plan on upgrading only the CPU more than a year from now, this could be an issue. For now, however, AMD offers its top of the line CPUs (except for the FX line) in both 939 and 754 sockets.

• You cannot buy a socket 939 CPU right now [June 14, 2004] for less than ~US$480. The 3500+ is the cheapest CPU offered for socket 939. In contrast, the Athlon 64 2800+ for socket 754 is available for as little as ~US$170.

• Athlon 64 mobile CPUs are only available in socket 754.

• Socket 939 CPUs cost more at the same clock speed than their socket 754 counterparts. The performance differences between the 3400+/3500+ and 3700+/3800+ are negligible, yet the 3500+ costs ~US$80 more than the 3400+ and the 3800+ costs ~US$10 more than the 3700+. [June 14, 2004 pricing]

• Initially, socket 939 motherboards will be more expensive than socket 754 boards, and there will be few value-oriented socket 939 motherboards.

Having built a socket 754 Athlon 64 machine over 6 months ago (well before socket 939 was released), I was recently asked the following question: “If you were to build an Athlon 64 machine now, would you do it differently than the one you already have?”

I can honestly say NO. I would definitely go with socket 754 over socket 939.

I can only remember one time in my life that I have ever upgraded a CPU without upgrading the motherboard as well—I went from a Pentium 133 to a Pentium 200. Since then my CPU upgrades (which have been numerous) have always involved a new motherboard. If I were to build a system now I would choose the system that offers the best value NOW, not worrying that certain CPUs might or might not be released for that socket in the future. The only thing I might consider doing differently is using an Athlon 64 mobile CPU. The mobile CPUs offer the same performance as their desktop counterparts, but at astoundingly low power levels. The less adventurous, though, would be advised to wait until motherboard compatibility issues settle down.

My Athlon 64 machine has been both the most powerful and the quietest system I have ever used. Where AMD has faltered with marketing, they have triumphed with technology. Hopefully this article has provided some valuable information for those looking for a computer with top performance and minimal noise.

Russ Kinder says

Most of the points I had planned on writing, Bryan already covered very nicely in his portion. So... how about a few semi-random thoughts?

*

One aspect not touched upon in Bryan's description of the A64's on-die memory controller is its impact on motherboard design. Because the memory controller requires equal length traces to each of the RAM pins, and the traces need to be as short as possible to reduce latency, there is much less variety in motherboard layout than with designs for previous processors. These requirements make it unlikely that you will ever see a BTX form-factor A64 motherboard. BTX shoves the CPU up to the edge of the board, and turns the RAM at 90° to it, along the other side of the motherboard. Not an impossible arrangement, but nowhere close to ideal.

Some AMD engineers have been quoted as saying they will not be supporting BTX. If Intel does manage to bully BTX onto the Intel motherboard market, you may be forced to buy a completely new case and PSU if you want to switch from an Intel to an AMD product.

*

While current 32-bit benchmarks show the A64 to be performing on par with the equivalent P4, the efficiency increases gained by using the A64 with a 64-bit OS and apps will undoubtedly increase the work/clock-cycle advantage that the A64 already holds. It's really no different from the performance that the P4 gained once SSE/SSE2 implementation began to appear.

*

One other method for ballparking the TDPs of AMD and Intel CPUs is to look at the information that they give to motherboard makers, specifically the Core Voltage (Vcore) and the Maximum Core Amperage (Acore). Motherboard makers need accurate data for these limits to design the voltage regulation circuitry of the board. By multiplying the core voltage by the maximum amperage you can get a pretty good estimate of the Maximum Power (Wmax) that the CPU could draw from the motherboard:

| CPU | Vcore | Acore | Wmax | TDP |
|---|---|---|---|---|
| XP2500 Barton* | 1.65V | 41.4A | 68.3W | 68.3W |
| A64 2800-3800+ | 1.5V | 57.8A | 86.7W | 89W |
| Intel P4 3.4 | 1.55V | 71.6A | 111W | 89W |

*Barton values included for comparison

This table confirms Bryan's comments: Intel underestimates maximum power dissipation with their TDP for P4 while AMD gives the maximum power for the fastest processor for their Athlon 64 TDP.
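The Wmax column is simply Vcore multiplied by Acore, and the arithmetic is easy to reproduce. The Python sketch below re-enters the voltage, current and TDP figures from the table. The function and variable names are illustrative only.

```python
def max_power(vcore_volts, acore_amps):
    """Upper-bound power estimate: core voltage times maximum core current."""
    return vcore_volts * acore_amps

# name: (Vcore in V, Acore in A, listed TDP in W), taken from the table above
cpus = {
    "XP2500 Barton": (1.65, 41.4, 68.3),
    "A64 2800-3800+": (1.5, 57.8, 89),
    "Intel P4 3.4": (1.55, 71.6, 89),
}

for name, (vcore, acore, tdp) in cpus.items():
    wmax = max_power(vcore, acore)
    print(f"{name}: Wmax = {wmax:.1f}W, TDP = {tdp}W, Wmax/TDP = {wmax / tdp:.2f}")
```

A Wmax/TDP ratio above 1 means the listed TDP sits below what the voltage regulator has to be able to deliver. That is the case for the P4 3.4 row, while the Athlon 64 row comes out slightly below 1.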

*

AMD has stated pretty clearly that 939 is the desktop socket "for the foreseeable future". Socket 754 will continue on, but not indefinitely. Even the new value Sempron line will eventually transition to 939.

Mike Chin says

One of my first SPCR articles was Silencing a P4-1.6, a real project to make a very quiet and powerful (for its time) work PC. It was a great overclocker even at or slightly less than standard Vcore. I've liked Northwood P4s since, and have had good experiences with 1.8, 2.0, 2.6, and 2.8GHz versions. Despite the misnomer, the "heat spreader" made the P4 highly resistant to damage caused by user errors with enthusiast heatsinks. Up to the P4-2.8, I've had little trouble making nearly silent, powerful PCs with them.

The Prescott and the >3GHz Northwoods change everything. A brief experience with a P4-2.8 Prescott made it very clear that silencing P4s based on this core is a serious and probably expensive challenge. They run much hotter than comparable Northwood cores, and as Bryan's research bears out, much hotter than equivalent speed Athlon 64s as well.

In contrast, the Athlon 64-3200+ I've been playing with is a very cool character. Casual experiments told me this A64 system ran cooler than my P4 main rig, but it was not until I did A/B comparison on AC power dissipation that the difference hit home.

The ARM Systems StealthPC Powerhouse P4-3.2 system I reviewed had a max AC power draw of 235W under load. Because the efficiency of the Zalman PSU in that system is ~75%, I know the DC power draw is ~176W.

My A64-3200+ system has almost identical components except for 512MB more RAM, one less drive, and an ATI 9800-256 Pro instead of 9800XT VGA. I'd say the RAM & drive balance each other out; the 9800XT represents ~15W more power.

This system draws max 168W AC under the same load used for the ARM System P4 PC. With the Zalman PSU efficiency at ~73% at that power level, the DC power draw is ~123W. Add 15W to that to compensate for the VGA card. We're at 138W. This is nearly 40W lower than the DC power draw of the P4-3.2 system.

The AC power draw of the system is a very good indication of the total heat in a PC. Just add 20W to the 168W of the A64 system to compensate for the XT (the ~15W DC difference works out to about 20W at the wall at ~73% efficiency). Now you have 188W vs 235W; the heat difference in those two PCs is 47W.
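The AC-to-DC arithmetic in the last few paragraphs can be laid out in a few lines of Python. This is only a sketch of the calculation, using the wall readings, the approximate PSU efficiencies and the ~15W video card allowance quoted above. The names are mine.

```python
def dc_power(ac_watts, psu_efficiency):
    """DC power delivered to the components for a given AC draw at the wall."""
    return ac_watts * psu_efficiency

p4_ac, a64_ac = 235, 168          # measured AC draw under load (W)
vga_allowance_dc = 15             # estimated extra DC draw of the 9800XT (W)

p4_dc = dc_power(p4_ac, 0.75)     # Zalman PSU at ~75% efficiency
a64_dc = dc_power(a64_ac, 0.73)   # same PSU at ~73% at the lower load
a64_dc_adjusted = a64_dc + vga_allowance_dc

dc_gap = p4_dc - a64_dc_adjusted
vga_allowance_ac = vga_allowance_dc / 0.73   # the same allowance seen at the wall
ac_gap = p4_ac - (a64_ac + 20)               # using the rounded 20W AC figure

print(f"P4 DC: {p4_dc:.0f}W, A64 DC (adjusted): {a64_dc_adjusted:.0f}W, gap: {dc_gap:.0f}W")
print(f"VGA allowance at the wall: {vga_allowance_ac:.1f}W, AC gap: {ac_gap}W")
```

The DC gap comes out just under 40W and the AC gap at 47W, matching the figures in the text.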

How does this translate in terms of noise? The P4-3.2 system referenced above was judged as borderline quiet when at max load, right around the 30 dBA/1m mark. My A64-3200+, in a virtually identical ARM System Foundation Case / PSU Kit, runs no higher than 25 dBA/1m, and the temperature of the CPU never exceeds 55°C (in a room with ambient up to ~26°C thus far).

As for performance, it's significant to note that in the mainstream PCWorld magazine's July 2004 system rankings, of the seven top performance desktops, the four Athlon 64 machines scored 141-150 on the magazine's Worldbench 4 benchmark. The three P4 systems trailed at just 127-130.

For enthusiasts, perhaps nicest of all is the improved heatsink retention bracket on the motherboards and the protective "heat spreader" on the processor itself. It took a long time for AMD to follow Intel's lead on this, but it's better late than never.

*

Discuss this article in the SPCR Forum.
