Why Intel Processors Draw More Power Than Expected: TDP and Turbo Explained

Friday, November 9th, 2018 - CPUs, Technology

One of the topics permeating the custom PC space recently has been power draw. Intel’s latest eight-core processors are still rated at a TDP of 95W, and yet users are seeing power consumption north of 150-180W, which doesn’t make much sense. In this guide, we want to give you a proper understanding of why this is the case, and why it gives us reviewers such a headache.

What is TDP (Thermal Design Power)?

With every processor, Intel guarantees a specific frequency at a specific power, often with a particular grade of cooler in mind. Most people equate a chip's TDP rating directly to its maximum power draw, given that the heat energy that needs to be dissipated from the processor is equal to the power consumed in doing calculations. Normally, the TDP rating is that specific power.

But TDP, in its strictest sense, relates to the ability of the cooler to dissipate heat. TDP is the minimum capacity of the CPU cooler required to get that guaranteed level of performance. Some energy dissipation also occurs through the socket and motherboard, which means that technically the cooler rating can be lower than the TDP, but in most circles TDP and power consumption are used to mean the same thing: how much power a CPU draws under load.

Within a system, the TDP is a value that can be set in the firmware. If a processor used the TDP as its maximum power limit, then we would see the same benchmark produce graphs like this with a high-powered, many-core processor.

For the last however many years, this is the definition of TDP that Intel has used. For any given processor, Intel will guarantee a rated frequency to run at (known as the base frequency) for a given power, which is the rated TDP. This means that a processor like the 65W Core i7-8700, which has a base frequency of 3.2 GHz and a turbo of 4.7 GHz, is only guaranteed to be at or below 65W when the processor is running at 3.2 GHz. Intel does not guarantee any level of performance above this 3.2 GHz / 65W value.

On top of the base values, Intel implements Turbo. As mentioned, something like the Core i7-8700 can have a turbo of 4.7 GHz, which draws a lot more power than the processor running at 3.2 GHz. The all-core turbo value for a processor like the Core i7-8700 is 4.3 GHz, which is well above the guaranteed 3.2 GHz. What makes it all the more complicated is when none of those turbo modes go down to the base frequency. It means that the processor will be operating above its TDP rating all the time, and that 65W cooler you purchased (or perhaps it even came with the processor) has become a bottleneck of sorts. If more performance is required, it needs to go in the bin, as you’ll need something better.

But the manufacturer doesn’t tell you that. If the cooling isn’t sufficient for the turbo modes, and the processor reaches its temperature limit, most processors will go into a power limited mode, reducing performance to stay within that power limit. All of a sudden that fast processor isn't living up to its peak capabilities.

So TDP is Meaningless? Why is it now an issue?

Over the last decade, while the use of the term TDP has not changed much, the way that Intel's processors use a power budget has. The recent advent of six-core and eight-core consumer processors going north of 4.0 GHz means that we are seeing processors, with a heavy workload, go beyond that TDP value. In the past, we would see quad-core processors have a rating of 95W but only use 50W, even at full load with turbo applied. As we add on the cores, without changing the TDP on the box, something has to give.

The Secret Numbers Not On The Box

Inside each processor, Intel defines several power levels based on the capabilities and expected operating environments. This sounds all well and good, however these power levels and capabilities can be adjusted at the firmware level, allowing OEMs to decide how they want the processors to perform in their systems. Ultimately it gives a really fuzzy reading of exactly what the power consumption of a processor will be when it is in a system.

To simplify, there are three main numbers to be aware of. Intel calls these numbers PL1 (power limit 1), PL2 (power limit 2), and T (or tau).

PL1 is the effective long-term expected steady state power consumption of a processor. For all intents and purposes, the PL1 is usually defined as the TDP of a processor. So if the TDP is 80W, then PL1 is 80W.

PL2 is the short-term maximum power draw for a processor. This number is higher than PL1, and the processor goes into this state when a workload is applied, allowing the processor to use its turbo modes up to the maximum PL2 value. This means that if Intel has defined a processor with a series of turbo modes, they will only work when PL2 is the driving variable for maximum power consumption. Turbo does not work in PL1 mode.

Tau is a timing variable. It dictates how long a processor should stay in PL2 mode before hitting a PL1 mode. Note that Tau is not dependent on power consumption, nor is it dependent on the temperature of the processor (it is expected that if the processor hits a thermal limit, then a different set of super low voltage/frequency values are used and PL1/PL2 is discarded).

Here are Intel's official definitions:

So let us go on a journey where a large workload is applied to a processor.

Firstly, it starts in PL2 mode. If a single-threaded workload is used, then we should hit the top turbo value as listed in the spec sheet. Normally the power consumption of a single core will be nowhere near the PL2 value of the entire chip. As we load up the cores, the processor reacts by reducing the turbo frequency in line with the per-core turbo values dictated by Intel. If the power consumption of the chip hits the PL2 value, then the frequency is adjusted so PL2 is never exceeded.

When the system has a substantial workload applied for a length of time, in this case ‘tau’ seconds, the firmware should immediately invoke PL1 as the new power limit. The turbo tables no longer apply, as those are PL2 only.

If the workload applied results in power consumption levels above PL1, then the frequency and voltages are adjusted such that the overall power consumption of the chip is within the PL1 value. The whole processor therefore reduces in frequency from its PL2 state to its PL1 state for the duration of the workload, so temperatures on the processor should decrease, increasing the longevity of the processor.

PL1 stays in place until the workload is removed and a CPU core hits an idle state for a fixed amount of time (usually sub 5-seconds). After this, the system can re-enable PL2 again if another workload is applied.

So some examples of numbers here – Intel lists several in its specification sheets for the different processors. In this case, I will take a consumer grade Core i7-8700K. For this processor, we have the following:

  • PL1 = TDP = 95 W
  • PL2 = TDP * 1.25 = 118.75 W
  • Tau = 8 seconds

In this case, the system should be able to boost up to ~119W for eight seconds, before being pulled back down to 95W. Intel has had this in place for a number of generations of processors, and most of it didn’t actually matter, as the power draw for the full chip was often well below the PL1 value even at full load.
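The behaviour described above can be sketched as a small model. This is only an illustration of the simplified timer description in this article, not Intel's actual firmware algorithm; the names PowerLimits, active_limit and package_power, and the 150 W workload demand, are hypothetical and chosen just for the example.

```python
from dataclasses import dataclass


@dataclass
class PowerLimits:
    pl1_w: float  # long-term (sustained) power limit, in watts
    pl2_w: float  # short-term (turbo) power limit, in watts
    tau_s: float  # seconds allowed at PL2 before falling back to PL1


def active_limit(limits: PowerLimits, seconds_under_load: float) -> float:
    """Return the limit in force after a given time under sustained load."""
    if seconds_under_load < limits.tau_s:
        return limits.pl2_w
    return limits.pl1_w


def package_power(limits: PowerLimits, demand_w: float, seconds_under_load: float) -> float:
    """Power drawn: the workload's demand, capped by whichever limit is active."""
    return min(demand_w, active_limit(limits, seconds_under_load))


# Core i7-8700K values quoted above: PL1 = 95 W, PL2 = 1.25 * 95 W, Tau = 8 s
spec = PowerLimits(pl1_w=95.0, pl2_w=95.0 * 1.25, tau_s=8.0)

# Hypothetical workload that would draw 150 W if nothing held it back
for t in (0, 4, 8, 20):
    print(f"t={t:>2}s  {package_power(spec, demand_w=150.0, seconds_under_load=t):.2f} W")
```

With these numbers the sketch reports 118.75 W at 0 and 4 seconds, then 95 W from 8 seconds onward, which is the pull-back described in the paragraph above. A workload that demands less than the active limit simply draws what it needs.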

However, this is where it gets really stupid: the motherboard vendors got involved, because PL1, PL2 and Tau are configurable in firmware. For example in the graph above, we can set PL2 to an unlimited value, and then PL1 to 165W and 95W respectively.

A World of Random Numbers

For this I’m going to be drawing a lot from the consumer space, given that this is mostly where it takes place. Actually, to be honest, PL1, PL2, and Tau are often meticulously controlled in thermally-limited environments such as laptops or small-form-factor PCs. I have seen a number of high-end but stylish designs actually set PL2 to TDP as well, ensuring that the processor does get some turbo but only as long as that 1-2 core load doesn’t push above TDP.

However, in our CPU reviews, since the advent of six-core processors, we often see numbers much higher than either PL1 or PL2, and they are sustained ad infinitum unless a temperature limit is hit. Why? Who would do such a thing?

Any modern BIOS system, particularly from the major motherboard vendors, will have options to set power limits (long power limit, short power limit) and power duration. In most cases, at default settings, the user won’t know what these are set to because it will just say ‘Auto’, which is a codeword for ‘we know what we want to set it as, don’t worry about it’. The vendors will have the values stored in memory and use them, but all the user will see is ‘Auto’.  This lets them set PL2 to 4096W and Tau to something very large, such as 65535, or -1 (infinity, depending on the BIOS setup). This means the CPU will run in its turbo modes all day and all week, just as long as it doesn’t hit thermal limits.

Why do the vendors do this? There are multiple possible reasons, although each vendor’s specific reasons may differ.

Firstly, it means that users can have turbo on all the time, and get the all-core turbo every second of every day. This means that the benchmark scores go through the roof, and it looks good on reviews, or when people are comparing numbers.

Secondly, the products are engineered for it. Intel often puts a default specification for a motherboard with every launch (they even used to have their own retail motherboards), which has a specific number of power phases with an expected life cycle. Vendors can obviously do their own thing here: more powerful phases, or more phases, or arranging the power delivery to improve efficiency, etc. If their board can take a full all-core turbo all the time, then why not?

Third, at least for the more expensive models aimed at enthusiasts, they know that high-end cooling is going to be used. If the processor is drawing over 160W and the user has decent cooling, then a high all-core turbo will improve the experience. Intel’s standards are defined under Intel’s recommended coolers.

So What is Correct? Who Can We Trust? What is the Difference?

Intel sets a standard for its parts. PL1, PL2, Tau, the motherboard circuitry, and the firmware settings all have Intel recommended default values. Some of these are public, such as the ones that Intel publishes in its documents, and some of them are confidential (and Intel won’t tell us no matter how many times we ask). However, these are still recommended values. At the end of the day, the motherboard manufacturers can do what they like. And they do.

As a result, from my side of the fence at least, it makes the job of testing the hardware more difficult. There are some readers that will want all of our settings to be one of the following:

  1. ‘Intel Recommended’,
  2. ‘out of the box’,
  3. or ‘pushed to the limit’.

As you might imagine, Intel Recommended is likely to show much lower performance than ‘out of the box’, while ‘pushed to the limit’ speaks for itself.

It should be noted that up until this point, almost every stock test in every CPU review in existence has been run at 'out of the box', and NOT 'Intel Recommended'.

To give some context on benchmark values, we used a high powered CPU and achieved the following in a 25-30 second fully-loaded test:

AnandTech           PL2      Tau     PL1      Result
Unlimited           4096W    999s    4096W    100%
Intel Spec, 165W    207W     8s      165W     98%
Constant 165W       165W     1s      165W     94%
Intel Spec, 95W     118W     8s      95W      84%
Constant 95W        95W      1s      95W      71%
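One rough way to read the table is to work out the time-weighted average power budget each configuration allows over the test. The sketch below assumes a 30-second test (the upper end of the 25-30 second range) and the simple model in which the chip is capped at PL2 for Tau seconds and at PL1 afterwards; the function name average_power_budget is made up for this example. The result is a ceiling on power, not a measurement of what the chip actually drew.

```python
def average_power_budget(pl1_w: float, pl2_w: float, tau_s: float, test_s: float) -> float:
    """Time-weighted average of the active power limit over a test of length test_s."""
    turbo_s = min(tau_s, test_s)      # time spent under the PL2 limit
    sustained_s = test_s - turbo_s    # remaining time under the PL1 limit
    return (pl2_w * turbo_s + pl1_w * sustained_s) / test_s


# (PL2 in W, Tau in s, PL1 in W), copied from the table above
CONFIGS = {
    "Unlimited": (4096, 999, 4096),
    "Intel Spec, 165W": (207, 8, 165),
    "Constant 165W": (165, 1, 165),
    "Intel Spec, 95W": (118, 8, 95),
    "Constant 95W": (95, 1, 95),
}

TEST_SECONDS = 30  # assumption: upper end of the 25-30 second test

for name, (pl2, tau, pl1) in CONFIGS.items():
    budget = average_power_budget(pl1, pl2, tau, TEST_SECONDS)
    print(f"{name:18s} {budget:8.1f} W average budget")
```

For the 'Intel Spec, 95W' row this works out to roughly 101 W, against 95 W for the constant 95W case, and the ordering of the budgets matches the ordering of the results. The 4096W figure for the unlimited case is only a cap that is never reached, so it says nothing about real power draw.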

Recently it has been reported that some motherboard manufacturers are actually changing the PL1/PL2/Tau strategy, and putting the Tau value to something reasonable, like 30 seconds or so. When users are running benchmarks in those motherboards, the results they are seeing are lower than what they are used to, even though those results are closer to Intel's specifications.

The thing is, with motherboards showing ‘Auto’, they (often) never actually disclose what the value is behind the scenes. This makes it very difficult to report. Also, these values can change depending on which processor is installed, based on the internal look up table.
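On Linux, the limits that ended up programmed can often be read back through the kernel's powercap interface, which gives one way around the opaque 'Auto' setting. The sketch below is a minimal example that assumes the intel_rapl driver is loaded and exposes a package zone such as /sys/class/powercap/intel-rapl:0; the helper name read_rapl_constraints is hypothetical. The constraint names reported by the driver ("long_term", "short_term") are assumed here to correspond to PL1 and PL2.

```python
from pathlib import Path


def read_rapl_constraints(zone_dir):
    """Read name, power limit (W) and time window (s) for each constraint in a powercap zone."""
    zone = Path(zone_dir)
    constraints = {}
    for name_file in sorted(zone.glob("constraint_*_name")):
        prefix = name_file.name[: -len("_name")]  # e.g. "constraint_0"
        name = name_file.read_text().strip()      # e.g. "long_term"
        limit_uw = int((zone / f"{prefix}_power_limit_uw").read_text())
        window_file = zone / f"{prefix}_time_window_us"
        # Not every constraint is guaranteed to expose a time window
        window_us = int(window_file.read_text()) if window_file.exists() else None
        constraints[name] = {
            "limit_w": limit_uw / 1_000_000,  # microwatts to watts
            "window_s": None if window_us is None else window_us / 1_000_000,
        }
    return constraints
```

Calling read_rapl_constraints("/sys/class/powercap/intel-rapl:0") on a machine with that zone returns a dictionary keyed by constraint name. Whether those values match what the BIOS screen implies will depend on the board and firmware, so treat them as a cross-check rather than ground truth.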

Here at AnandTech, we mostly do ‘out of the box’, except for memory, where we adhere to the CPU manufacturer’s recommended support guidelines. We think it’s the fairest way to show our readers what performance they should expect when next to zero settings are touched. What this usually means in reality, for the variables above, is that PL2 is set to something super high, and Tau is set for something super long. We see turbo all the time, as long as we can keep the temperatures within expected limits.

The Situation Today, and What Do We Do?

Writing an article like this has been on my mind for a while, since the Kaby Lake launch at least. Most processors we test in consumer motherboards usually go the unlimited PL2 route, and that has been the norm for years. It wasn't until we saw some of the general Core i9-9900K results that things started to look odd. It came to a head in our recent Xeon E article this week, where our Supermicro motherboard follows Intel specifications to the letter. It may seem obvious that a more commercial/server focused product would adhere strictly to Intel specifications, but this was the first time I had actually seen it. It is painfully obvious that consumer motherboards do not run at these specifications, and they never really have. I'd hazard a guess that Intel's own benchmark numbers (and AMD's benchmarks of Intel processors) using consumer motherboards do not follow Intel specifications either.

So where do we go from here? I'd argue that Intel needs to put two power numbers on the box: 

  • TDP (Peak) for PL2 
  • TDP (Sustained) for PL1

This way Intel and others can rationalise a high peak power consumption (mostly), as well as the base frequency response that is guaranteed.

If users want the consumer motherboards to change, that is going to be tougher. All of the motherboard vendors like to get a one-up, which is why we see features like Multi-Core Turbo (which we reported on in 2012) sometimes defaulting to 'on'. Motherboard manufacturers prefer the 'Unlimited PL2' route, because it puts their results at the top of benchmark lists. The knock-on effect is on CPU reviews. As a counterpoint, laptops with set cooling in thermally limited scenarios often set their own PL1, PL2, and Tau, and (often) follow them to the letter.

Question is, how important are 'Intel Specifications' for a desktop Intel processor? If we should be following Intel's specifications to the letter, should we go one step further, and only use stock coolers too?
