Desktop processors with Zen 5 cores revealed
After a long wait, it’s here. During its presentation at Computex 2024, AMD unveiled the Ryzen 9000 desktop CPUs, the first of the generation of CPUs based on the Zen 5 architecture. We now have confirmed specifications and also the IPC of this architecture (the officially stated value, at least). According to AMD, these are the fastest “consumer PC” processors of today, and the company has already shown the first performance claims.
The Zen 5 architecture: the biggest core changes since the first Zen
Not much has been revealed about architectural details yet. We may learn more from the reviews in July, and we should definitely get deeper details in August, when the core is presented at the Hot Chips conference. AMD has confirmed that this is a core newly designed from the ground up like the first Zen and Zen 3 were, not an evolution of Zen 4. The scope of the changes seems to be larger than in Zen 3. Zen 5 is supposed to combine “very high performance” and “extreme energy efficiency”, but of course those words could mean anything, so on those matters we’ll have to wait for independent tests.
Perhaps the most interesting feature of Zen 5 that has been revealed is the use of two parallel pipelines in the frontend. In theory, this could be something similar to the parallel decoder clusters in Intel’s small cores (Tremont, Gracemont and Skymont; the last of these should also be unveiled soon and has as many as 3×3 decoders in parallel). However, it could be other parts of the pipeline than instruction decoding that are duplicated, so consider this just for illustration.
The purpose of these duplicated pipelines in the frontend is to make the processor better able to handle branches in the program code, with lower misprediction rates and also lower latency, to minimize delays in execution and stalls of the execution units themselves caused by the branch handling.
The core should also have more compute capacity in the backend (i.e. the execution units themselves). This means addition of more ALUs to the core (there should be six instead of the current four, but AMD hasn’t officially confirmed this yet).
Zen 5 should also see its SIMD units expanded to the full native AVX-512 operation width, from 256 bits to 512 bits. This doubles the theoretical compute performance (throughput) in SIMD code – provided the AVX-512 instructions are used, of course – because the 512-bit instructions will be executed in one cycle instead of two as in Zen 4. But you’ll typically only see a part of this “2×” potential manifested in programs. In highly optimized tasks using full vectorization, numerical computations and microbenchmarking, however, the full 2× increase could be observed.
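Why only part of the “2×” usually shows up can be illustrated with Amdahl's law: if only a fraction of a program's runtime is spent in 512-bit SIMD code, only that fraction gets faster. The following is a minimal sketch in Python; the function name and the example fractions are illustrative assumptions, not measurements of Zen 5.

```python
def overall_speedup(simd_fraction: float, simd_speedup: float = 2.0) -> float:
    """Amdahl's law: speedup of the whole program when only a fraction
    of its original runtime is accelerated by simd_speedup."""
    if not 0.0 <= simd_fraction <= 1.0:
        raise ValueError("simd_fraction must be between 0 and 1")
    return 1.0 / ((1.0 - simd_fraction) + simd_fraction / simd_speedup)


# Hypothetical fractions of runtime spent in AVX-512 code (assumed values).
for fraction in (0.25, 0.50, 0.90, 1.00):
    print(f"{fraction:.0%} in AVX-512 code -> {overall_speedup(fraction):.2f}x overall")
```

Under these assumptions, a program spending half its time in AVX-512 code would gain about a third overall, and only fully vectorized code approaches the full 2×.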
Another place where performance can (locally) be doubled is data bandwidth. It has been doubled between the L1 and L2 caches, and the data bandwidth between the L1 cache and the SIMD/FPU execution units has also been doubled. This should ensure that code taking advantage of AVX-512’s doubled performance is not bottlenecked by the working data not being able to flow out of and back into the caches fast enough. This doubling of bandwidth is probably realized by expanding the data paths to double width, which means the load/store pipelines probably perform 512-bit reads and writes per cycle.
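As a rough illustration of what doubled data paths mean, the per-cycle width can be converted to bytes per second for a single load or store pipeline. This is only a sketch: the 4.0 GHz clock is an arbitrary example value, the function name is made up, and the number of such pipelines in Zen 5 has not been disclosed.

```python
def pipe_bandwidth_gb_s(width_bits: int, clock_ghz: float) -> float:
    """Peak bandwidth of one data path in GB/s (10^9 bytes per second),
    assuming one full-width transfer per cycle."""
    bytes_per_cycle = width_bits / 8
    return bytes_per_cycle * clock_ghz


CLOCK_GHZ = 4.0  # assumed example clock, not an AMD specification

for width in (256, 512):
    print(f"{width}-bit path: {width // 8} B/cycle, "
          f"{pipe_bandwidth_gb_s(width, CLOCK_GHZ):.0f} GB/s at {CLOCK_GHZ} GHz")
```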
AMD also talks about “double instruction bandwidth” in the frontend. It’s not entirely clear whether this is talking about reading instructions from the L1 cache (“fetch”), or decoding (which would mean eight instruction decoders, or 2×4 if they’re in two clusters), or the ability to perform branching, i.e., processing twice as many branches per cycle.
On the other hand, the L2 cache capacity remains at 1 MB (private in each core) and the L3 cache still uses 32 MB in each of the CPU core dies. Thus, dual-die models have 64 MB of L3 cache, single-die models have 32 MB.
The core should also have deeper buffers and out-of-order execution instruction queues, including the so-called Reorder Buffer (RoB), which is a “window” of code within which the processor can reshuffle instructions and execute ahead those future instructions that have no blocking dependencies. Zen cores have had relatively shallow RoBs until now; for example, there was a two-fold difference between Zen 3 and Intel’s Golden Cove (256 versus 512 entries). It will be interesting to see how much Zen 5 leaps forward in this characteristic.
First official information about IPC
All of these changes should contribute to the core having improved IPC, meaning improved performance per 1 MHz. However, the benefits may not be as high as one would expect given the large changes to the Zen 5 architecture. That might just be due to the newness of the design.
Zen 4 built on the proven design of Zen 3 and had more opportunity to fix weaknesses or bugs and address performance limitations that were discovered while working on Zen 3, but whose solution did not make the deadline for inclusion in the original architecture. In contrast, by being such a new design, Zen 5 will probably contain more cases where such newly discovered limitations are not yet addressed, and these are opportunities for improvement in future generations – at least in Zen 6, which will probably arrive in another year and a half to two years.
AMD showed a chart at Computex with various benchmarks and the IPC increase in them for Zen 5 compared to the previous Zen 4 generation. These improvements range from +10% (in Far Cry 6) to +23% (Blender) to +35% (AES XTS cryptography in the Geekbench 5.4 test, this test benefits from AVX-512).
Note: These are probably not the strictly single-threaded applications that we are used to seeing in IPC measurements; AMD seems to be using multi-threaded tests for some reason. Both processors should have been tested with their cores running at 4.0 GHz. It’s mentioned in passing that an 8-core Ryzen 7 7700X processor was used for Zen 4, but that doesn’t make much sense (unless the Ryzen 9 9950X used only half its cores and it was therefore an 8-core / 16-thread scenario for both CPUs).
+16% performance at the same clock speed
Overall, this should result in a geometric mean of +16% IPC, so a bit more than what was in the leaked slides a while ago. According to those, AMD believed or promised that the performance improvement per 1 MHz (IPC) of Zen 5 would end up in the range of 10–15+%. One could assume that AMD always wants to achieve a bit more than the upper end of the ranges given in preliminary projections, and the now announced +16% is just barely higher than the upper bound.
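For reference, a geometric mean of per-test gains is computed on the performance ratios, not on the percentages themselves. The sketch below uses only the three results quoted above; AMD's chart contains more tests, so the result will not match the official +16%. The function name is our own.

```python
import math


def geomean_gain_percent(gains_percent):
    """Geometric mean of percentage gains, computed on ratios (1 + g/100)."""
    ratios = [1.0 + g / 100.0 for g in gains_percent]
    if not ratios:
        raise ValueError("need at least one result")
    log_mean = sum(math.log(r) for r in ratios) / len(ratios)
    return (math.exp(log_mean) - 1.0) * 100.0


# Only the three results named in the text: Far Cry 6, Blender, AES-XTS.
quoted_gains = [10, 23, 35]
print(f"geometric mean: +{geomean_gain_percent(quoted_gains):.1f}%")
print(f"arithmetic mean: +{sum(quoted_gains) / len(quoted_gains):.1f}%")
```

The geometric mean never exceeds the arithmetic mean, which is one reason the choice of tests and of averaging method affects a headline figure.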
But it’s possible that not everything worked out quite as expected with the new core, given that the expectations were beaten by just a single percentage point (also recall that for Zen 3, AMD claimed a +19% IPC improvement). One wonders if AMD hasn’t even doctored the average to reach 16% by carefully choosing which tests to include in this chart and average. For example, it is somewhat questionable that the composite overall scores of the Geekbench 5 and Geekbench 6 tests are not included directly and AMD only cherry-picked specific subtests. This could have been done to make the resulting IPC percentage look better. But on the other hand, there probably are tests where the increase will be higher than +35%; we would expect to find even higher numbers in applications profiting from AVX-512. Anyway, better to wait for real tests before verdicts.
Still using AM5, new CPU chiplets with an old IO chiplet
AMD CEO Lisa Su showed a sample of a processor without the usual metal lid (IHS) at Computex, which confirms that the desktop Ryzen 9000 processors (whose codename is Granite Ridge) are still designed in a chiplet style – with one IO die and two CPU dies on an ordinary substrate (advanced packaging, which could improve power efficiency, will first be used in Zen 6 at the earliest). What should be new in the processor are the CPU dies with Zen 5 cores that use TSMC’s 4nm node (N4).
The IO die is apparently the same (manufactured on TSMC’s 6nm node, N6), so the same PCI Express 5.0 controller with 16+4+4 lanes and the 128-bit wide DDR5 controller (used to be called “dual-channel”) are retained. However, the officially supported memory speed is now DDR5-5600, while with Ryzen 7000 it was only DDR5-5200. The integrated GPU with 128 RDNA 2 architecture shaders at 2200 MHz also remains.
The AM5 socket is still used, therefore these processors will be usable as an upgrade in any AM5 board. As before, all models are unlocked for overclocking and will not have a cooler included. The maximum operating temperature, by the way, is 95°C, as in the previous generation.
Four processors, 65W, 120W and 170W TDP
The models have also been revealed, and they’re exactly the four processors with 6, 8, 12 and 16 cores whose specs were leaked over the weekend – but now we have the previously missing specs too.
The top of the range SKU is the Ryzen 9 9950X with 16 Zen 5 cores and 32 threads. Its maximum boost is 5.7 GHz – unchanged from the Ryzen 9 7950X with Zen 4 architecture. The processor has 32 MB of L3 cache in each of the two CPU dies of which it will consist, for a total of 64 MB. And the 170W TDP is also retained, which implies that the maximum boost power consumption (continuously consumable) is 230W.
The base clock speed is 4.3GHz, which is 200MHz lower than the previous generation Ryzen 9 7950X. While Zen 5 has CPU dies manufactured on a 4nm node, due to the complex architecture, it may consume slightly more power at a given clock speed. At the very least, this could be true when using AVX-512 instructions due to the 2× wider 512-bit units. The base clock speed was probably lowered for this reason.
The second model in the lineup is the Ryzen 9 9900X with 12 cores and 24 threads. Again, it has 2 × 32 MB L3 cache and its maximum boost is 5.6 GHz. So the boost clock is again unchanged against the 7900X from 2022. However, this model falls into a more power-efficient class: instead of 170 W, the TDP will be 120 W (the actual PPT maximum power consumption is 162 W). This model also has a 300 MHz lower base clock speed, just 4.4GHz. However, given the reduced TDP here, this is easy to understand.
The other two models with a single CPU die also have lower TDP. The Ryzen 7 9700X has 8 cores with 16 threads, 32MB L3 cache and the TDP is 65W (implying the maximum PPT power consumption is 88W). The maximum boost clock speed of the processor is 5.5 GHz, which this time is a 100MHz improvement over the previous generation. The base clock speed is 3.8GHz, which is again a pretty big reduction (the 7700X has a base of 4.5GHz).
The cheapest model, the Ryzen 5 9600X, will still only have six cores and 12 threads. It also has a 65W TDP and 32MB of L3 cache. For this processor, AMD has also increased the maximum boost by 100 MHz, to 5.4GHz (the 7600X model has 5.3GHz). However, the base clock speed has been reduced from 4.7 to just 3.9 GHz compared to the 105W Ryzen 5 7600X.
But the base clock speeds of these 65W models should probably be compared to the 65W models of the Zen 4 generation – the Ryzen 7 7700 has a base of 3.8GHz and the Ryzen 5 7600 does too. So the clocks won’t represent a drastic drop compared to them.
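The TDP and PPT pairs quoted above (65 W and 88 W, 170 W and 230 W) are consistent with a PPT of roughly 1.35 times the TDP. A small sketch of that relationship follows; the 1.35 factor is inferred from these figures rather than stated by AMD in this announcement, and the function name is hypothetical.

```python
def ppt_from_tdp(tdp_watts: int) -> int:
    """Estimate the PPT power limit as 1.35x TDP, rounded to whole watts.
    Integer arithmetic avoids floating-point rounding surprises."""
    return (tdp_watts * 135 + 50) // 100


for tdp in (65, 170):
    print(f"TDP {tdp} W -> PPT about {ppt_from_tdp(tdp)} W")
```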
First official benchmarks
AMD also showed some official benchmarks of the Ryzen 9 9950X, which you can see on the following slide. These are tests done against the competing 8+16 core Intel Core i9-14900K. The Zen 5 has relatively smaller wins in the benchmarking tools Procyon Office (+7%) and Puget Photoshop (+10%), while in the programs that apparently benefit from AVX-512, the performance is up by as much as +55–56% (Handbrake, Blender). An interesting indicator could be the Cinebench R24 multithreaded test, in which performance is 21% better against the Intel processor.
In games, AMD should have the “lead” according to its slide, with the Ryzen 9 9950X being +4% to +23% faster compared to the Core i9-14900K depending on the games measured. This should be while using DDR5-6000 memory on both platforms.
Note that the Core i9-14900K processor is probably performing slightly worse in this testing (conducted by AMD) than in its original reviews from the fall, as it is set to “Intel Defaults”. This is a change that Intel is now retroactively making with board manufacturers because of the widely reported instability problems (which reminds us that the official statement on the causes and solutions to the problem, expected during May, has not arrived yet, just the revised BIOSes from board manufacturers). But the fact that AMD used these settings rather than the essentially overclocked values previously used by most motherboards seems perfectly legitimate in a situation where Intel itself makes these settings the mandatory default.
Keep in mind, however, that such official benchmarks may always be embellished by various choices of settings or by selecting specific tests to include and tests to omit. So take the numbers with a grain of salt for now and wait for independent tests instead.
Release in July, prices not yet known
The release of the Ryzen 9000 processors hasn’t been narrowed down to an exact date yet, but it has been officially confirmed that it will take place sometime next month, and that means real availability. It’s not clear yet whether we’ll find out the pricing of these processors only when they finally hit the shelves, or if the MSRPs will be officially announced some time in advance. Also, the reviews could be published earlier than on the launch day, but we don’t know anything about their NDA lift date yet.
Sources: AMD (1, 2, 3), AnandTech
English translation and edit by Jozef Dudáš