Intel launches new Emerald Rapids Xeon CPUs with fast cadence

Intel Xeon Scalable 5th generation released

Last week, Intel released its Meteor Lake processors, built on the Intel 4 process, with the long-awaited chiplet architecture. At the same time, Intel is releasing the second generation of chiplet-based Xeons, called “Emerald Rapids”. It follows very quickly after the previous “Sapphire Rapids” and, along with various improvements and fixes, curiously walks back on chiplets a bit, using only two instead of four. This has reduced not only costs but also power draw.

The Emerald Rapids processors, now released as the 5th generation Xeon Scalable, are to a degree a new generation, because they are newly designed chips. At the same time, they bear a strong resemblance to a refresh: they increase the number of cores by only a small amount (from 60 to 64) compared to Sapphire Rapids, they use the same platform with the same connectivity, and even the architecture of the CPU cores has remained completely unchanged, according to Intel.

Two chiplets instead of four

The goal of this generation was probably mainly to reduce the complexity of the Sapphire Rapids processors, which are made up of four chiplets (these are not identical; the CPU uses two types of tile because of how the tiles fit together). That Sapphire Rapids is an overly complex product is perhaps evidenced by the long (and repeatedly extended) delays, as well as by the fact that Intel needed an unprecedented number of respins before it could finally release the CPUs commercially. Even then, the CXL functionality, for example, was still not fully working.

Intel Xeon Scalable 5th Generation (Emerald Rapids) CPU. Author: Intel

Emerald Rapids is composed of two chiplets, each containing 32 cores. The LGA 4677 socket, eight-channel DDR5-5600 memory, PCI Express 5.0, CXL 1.1 and the Intel 7 manufacturing node all carry over. The biggest change is that Intel has significantly increased the L3 cache. The fully active 64-core model contains a total of 320 MB of L3 cache, while the top-end 60-core Sapphire Rapids carried 112.5 MB. The L3 cache block associated with each core has grown from 1.875 MB to 5 MB.
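The cache totals quoted above follow directly from the per-core L3 slice size multiplied by the core count. The short Python sketch below just redoes that arithmetic with the figures from the text; the function and variable names are illustrative only.

```python
# Figures taken from the article text; names are illustrative.
EMR_CORES, EMR_L3_PER_CORE_MB = 64, 5.0      # Emerald Rapids, fully active model
SPR_CORES, SPR_L3_PER_CORE_MB = 60, 1.875    # top-end Sapphire Rapids


def total_l3_mb(cores: int, l3_per_core_mb: float) -> float:
    """Total L3 capacity when every core contributes one L3 slice."""
    return cores * l3_per_core_mb


emr_l3 = total_l3_mb(EMR_CORES, EMR_L3_PER_CORE_MB)
spr_l3 = total_l3_mb(SPR_CORES, SPR_L3_PER_CORE_MB)

print(f"Emerald Rapids L3:  {emr_l3:.1f} MB")
print(f"Sapphire Rapids L3: {spr_l3:.1f} MB")
print(f"Total L3 ratio:     {emr_l3 / spr_l3:.2f}x")
print(f"Per-core L3 ratio:  {EMR_L3_PER_CORE_MB / SPR_L3_PER_CORE_MB:.2f}x")
```

The total grows by a larger factor than the per-core slice because the core count also rises from 60 to 64.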

It is the large L3 cache that will probably be the main attraction for buyers and the main benefit of this generation. In applications whose working data set is small and which therefore gain little from extra L3 cache, performance will probably stay much the same (because the core architecture is unchanged). Conversely, in cases where the memory subsystem was a bottleneck on Sapphire Rapids, the results may improve a lot. Emerald Rapids has other improvements as well. For example, CXL 1.1 support should finally be complete and cover not only Type 1 and Type 2 devices but also Type 3 devices (which Sapphire Rapids did not support, likely due to silicon implementation flaws).

Slides for Intel Xeon Scalable 5th Generation Emerald Rapids processors (Author: Intel, via: ServeTheHome)

Intel states that turbo boost behavior has been improved and that the processors should often be able to run at higher clock speeds than before. At the same time, however, the set of downclocking tiers applied when executing demanding instructions has been extended again (the downclocking was introduced to handle the extra power draw of AVX/AVX2 and later AVX-512 operations). Previously, the processor had four instruction complexity classes (0, 1, 2, 3), which indicated how large a clock speed reduction these operations caused. Intel has now created a new class 4 for the most demanding (“heavy”) AMX instructions, which previously shared class 3 with the most demanding AVX-512 operations. Of the AMX instructions, only “AMX moderate” now remains in class 3, which could mean that the processor will downclock a bit less for some AVX-512 and AMX operations than it did when they shared a class with the heaviest AMX instructions.
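The reclassification described above can be pictured as a small lookup table. The sketch below is only an illustration: the group labels are informal, only the class 3 and class 4 groups named in the text are included, and no clock speed offsets are modeled because none are given here.

```python
# Instruction-group -> complexity class, as described in the text.
# Group labels are informal; lower classes (0-2) are omitted on purpose.
CLASSES_BEFORE = {
    "AVX-512 heavy": 3,
    "AMX moderate": 3,
    "AMX heavy": 3,
}
CLASSES_AFTER = {
    "AVX-512 heavy": 3,
    "AMX moderate": 3,
    "AMX heavy": 4,  # new class introduced with Emerald Rapids
}


def reclassified(before: dict, after: dict) -> dict:
    """Return {group: (old_class, new_class)} for groups whose class changed."""
    return {g: (before[g], after[g]) for g in before if before[g] != after[g]}


def members(mapping: dict, cls: int) -> list:
    """Return the sorted instruction groups assigned to a given class."""
    return sorted(g for g, c in mapping.items() if c == cls)


print("Moved:", reclassified(CLASSES_BEFORE, CLASSES_AFTER))
print("Class 3 now:", members(CLASSES_AFTER, 3))
print("Class 4 now:", members(CLASSES_AFTER, 4))
```

Because the heaviest AMX group no longer sets the clock limit for class 3, the remaining class 3 groups may be allowed to run somewhat faster, which is the possible benefit the text points to.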

Simplifying the CPU design to two chiplets (tiles) has reduced the number of silicon links between tiles, as well as the amount of core-to-core traffic that must leave the local tile. It seems that even with EMIB packaging technology, the chiplet interconnect still takes a significant toll on power draw and power efficiency. In fact, thanks to this simplification, Intel was reportedly able to reduce the processor's idle power draw, which was quite high with Sapphire Rapids, by up to 100 W per CPU socket.

Smaller processors are monolithic

It’s worth noting that the new Emerald Rapids silicon will probably be used only in the higher-end SKUs. In the Sapphire Rapids generation (4th generation Xeon Scalable), Intel designed a separate monolithic die with 32 cores, the so-called MCC silicon, for the lower-end models. This will probably now also be sold in the Emerald Rapids generation (presumably with the older, smaller L3 cache). An even smaller LCC die with 20 cores should also be coming.

According to Intel, Emerald Rapids processors deliver on average up to 21% better performance than Sapphire Rapids (+42% in AI inference tasks, up to +40% in HPC) and up to 36% better power efficiency. However, these are official marketing benchmarks; for a realistic performance comparison, you need to look at independent reviews.

Models

Emerald Rapids processors once again come in quite an expansive set of models, due to the numerous special-purpose variants and Intel’s policy of deactivating various specialized accelerators on some models and creating additional SKUs with the functionality enabled. The most powerful model in the general-purpose lineup is the Xeon Platinum 8592+ with 64 cores, 128 threads and 320 MB of L3 cache. Its base clock speed is 1.9 GHz and the maximum boost is 3.9 GHz, but the all-core turbo is just 2.9 GHz. It officially costs 11,600 USD and has a 350 W TDP.

For cloud customers, however, there is an alternative 8592V version with a 2.0 GHz base clock speed and a 330 W TDP (the 3.9 GHz maximum boost and 2.9 GHz all-core boost are the same) for 10,995 USD. This model has only three of the four UPI links active and, for some reason, also reduces RAM support from DDR5-5600 to DDR5-4800.
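To put the drop from DDR5-5600 to DDR5-4800 in perspective, the theoretical peak memory bandwidth can be estimated from the transfer rate, the channel count and the data width per channel. The sketch below assumes the 8592V keeps all eight memory channels and counts a 64-bit (8-byte) data path per channel, ignoring ECC bits. It gives a theoretical ceiling, not measured throughput.

```python
CHANNELS = 8             # eight-channel memory, as stated for the platform
BYTES_PER_TRANSFER = 8   # assumption: 64-bit data path per channel, ECC excluded


def peak_bandwidth_gbs(mega_transfers_per_s: int,
                       channels: int = CHANNELS,
                       bytes_per_transfer: int = BYTES_PER_TRANSFER) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return mega_transfers_per_s * 1e6 * channels * bytes_per_transfer / 1e9


bw_5600 = peak_bandwidth_gbs(5600)
bw_4800 = peak_bandwidth_gbs(4800)
loss_pct = (1 - bw_4800 / bw_5600) * 100

print(f"DDR5-5600 x8: {bw_5600:.1f} GB/s")
print(f"DDR5-4800 x8: {bw_4800:.1f} GB/s")
print(f"Reduction:    {loss_pct:.1f}%")
```

Under these assumptions, the slower memory setting costs roughly one seventh of the theoretical peak bandwidth per socket.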

The peak performance model should probably be the 12,400 USD Xeon Platinum 8593Q, which is aimed at liquid-cooled servers and increases the TDP to 385 W. This 64-core/128-thread processor has a base clock speed of 2.2 GHz and an all-core boost of 3.0 GHz, but the maximum boost is still the same 3.9 GHz.
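For a rough comparison of the three 64-core models mentioned above, the list price and TDP can be normalized per core. The sketch below uses only the figures quoted in the text; the `XeonModel` class and helper functions are illustrative, and the 64-core count for the 8592V is an assumption based on it being described as an alternative version of the 8592+.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class XeonModel:
    name: str
    cores: int
    base_ghz: float
    all_core_ghz: float
    tdp_w: int
    price_usd: int


# Values as quoted in the article (list prices, not street prices).
MODELS = [
    XeonModel("Platinum 8592+", 64, 1.9, 2.9, 350, 11_600),
    XeonModel("Platinum 8592V", 64, 2.0, 2.9, 330, 10_995),  # core count assumed
    XeonModel("Platinum 8593Q", 64, 2.2, 3.0, 385, 12_400),
]


def usd_per_core(m: XeonModel) -> float:
    """List price divided by core count."""
    return m.price_usd / m.cores


def watts_per_core(m: XeonModel) -> float:
    """TDP divided by core count (a power budget, not measured draw)."""
    return m.tdp_w / m.cores


for m in MODELS:
    print(f"{m.name}: {usd_per_core(m):.2f} USD/core, "
          f"{watts_per_core(m):.2f} W/core, all-core {m.all_core_ghz} GHz")
```

TDP is a rated power budget rather than measured consumption, so the per-core wattage only indicates how much headroom each model is given.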

You can see all the models in these charts:

Models and Parameters of 5th Generation Intel Xeon Scalable Processors (Emerald Rapids). Author: Intel, via: Tom’s Hardware

Sources: Intel, ServeTheHome, Tom’s Hardware

English translation and edit by Jozef Dudáš

