GDDR7 memory for next-gen GPUs is ready, up to 48 GHz clocks

A new generation of graphics memory is ready

Nvidia’s new generation of graphics cards, the GeForce RTX 5000 series, as well as upcoming cards from AMD (Radeon RX 8000) and Intel, are expected to use the new GDDR7 graphics memory technology to deliver higher bandwidth than both GDDR6 and GDDR6X. This technology has now been finalized by the JEDEC consortium and allows effective speeds of up to 48 GHz, so there will be significant increases in bandwidth, up to 2–3× compared to current GPUs.

The GDDR7 specification was published today in final form as the “JESD239 Graphics Double Data Rate (GDDR7) SGRAM” standard. As expected, Nvidia and AMD have expressed support for the technology, so both companies should use this memory in the future (and Intel, as well as the various aspiring Chinese GPU manufacturers, will probably adopt it too).

The memory itself will be manufactured by Samsung, Hynix and Micron, all the traditional suppliers of graphics memory. They have been working on its development since before the specification was finalized and released, so chip availability for graphics card manufacturers should not be far off.

According to JEDEC, the GDDR7 standard allows for memory bandwidths of up to 192 GB/s per chip. Each chip should keep a 32-bit data width, which means that GDDR7 should eventually run at effective speeds of up to 48 Gb/s per bus bit, or, if you prefer, at an effective clock speed of 48 GHz. That is double the maximum speed of GDDR6 and GDDR6X (24 GHz effective). Most of today’s graphics cards with GDDR6 memory, however, run it in the 16.0–18.0 GHz band, so compared to them, memory performance can be up to three times higher at the same bus width. These maximum values will probably only be reached over time, though, so the full gain may only show up when comparing GPUs two or three generations apart.
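For a quick sanity check of the per-chip figure, the arithmetic is simple (a back-of-the-envelope sketch, not anything taken from the JEDEC document itself):

```python
# Per-chip bandwidth: 32 data pins, each at an effective 48 Gb/s.
pin_rate_gbps = 48                      # effective data rate per pin (Gb/s)
chip_width_bits = 32                    # data width of one GDDR7 chip
per_chip_gbps = pin_rate_gbps * chip_width_bits
print(per_chip_gbps / 8)                # 1536 Gb/s -> 192.0 GB/s per chip
```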

To give you an idea, at that effective speed of 48 GHz you could achieve 384 GB/s of bandwidth (which is what the Radeon RX 6700 XT has) on a graphics card with the narrow 64-bit bus typical of extreme low-end GPUs like the Radeon RX 6500 XT. A cheap mainstream graphics card with 128-bit memory would already have a bandwidth of 768 GB/s (more than the GeForce RTX 3080). High-performance GPUs with 256-bit and 384-bit buses would then achieve bandwidths of 1536 GB/s and 2304 GB/s respectively, figures that are in the realm of fantasy at this moment, basically HBM2/HBM2e territory.

But as mentioned, slower chips will probably be used initially, perhaps with an effective speed of 32 GHz in the first wave of implementations. This would yield 256 GB/s, 512 GB/s, 1024 GB/s and 1536 GB/s with the previously mentioned bus widths.
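The same arithmetic applied at the card level reproduces the figures above; the bus widths and the 32 versus 48 Gb/s rates are just the illustrative values from the text, not announced products:

```python
# Card-level bandwidth = effective per-pin rate (Gb/s) * bus width (bits) / 8.
for bus_width in (64, 128, 256, 384):               # bits
    for pin_rate in (32, 48):                       # Gb/s, first wave vs. maximum
        bandwidth_gb_s = pin_rate * bus_width / 8   # GB/s
        print(f"{bus_width}-bit @ {pin_rate} Gb/s: {bandwidth_gb_s:.0f} GB/s")
```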

PAM3 instead of the PAM4 used by GDDR6X memory

The secret behind the bandwidth increase in GDDR7 is the use of PAM3 signalling based on pulse amplitude modulation. It is conceptually similar to multi-bit (MLC/TLC) recording in NAND memory. This means that one tick of the electrical signal does not distinguish only two voltage values (0 and 1), but multiple voltage levels, in this case three – hence PAM3. This allows one tick to carry 1.5 bits instead of one bit, because two consecutive ticks give 3 × 3 = 9 states, enough to encode three bits together.
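A minimal sketch of the idea, assuming a naive bit-to-symbol mapping (the actual mapping defined by JESD239 is more involved and is not reproduced here):

```python
# Two 3-level (PAM3) symbols give 3 * 3 = 9 states, enough to carry
# 3 bits (8 values) per symbol pair, i.e. 1.5 bits per signalling tick.
PAM3_LEVELS = (-1, 0, +1)                       # three voltage levels per tick

def encode_pam3(data: bytes) -> list:
    """Map every 3 input bits to a pair of 3-level symbols (illustrative only)."""
    bits = []
    for byte in data:
        bits.extend((byte >> i) & 1 for i in range(7, -1, -1))
    while len(bits) % 3:                        # pad to a multiple of 3 bits
        bits.append(0)
    symbols = []
    for i in range(0, len(bits), 3):
        value = (bits[i] << 2) | (bits[i + 1] << 1) | bits[i + 2]   # 0..7
        symbols.append(PAM3_LEVELS[value // 3])                     # first tick
        symbols.append(PAM3_LEVELS[value % 3])                      # second tick
    return symbols

# One byte takes 8 ticks on a two-level (NRZ) link, but only 6 ticks here.
print(encode_pam3(b"\xa5"))                     # -> 6 symbols for 8 data bits
```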

GDDR6X memory from Micron (source: Micron)

Interestingly, Micron’s GDDR6X uses PAM4 encoding, where there are four voltage levels and one tick of the signal transmits two bits of information on its own. However, GDDR6X memory had relatively high power consumption and, notably, the technology did not achieve a doubling of bandwidth (i.e., for example, an effective speed of 32 GHz versus the 16 GHz GDDR6 memory in use at the time). Micron and Nvidia only reached sub-24 GHz effective clock speeds with GDDR6X in practice. That was only 50% more than the older GDDR6, and newer GDDR6 chips are supposed to reach those same speeds anyway.

Perhaps PAM4 is still too demanding for this application and the simpler PAM3 can ironically give better results due to higher achieved clock speeds.

More: Ampere GPUs have new PAM4-based GDDR6X memory, we have details

In addition to this more powerful encoding and higher clock speeds, GDDR7 memory also promises improvements to the link training performed during system (GPU) initialization. This should improve the GPU’s ability to run the memory at high clock speeds, and at the same time the training (and thus GPU startup) should be faster.

GDDR7 should also apparently have more subchannels (four instead of two), which should improve how much of the theoretical bandwidth is actually achieved in practice, i.e. the utilization of the theoretical memory speed. This is probably analogous to how DDR5 memory modules with a 64-bit width are internally divided into two independent 32-bit channels.

In terms of capacities, the standard provides for chip sizes of 16 Gb to 32 Gb, i.e. 2–4 GB. Smaller 1 GB chips will no longer exist, so 256-bit GPUs will have a minimum of 16 GB of memory, up to a maximum of 32 GB. According to the memory manufacturers’ roadmaps, there will probably be a middle ground in the form of 24 Gb / 3 GB chips, which would allow, for example, 24 GB of memory for a 256-bit GPU (or 12 GB capacity for a 128-bit GPU).
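Translated into per-card capacities (illustrative arithmetic only, assuming one chip per 32 bits of bus width):

```python
# Card capacity = (bus width / 32) chips * die density (Gb) / 8 bits per byte.
for bus_width in (128, 256, 384):               # bits
    chips = bus_width // 32
    for die_gbit in (16, 24, 32):               # densities allowed by the standard
        print(f"{bus_width}-bit: {chips} x {die_gbit} Gb = {chips * die_gbit // 8} GB")
```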

Tip: More RAM and memory for graphics cards: Micron plans bigger GDDR7 and DDR5 chips

However, the clamshell configuration, in which twice the number of chips are used (two sharing a single 32-bit interface), will continue to be supported, which allows capacities to be doubled again. Conventional graphics cards with GDDR7 memory could thus reach memory capacities of 64 and 96 GB, although this will probably only be used by professional cards, not gaming ones – for price reasons.

ECC

GDDR7 memory is apparently intended to support ECC directly at the die level. There will thus be no need for redundant chips for ECC: each chip will carry its own parity or ECC information and will be able to detect and, to some extent, correct errors. This is already the case with DDR5, but there an important limitation or flaw applies: information about ECC events (detected, corrected and uncorrected errors) is not passed to the memory controller in the CPU, so the main advantage of ECC is essentially lost. In fact, there are fears that manufacturers will simply exploit on-die ECC to push clock speeds to levels where the dies actually suffer errors that the ECC then quietly hides (or to sell marginal or defective chips that rely on ECC to work at all). By keeping the user uninformed about ECC-corrected errors, they can afford to do this, whereas with DDR4 the clock speeds have to be conservative enough that no errors occur. And if uncorrectable errors do occur with DDR5 (because the clock speed was pushed closer to the edge), the memory controller in the CPU doesn’t know about them and you are in the same situation as with DDR4 without on-die ECC.
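To illustrate what “detect and to some extent correct” means, here is a toy single-error-correcting code (Hamming(7,4)) in Python. The actual on-die ECC codes and word sizes used by DDR5 and GDDR7 are not public, so this is purely conceptual, not the real scheme:

```python
# Toy single-error-correcting ECC (Hamming(7,4)): 4 data bits protected by
# 3 parity bits. Real on-die ECC uses much wider words, but the principle
# (parity stored alongside the data, errors corrected on read) is the same.

def hamming74_encode(nibble: int) -> int:
    """Encode 4 data bits into a 7-bit codeword."""
    d = [(nibble >> i) & 1 for i in range(4)]       # data bits d0..d3
    p0 = d[0] ^ d[1] ^ d[3]
    p1 = d[0] ^ d[2] ^ d[3]
    p2 = d[1] ^ d[2] ^ d[3]
    bits = [p0, p1, d[0], p2, d[1], d[2], d[3]]     # codeword positions 1..7
    return sum(b << i for i, b in enumerate(bits))

def hamming74_decode(code: int) -> int:
    """Correct a single flipped bit (if any) and return the 4 data bits."""
    bits = [(code >> i) & 1 for i in range(7)]
    s0 = bits[0] ^ bits[2] ^ bits[4] ^ bits[6]      # recomputed parity checks
    s1 = bits[1] ^ bits[2] ^ bits[5] ^ bits[6]
    s2 = bits[3] ^ bits[4] ^ bits[5] ^ bits[6]
    syndrome = s0 | (s1 << 1) | (s2 << 2)           # points at the flipped position
    if syndrome:
        bits[syndrome - 1] ^= 1                     # correct the single-bit error
    return bits[2] | (bits[4] << 1) | (bits[5] << 2) | (bits[6] << 3)

word = 0b1011
damaged = hamming74_encode(word) ^ (1 << 5)         # flip one bit "in the die"
assert hamming74_decode(damaged) == word            # read back corrected data
```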

With GDDR7, information about memory errors should be passed to the memory controller, so the whole system could perhaps match classic ECC memory and its reliability benefits. Data poisoning, error detection and scrubbing technologies should also be supported, although it is possible that not all chips will offer all of these RAS features (some may be merely optional).

Room for GPU gaming performance growth for years to come

In any case, the introduction of GDDR7 memory with its relatively high target clock speeds means the way is open for steady increases in GPU gaming performance over the next several years. Gaming performance is heavily limited by GPU memory bandwidth, and higher resolutions such as 4K, 5K and 8K in particular need a lot of it. Simply adding compute units and raising clock speeds (and thus theoretical raw performance in “TFLOPS”) is therefore not enough to increase GPU performance; it has to be matched by adequate improvements in memory bandwidth.

With the release of GDDR7, that second part should be provided, so it will just be up to GPU manufacturers to keep increasing the raw compute performance (and the performance of raytracing units and other fixed-function graphics units in GPUs).

Source: JEDEC (via: TechPowerUp)

English translation and edit by Jozef Dudáš

