The secret weapon of Ampere GPUs: new GDDR6X memory technology
Next-gen Nvidia GeForce GPUs are almost here, with the reveal set for September 1st, and we already have interesting new information about them, including some specs for the RTX 3080 and RTX 3090. More importantly, it turns out that Ampere introduces a completely new memory technology. Micron has confirmed that it is developing special GDDR6X chips for Nvidia that will increase clocks and bandwidth by up to 50% over GDDR6. So what do we know so far?
There have been leaks mentioning GDDR6X memory for some time. At first they looked like far-fetched speculation, because it seemed too early for a new technology: GDDR6 only dates from 2017/2018, and we hadn’t heard a single word about a new standard. Last week, however, Micron confirmed that GDDR6X does in fact exist. It is not a new standard, though, but a special (proprietary) technology that the company developed as a solo project in close cooperation with Nvidia.
GDDR5X in the Pascal graphics cards was a similar case, but this time the preparations stayed secret for much longer. Nvidia may not have wanted to disclose the technology until Ampere was revealed, but Micron has now accidentally published a document that reveals GDDR6X and even shows the configuration in which it will ship on the GeForce RTX 3090. The document, which has since been taken down again, was spotted by the VideoCardz website.
GDDR6X: 19.0 to 21.0 GHz
GDDR6X is still based on the GDDR concept, so it is nothing like HBM: the memory still comes as individual chips in BGA packages, mounted on the graphics card PCB around the GPU as usual. The 32-bit channel width is preserved. However, as already noted, bandwidth will increase by up to 50%. Today, GDDR6 memory typically runs at an effective 14.0 GHz; the chips are capable of 16.0 GHz, but no GPU has managed that yet. According to Micron, GDDR6X will offer effective clock speeds of 19.0 to 21.0 GHz, roughly 36% to 50% more than 14.0 GHz, so bandwidth should increase by the same proportion.
For example, a GPU with a 384-bit bus and 16.0 GHz GDDR6 would have a bandwidth of 768 GB/s. With today’s real-world 14.0 GHz GDDR6, however, we only get 672 GB/s. On the same 384-bit bus, 19.0 GHz GDDR6X will provide 912 GB/s, and an effective clock speed of 21.0 GHz yields 1008 GB/s.
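The arithmetic behind these figures can be sketched in a few lines of Python. The function name and its parameters are our own illustration rather than anything from Micron’s document; the only assumption is that peak bandwidth equals the bus width multiplied by the effective per-pin data rate, divided by eight to convert bits to bytes.

```python
def bandwidth_gb_s(bus_width_bits: int, effective_ghz: float) -> float:
    """Peak memory bandwidth in GB/s.

    Assumes every pin of the bus transfers `effective_ghz` gigabits per
    second, i.e. the "effective clock" quoted in the article.
    """
    return bus_width_bits * effective_ghz / 8  # 8 bits per byte


if __name__ == "__main__":
    # The four cases discussed in the text, all on a 384-bit bus.
    for label, speed in [("GDDR6", 16.0), ("GDDR6", 14.0),
                         ("GDDR6X", 19.0), ("GDDR6X", 21.0)]:
        print(f"{label} at {speed} GHz: {bandwidth_gb_s(384, speed):.0f} GB/s")
```

Plugging in other bus widths (256-bit, 320-bit and so on) gives a quick estimate for other hypothetical configurations.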
Micron’s document contains a table with examples of various GPUs. According to it, the GA102 chip in the Nvidia GeForce RTX 3090, the most powerful Ampere SKU, will probably have the following configuration: 12 GDDR6X chips on a 384-bit bus, an effective clock speed of 19 to 21 GHz and a total capacity of 12 GB (this capacity may not be final, though; more about that on the next page).
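As a quick sanity check, the leaked configuration is consistent with the 32-bit channel width per chip and with 1 GB chips. The helper below is a hypothetical sketch of that check, not a description of how Micron or Nvidia specify their products.

```python
CHANNEL_WIDTH_BITS = 32  # interface width per chip, as stated in the article


def memory_config(chips: int, gb_per_chip: int = 1) -> tuple[int, int]:
    """Return (bus width in bits, total capacity in GB) for a set of chips.

    Assumes each chip sits on its own 32-bit slice of the bus.
    """
    return chips * CHANNEL_WIDTH_BITS, chips * gb_per_chip


bus_width, capacity = memory_config(12)
print(f"{bus_width}-bit bus, {capacity} GB")
```

With twelve 1 GB chips this gives a 384-bit bus and 12 GB, matching the table.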
Signal encoding is the key
How did Micron manage to increase the effective frequencies this much? The company used a technique that is already common in communication technologies (for example, in 200/400/800 Gb/s Ethernet) and that will also be employed in PCI Express 6.0: pulse amplitude modulation signalling, or more precisely its PAM4 variant.
Until now, all memory technologies have used classic NRZ (non-return-to-zero) encoding on their transmission interface. With NRZ, the interface switches between two signal levels that represent zero and one, so one bit is transmitted per unit of time.
Pulse amplitude modulation encodes several possible values in the signal amplitude. In the case of PAM4 there are four, let’s say voltages of 1.25, 2.50, 3.75 and 5.0 V for example. The control unit can distinguish these levels, and because four different values correspond to two bits, PAM4 transmits exactly two bits per unit of time, which means double the data traffic. You can clearly see this in the following graph comparing PAM4 and NRZ encoding.
It is actually very similar to MLC recording in NAND flash memory (where a cell distinguishes four voltage levels and thus stores two bits) compared to SLC recording (two levels = 1 bit).
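The difference between the two encodings can also be shown with a toy model. The sketch below reuses the example voltages from the text; the assignment of bit pairs to levels and the NRZ voltages are assumptions made purely for illustration and say nothing about the levels or mapping GDDR6X really uses.

```python
# Illustrative levels in volts, taken from the example in the text.
# Which bit pair maps to which level is an assumption of this sketch.
PAM4_LEVELS = {(0, 0): 1.25, (0, 1): 2.50, (1, 0): 3.75, (1, 1): 5.0}
NRZ_LEVELS = {0: 0.0, 1: 5.0}  # hypothetical two-level signal


def encode_nrz(bits):
    """NRZ: one symbol (signal level) per bit."""
    return [NRZ_LEVELS[b] for b in bits]


def encode_pam4(bits):
    """PAM4: one symbol per pair of bits; needs an even number of bits."""
    if len(bits) % 2:
        raise ValueError("PAM4 encodes bits in pairs")
    return [PAM4_LEVELS[(bits[i], bits[i + 1])]
            for i in range(0, len(bits), 2)]


def decode_pam4(symbols):
    """Map each of the four levels back to its two bits."""
    reverse = {level: pair for pair, level in PAM4_LEVELS.items()}
    bits = []
    for level in symbols:
        bits.extend(reverse[level])
    return bits


data = [1, 0, 0, 1, 1, 1, 0, 0]
nrz = encode_nrz(data)
pam4 = encode_pam4(data)
print(len(nrz), "NRZ symbols vs", len(pam4), "PAM4 symbols")
```

The same eight bits take eight time slots with NRZ but only four with PAM4, which is where the doubling comes from.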
Thanks to PAM4, GDDR6X transmits twice as much data as GDDR6 at the same real frequency, so its effective frequency is doubled. For this reason, its real frequency can be lower in practice. For example, while GDDR6 needs a 2000 MHz real clock for a 16.0 GHz effective clock speed, GDDR6X only needs a 1000 MHz real clock. Memory with a 19.0 GHz effective speed should have a real clock of 1187.5 MHz, and 21.0 GHz memory would run at only 1312.5 MHz. This means that the power required for a given throughput will be slightly lower with GDDR6X than with GDDR6, although the difference will not be significant. In practice, overall power consumption will probably increase, because the rise in bandwidth will be greater than the gain in efficiency.
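The clock figures above follow from a simple ratio. In the sketch below, the number of bits per pin per real clock cycle (8 for GDDR6, 16 for GDDR6X) is inferred from the numbers in this article, not taken from a specification, and the function name is our own.

```python
# Bits transferred per pin per real clock cycle, as implied by the
# article's figures (2000 MHz real -> 16.0 GHz effective for GDDR6,
# and half that real clock for GDDR6X thanks to PAM4).
BITS_PER_CLOCK = {"GDDR6": 8, "GDDR6X": 16}


def real_clock_mhz(memory_type: str, effective_ghz: float) -> float:
    """Real clock in MHz needed for a given effective clock in GHz."""
    return effective_ghz * 1000 / BITS_PER_CLOCK[memory_type]


if __name__ == "__main__":
    for memory_type, speed in [("GDDR6", 16.0), ("GDDR6X", 16.0),
                               ("GDDR6X", 19.0), ("GDDR6X", 21.0)]:
        print(f"{memory_type} at {speed} GHz effective: "
              f"{real_clock_mhz(memory_type, speed)} MHz real")
```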
The use of PAM4 probably complicates the technology quite a bit, and the PHYs will not be easy to design, but the lower real clocks probably make the work easier. Further growth should therefore be possible in the future. Micron says it can introduce GDDR6X memory at 24.0 GHz next year; with a 384-bit bus, this would achieve 1152 GB/s. GDDR6X chips with a capacity of 2 GB each are also due next year, while only 1 GB chips will be available this year.
Exclusively for Nvidia graphics cards?
GDDR6X is Micron’s solo project, so the memory will only be available from Micron and not from Hynix or Samsung. Because it was co-developed with Nvidia, it is also possible that GDDR6X will only ship on Nvidia graphics cards and that Radeons will not be able to use it, either because Nvidia has secured exclusive rights to the technology or because AMD wasn’t aware of this memory in advance.
In the coming years, however, a similar technology with PAM4 encoding could be standardized as GDDR7. The development of such memory has not been reported yet, though, and because standardization takes a while, its arrival is probably still a few years away.