Ampere GPU: new PAM4-based GDDR6X memory & more details

Jan Olšan

4 years ago

The secret weapon of Ampere GPUs: new GDDR6X memory technology

Next-gen Nvidia GeForce GPUs are almost here, with the reveal set for September 1st, and we already have interesting new information on them, such as some specs for the RTX 3080 and RTX 3090. More importantly, it turns out Ampere introduces a completely new memory. Micron has confirmed it is developing special GDDR6X chips for Nvidia that will increase clocks and bandwidth by up to 50% over GDDR6. So what do we know so far?

There have been leaks mentioning GDDR6X memory for some time. Earlier they looked like implausible speculation, because it seemed too early for a new technology: GDDR6 only dates from 2017/2018, and we hadn’t heard a single word about a new standard. Last week, however, Micron confirmed that GDDR6X does in fact exist. It is not a new standard, though, but a proprietary technology that the company developed as a solo project in close cooperation with Nvidia.

The GDDR5X in the Pascal graphics cards was a similar case, but the preparations have remained secret for much longer this time. Nvidia may not have wanted to reveal the secret until Ampere was announced, but Micron has now accidentally published a document that describes the GDDR6X technology and even shows the configuration in which it will ship on the GeForce RTX 3090 card. The document, which has since been removed from the website, was spotted by VideoCardz.

GDDR6X: 19.0 to 21.0 GHz

GDDR6X is still based on the GDDR concept, so it is nothing like HBM: the memory still comes as individual chips in BGA packages, mounted on the graphics card’s PCB around the GPU as usual. The 32-bit channel width is preserved. However, as already noted, the bandwidth will increase by up to 50%. Today, GDDR6 memory typically runs at an effective 14.0 GHz; the chips are capable of 16.0 GHz, but no GPU has managed that yet. According to Micron, GDDR6X will offer effective clock speeds of 19.0 to 21.0 GHz, an increase of up to 50% over 14 GHz, so the bandwidth should grow by the same amount.

So, for example, if a GPU has a 384-bit bus, then with 16.0GHz GDDR6, it would have a bandwidth of 768 GB/s. However, with today’s real 14.0GHz GDDR6, we only get 672 GB/s. On the 384-bit bus, the 19.0GHz GDDR6X will already provide 912 GB/s, and at an effective clock speed of 21.0 GHz, 1008 GB/s is achieved.
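The arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration of the usual peak-bandwidth formula (bus width in bits times effective clock in GHz, divided by 8 bits per byte); the function name is our own and not taken from Micron's document.

```python
def memory_bandwidth_gb_s(bus_width_bits: int, effective_clock_ghz: float) -> float:
    """Peak bandwidth in GB/s for a given bus width and effective clock.

    Each bus line moves one bit per effective clock cycle, so the result is
    bits per second across the whole bus divided by 8 bits per byte.
    """
    return bus_width_bits * effective_clock_ghz / 8


# The four cases discussed in the text, all on a 384-bit bus.
for clock in (14.0, 16.0, 19.0, 21.0):
    bandwidth = memory_bandwidth_gb_s(384, clock)
    print(f"384-bit @ {clock:.1f} GHz: {bandwidth:.0f} GB/s")
```

The same function can be reused for any other bus width and effective clock mentioned in the leaks.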

Micron has a table in the document with examples of various GPUs, according to which the GA102 chip in the Nvidia GeForce RTX 3090 graphics card, the most powerful Ampere SKU, will probably have the following configuration. It will allegedly carry 12 GDDR6X chips with a 384-bit bus, a clock speed of 19 to 21 GHz and a total capacity of 12 GB (but this capacity may not be final, more about that on the next page).

This table from Micron shows the specs of GeForce RTX 3090 with 12 GB GDDR6X memory (Source: Micron)

Signal encoding is the key

How did Micron manage to increase the effective frequencies this much? The company used a technique that is already employed in communication technologies (for example, 200/400/800 Gb/s Ethernet) and that will also be used in PCI Express 6.0: pulse amplitude modulation (PAM) signalling, specifically its PAM4 variant.

Until now, all memory technologies have used classic NRZ (non-return-to-zero) coding on their transmission interface. With NRZ encoding, the interface switches between two signal levels that represent zero or one, so one bit is transmitted per unit of time.

Pulse amplitude modulation encodes several possible values in the signal amplitude. In the case of PAM4 there are four, let’s say voltages of 1.25, 2.50, 3.75 and 5.0 V for example. The receiver can distinguish these levels, and because four levels cover every combination of two bits, PAM4 transmits exactly two bits per unit of time, double the data of NRZ. You can clearly see this in the following graph comparing the PAM4 and NRZ encoding.

The NRZ and PAM4 encoding, which has twice the data density at the same real frequency (Source: Intel)
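To make the difference concrete, here is a minimal Python sketch of both encodings. It reuses the example voltages from the text purely for illustration; they are not actual GDDR6X signalling voltages. The plain binary mapping of bit pairs to levels, the NRZ levels and all function names are our own assumptions.

```python
# Illustrative levels only (taken from the example in the text).
# Assumption: bit pairs map to levels in plain binary order.
PAM4_LEVELS = {(0, 0): 1.25, (0, 1): 2.50, (1, 0): 3.75, (1, 1): 5.00}
NRZ_LEVELS = {0: 0.0, 1: 5.0}  # assumed low/high levels for NRZ


def encode_nrz(bits):
    """One symbol (voltage level) per bit."""
    return [NRZ_LEVELS[b] for b in bits]


def encode_pam4(bits):
    """One symbol per pair of bits, so half as many symbols as NRZ."""
    if len(bits) % 2:
        raise ValueError("PAM4 needs an even number of bits")
    return [PAM4_LEVELS[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]


def decode_pam4(symbols):
    """Map each received level back to its two bits."""
    lookup = {level: pair for pair, level in PAM4_LEVELS.items()}
    bits = []
    for symbol in symbols:
        bits.extend(lookup[symbol])
    return bits


data = [1, 0, 1, 1, 0, 0, 0, 1]
print(len(encode_nrz(data)), "NRZ symbols:", encode_nrz(data))
print(len(encode_pam4(data)), "PAM4 symbols:", encode_pam4(data))
```

For the same eight bits, NRZ needs eight symbols while PAM4 needs four, which is where the doubled data rate at an unchanged symbol rate comes from.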

It is actually very similar to MLC recording in the NAND Flash memory (where the cell distinguishes four voltage levels and thus stores two bits) compared to SLC recording (two levels = 1 bit).

Thanks to PAM4, GDDR6X memory will transmit twice as much data as GDDR6 at the same real frequency, so the effective frequency is doubled. For this reason, the real frequency may be lower in practice. For example, while you need a 2000 MHz real clock for a 16.0 GHz effective clock speed with GDDR6, GDDR6X only needs a 1000 MHz real clock. Memory at 19.0 GHz effective should have a real clock of 1187.5 MHz, and 21.0 GHz memory would run at just 1312.5 MHz. This means that the power consumption required for a given throughput will be slightly better with GDDR6X than with GDDR6, although the difference will not be significant. In practice, overall power consumption will probably increase, because the rise in bandwidth will be greater than the gain in efficiency.
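The clock figures above follow from a simple ratio, sketched below in Python. The divisors are inferred from the article's own numbers (16.0 GHz effective at 2000 MHz real implies 8 transfers per real clock for GDDR6, and PAM4 doubles that to 16 for GDDR6X); the table and function names are hypothetical.

```python
# Effective transfers per real clock cycle, as implied by the figures in the text.
TRANSFERS_PER_CLOCK = {"GDDR6": 8, "GDDR6X": 16}


def real_clock_mhz(effective_clock_ghz: float, memory_type: str) -> float:
    """Real clock in MHz needed to reach a given effective clock."""
    return effective_clock_ghz * 1000 / TRANSFERS_PER_CLOCK[memory_type]


for memory_type, clock in (("GDDR6", 16.0), ("GDDR6X", 16.0),
                           ("GDDR6X", 19.0), ("GDDR6X", 21.0)):
    print(f"{memory_type} @ {clock:.1f} GHz effective: "
          f"{real_clock_mhz(clock, memory_type)} MHz real")
```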

The GDDR6X has a slightly better energy efficiency at the same throughput as the GDDR6 (Source: Micron)

The use of PAM4 probably complicates the technology quite a bit and the PHYs will not be easy to design, but the lower real clocks probably make the work easier. Further growth should therefore be possible in the future. Micron says it can introduce GDDR6X memory at 24.0 GHz next year; with a 384-bit bus, this would achieve 1152 GB/s. GDDR6X chips with a capacity of 2 GB each are also to be produced next year, while only 1GB chips will be available this year.

Exclusively for Nvidia graphics cards?

GDDR6X is Micron’s solo project, so the memory will only be available from them and not from Hynix or Samsung. Because it was co-developed with Nvidia, it is also possible that GDDR6X will only ship in Nvidia’s graphics cards while Radeons will not be able to use it, either because Nvidia has secured exclusive rights to the technology or because AMD wasn’t aware of this memory in advance.

In the coming years, however, a similar technology with PAM4 encoding could be standardized as GDDR7. The development of such memory has not been reported yet, and because standardization takes a while, its arrival is probably still a few years away.

Thanks to the leak of the document from Micron, GDDR6X can probably be considered confirmed for Ampere graphics cards. Not much is known about the other parameters yet, at least not from sufficiently reliable sources. Recently, however, preliminary parameters for the top two SKUs have surfaced, although they are still incomplete.

Rumoured GeForce RTX 3090 specs: 24 GB memory

According to leaker _rogame, the most powerful Ampere SKU should probably be the GeForce RTX 3090. It reportedly has 24 GB of GDDR6X memory, not 12 GB as Micron says (but maybe both variants will be available). The bus would be 384-bit in either case, however. If the effective memory frequency were 19.0 GHz, we would get a bandwidth of 912 GB/s, and if the memory ran at 21.0 GHz, we would be at 1008 GB/s. It’s possible that the frequency is somewhere in between these values, of course.

According to _rogame, the RTX 3090’s clock is reportedly set at 1410 MHz base and 1740 MHz boost. For now, of course, we have to take this with a grain of salt. Unfortunately, it is still not known how many computing units/shaders the GA102 chip in the GeForce RTX 3090 contains. There was speculation about 5376 or 5248 shaders, but do not take that as fact; for now we simply have to admit that we do not know.

Rumoured GeForce RTX 3080 specs: 10 GB GDDR6X

The same _rogame found possible specs of the GeForce RTX 3080, in this case in the UserBenchmark database. They are incomplete, however. According to the entry, the GPU has a clock limit of 2100 MHz, but this is neither the actual clock you will get when gaming nor what Nvidia calls the boost clock. It’s just the internally set maximum GPU frequency. In practice, the clock will be lower while gaming (and the official Boost Clock will probably be even lower). Again, it is not known how many computing units this GPU has; it is possible that it also uses the GA102 chip, but if so, it would probably be significantly cut down.

What the UserBenchmark entry does show, however, is the memory. According to it, the GeForce RTX 3080 also uses GDDR6X, and on this model it will run at an effective frequency of 19.0 GHz. The database detects a 4750MHz clock, which is a quarter of 19,000 MHz. The capacity of this card is reported as 10 GB, which means ten chips, so the bus should be 320 bits wide. At the specified frequency, the card would have a memory bandwidth of 760 GB/s.
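The chain of deductions in this paragraph can be written out as a short Python sketch. It assumes 1GB chips with one chip per 32-bit channel, as described earlier in the article, and a factor of four between the clock reported by the database and the effective clock; the function names are ours. The chip-count rule would not apply to a 24-chip layout that keeps a 384-bit bus.

```python
CHANNEL_WIDTH_BITS = 32   # per chip, as stated earlier in the article
CHIP_CAPACITY_GB = 1      # only 1GB GDDR6X chips are expected this year


def bus_width_bits(capacity_gb: int) -> int:
    """Bus width, assuming one 1GB chip per 32-bit channel."""
    chip_count = capacity_gb // CHIP_CAPACITY_GB
    return chip_count * CHANNEL_WIDTH_BITS


def effective_clock_ghz(reported_clock_mhz: float, multiplier: int = 4) -> float:
    """Effective clock from the value the database reports (assumed 4x factor)."""
    return reported_clock_mhz * multiplier / 1000


bus = bus_width_bits(10)                # ten chips -> 320 bits
clock = effective_clock_ghz(4750)       # 4750 MHz -> 19.0 GHz effective
bandwidth_gb_s = bus * clock / 8        # bits per second -> bytes per second
print(f"{bus}-bit bus @ {clock} GHz: {bandwidth_gb_s} GB/s")
```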

Leaked PCB: 24 GB of memory. Over 20 supply phases crammed around

It was reported a while ago that Ampere will have relatively high TDPs. This is perhaps supported by photos of a PCB apparently belonging to the GeForce RTX 3090 card, which appeared on Friday. The board seems to have 12 spots for memory chips on its back, so if there are another 12 spots on the front, it would probably be a 24GB card.

At the same time, the PCB shows that the power delivery will be extremely strong, arranged in two rows running the full width of the PCB on both sides of the GPU and the memory (the VRMs are crammed quite close to the memory, so let’s hope the PCB will be well cooled). Judging by the solder points on the back, more than 20 phases are present in total; according to some leaks, 20 of them might be dedicated to powering the GPU.

Leaked PCB of the GeForce RTX 3090 with 24 GB GDDR6X (Source: Bilibili, via: VideoCardz)

The PCB is also equipped with three eight-pin power connectors (not yet with the new twelve-pin connector mentioned in rumours), which would imply a TDP between 300 and 450 watts. The actual figure is not necessarily that high, however; Nvidia could be overprovisioning the power supply to leave some headroom. We also don’t know for sure whether this is really a reference PCB from Nvidia or a custom board from an AIB partner intended for an overclocked card. According to the leaker, the photo might show a card from the Chinese vendor Colorful. If it is a custom design, then the brutal power delivery does not necessarily mean that cheaper designs running at reference frequencies will get the same.

The official unveiling comes right after the end of August

As already mentioned, Nvidia should reveal Ampere on September 1. The company has already officially launched a countdown leading up to this date, at the end of which we can expect the unveiling of the GeForce RTX 3000 graphics cards. The first SKUs will probably not hit stores until a few weeks later, but on that day we should learn about the new features, architectural details and perhaps even prices and full specifications.

English translation and edit by Lukáš Terényi