Ampere deep dive: what’s new in GeForce RTX 3000 architecture

New manufacturing process 8N: Samsung technology enhanced especially for Nvidia

In terms of hardware, September was a green month with the release of the new generation of Nvidia GPUs, GeForce RTX 3000. They are based on the new Ampere architecture. In this article we are going to discuss what’s new compared to Turing: the new SM architecture doubling the number of shaders, the manufacturing process and the characteristics of the two chips that have been unveiled so far.

Everything is new: manufacturing node, architecture, memory and connectivity

Ampere is a new generation of GPU in all aspects. It combines three innovations: a new manufacturing node, a new architecture of the blocks and GPUs, but also a new GDDR6X memory technology which differs from predecessors by using more efficient PAM4 signaling. We have already covered GDDR6X here so we’ll refer you to that writeup and skip it in this article.

Manufacturing node: custom modified 8N technology

It has been known for quite some time that Nvidia would start manufacturing GPUs at Samsung instead of TSMC, the most established of the silicon foundries. Nvidia has already used Samsung’s 14nm process (14LPP) for some less powerful Pascal chips (GP107 in GeForce GTX 1050/1050 Ti). So when reports appeared last year that Nvidia would use a Samsung process for next-generation chips, it was first assumed that the arrangement would be similar.

In the end it turned out quite differently. Nvidia actually produces its most powerful gaming GPUs, the GA102 and GA104, at Samsung, and only the compute-oriented GA100 die for servers is manufactured on TSMC’s 7nm process. However, gaming Ampere is unusual not only because it comes from Samsung, but also because it is not 7nm. Nvidia has chosen an older Samsung technology, the so-called 8nm process, which is in fact an improved version of the 10nm node.

Slide where Nvidia announces the use of the 8nm 8N process

The process is called 8N because it is customized specifically for Nvidia and should include various unspecified improvements and modifications. It should therefore be a better technology than the 8LPP process used by some mobile phone SoCs. According to some sources, the modifications made for Nvidia yield up to 10% higher performance (meaning higher achievable frequency) than the original version of the process, which was aimed more at mobile SoCs. However, this process probably achieves lower energy efficiency and transistor density than a real 7nm process, which in Samsung’s case employs EUV lithography. It is likely that the 8N process is also inferior to TSMC’s 7nm node.

Nvidia claims up to a 1.9× increase in energy efficiency in its marketing materials, but this is greatly exaggerated. Actual comparisons of performance and power draw for the GeForce RTX 3080/3090 cards reveal a much smaller efficiency improvement in gaming (approx. 1.1–1.3× depending on the circumstances). Note that if Nvidia set the Ampere GPUs to the same TDPs as the Turing chips, the efficiency would probably be much better (but that would result in lower final performance of the cards). In reality, power efficiency depends on how aggressively the chips in question are clocked, so it is not possible to give one exact value here.

Slide where Nvidia claims that Ampere has 1.9× better energy efficiency. However, it seems to be a comparison of an underclocked GA102 with a TU102 at standard clock speed, which is unrealistic in practice
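To make the efficiency arithmetic concrete, here is a minimal sketch of how a performance-per-watt ratio shifts with the chosen power limit. All performance figures and the 250 W baseline are hypothetical values picked for illustration, not measurements; only the 320 W figure corresponds to the RTX 3080 TDP mentioned in this article. The function names are ours.

```python
def perf_per_watt(relative_performance: float, board_power_w: float) -> float:
    """Performance delivered per watt of board power."""
    return relative_performance / board_power_w


def efficiency_gain(new_perf: float, new_power_w: float,
                    old_perf: float, old_power_w: float) -> float:
    """Ratio of the new card's perf/W to the old card's perf/W."""
    return (perf_per_watt(new_perf, new_power_w)
            / perf_per_watt(old_perf, old_power_w))


# Hypothetical baseline: older card normalized to performance 1.0 at 250 W.
old_perf, old_power = 1.00, 250.0

# Hypothetical new card at its stock 320 W limit, assumed 60% faster.
stock = efficiency_gain(1.60, 320.0, old_perf, old_power)

# Same hypothetical card capped to 250 W, assumed to lose some performance.
capped = efficiency_gain(1.40, 250.0, old_perf, old_power)

print(f"stock power limit:  {stock:.2f}x perf/W")
print(f"capped power limit: {capped:.2f}x perf/W")
```

With these made-up inputs the same chip looks noticeably more efficient when capped than at its stock limit, which is why a single headline efficiency figure says little without the operating point it was measured at.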

In any case, the process is just an implementation detail that is not directly important for the user; what matters is how well the resulting chips perform. Despite everything, the 8nm process is still a generational leap from the 12nm technology used in Turing (because the 12nm TSMC process used there is just a tweak of the 16nm technology used in the Pascal GPUs).

More information: Specifications, prices, performance of Nvidia GeForce RTX 3090, 3080 and 3070 graphics cards

It is possible that Nvidia originally expected slightly better energy efficiency from 8N, because the Ampere GPUs have unusually high TDPs (320 W for GeForce RTX 3080, 350 W for RTX 3090). The power draw may have overshot the original projections, but we will probably never know for sure, since Nvidia could just as easily have targeted higher power draw from the beginning.

GA102 GPU in the version used in the GeForce RTX 3080 (Source: TechPowerUp)

The choice of the 8nm process was almost certainly intentional. Suggestions that there was a 7nm shortage for Ampere and that the 8nm process was an “emergency solution” are unlikely to be correct. Nvidia probably used 8nm technology because it offered significantly lower costs per die, even after the bigger die size is taken into account. The choice of the 12nm process for Turing (instead of 10nm) in 2018 was likely a similar case. The fact that Nvidia chose Samsung instead of TSMC probably brought further savings. Samsung, aggressively trying to win clients for its foundry business, might have offered Nvidia a generous discount, while TSMC had enough other clients and no incentive to reduce its margin by pricing similarly low.
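The cost argument can be sketched with the common dies-per-wafer approximation for a 300 mm wafer. Every input below (die areas, wafer prices) is a hypothetical placeholder, since actual foundry pricing is not public, and yield is ignored for simplicity. The sketch only shows how a larger die on a cheaper wafer can still come out cheaper per die.

```python
import math


def dies_per_wafer(die_area_mm2: float, wafer_diameter_mm: float = 300.0) -> int:
    """Common approximation: wafer area / die area minus an edge-loss term."""
    radius = wafer_diameter_mm / 2
    gross = math.pi * radius ** 2 / die_area_mm2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return int(gross - edge_loss)


def cost_per_die(wafer_price: float, die_area_mm2: float) -> float:
    """Wafer price spread over all candidate dies (yield is ignored here)."""
    return wafer_price / dies_per_wafer(die_area_mm2)


# Hypothetical scenario: the same design on two nodes.
# Older node: bigger die, cheaper wafer. Newer node: smaller die, pricier wafer.
older_node = cost_per_die(wafer_price=5000.0, die_area_mm2=600.0)
newer_node = cost_per_die(wafer_price=9000.0, die_area_mm2=450.0)

print(f"older node: {dies_per_wafer(600.0)} dies/wafer, {older_node:.2f} per die")
print(f"newer node: {dies_per_wafer(450.0)} dies/wafer, {newer_node:.2f} per die")
```

Under these assumed prices the older node wins despite fitting fewer dies on a wafer. With a smaller wafer price gap the result would flip, so the conclusion depends entirely on the real, undisclosed pricing.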

In any case, this should mean that Ampere cards are relatively inexpensive to manufacture in terms of GPU dies (but not necessarily in terms of PCBs, VRMs and memory), which will be an important factor in competing with the 7nm Radeons, whose chips are probably more expensive even though their die sizes will be smaller.

