Extreme TDPs for GPUs? Nvidia Hopper is allegedly above 1000W

Compute GPUs for servers up for a very steep rise in power consumption

It seems that the power consumption increase brought by Nvidia’s latest GPUs (GeForce RTX 3000) was not the last one we will see. We have had news of 450W GeForce cards and even 550W future SKUs coming, but the crowning jewel will be the high-performance server GPU codenamed Hopper. Nvidia is supposedly planning a four-digit power consumption exceeding 1000 W, requiring a completely new power supply solution.

This information was shared by the leaker Kopite7kimi, who in the past has had very precise information about the parameters of the then-upcoming Ampere GPUs (and significantly in advance, too), which means he likely has good insider sources. Now he has mentioned two details about the forthcoming Hopper GPU, a future Nvidia architecture designed for compute servers and HPC – and therefore mainly an AI accelerator that might not even have any graphics functionality (at least the RT cores for ray tracing will almost certainly not be included).

Nvidia is seemingly planning on splitting its architectures again. For compute tasks, the Hopper GPU is being prepared, supposedly using chiplets (meaning the package has multiple interconnected silicon dies forming the whole GPU), while the gaming GeForce graphics cards should be based on monolithic chips with the Lovelace architecture. Both should utilise a 5nm TSMC production process and might be introduced in 2022.

Hopper might actually include quite a few dies, because according to Kopite7kimi the power consumption should be “1xxx” watts. So far it is hard to determine how much that is exactly. Theoretically it might be anything in the range from 1000 to 1999 W. With 1000 W already being a huge leap (current A100 accelerators have a 400W TDP), we presume that the Hopper accelerator will not climb up to a 1999 W TDP. If the TDP is greater than 1000 W, it is more likely to be somewhere around 1100 or 1200 W.
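To put the leaked “1xxx” figure in perspective, the short Python sketch below compares a few candidate TDPs with the 400W A100 mentioned above. The candidate values are only the ones discussed in this article (1000, 1100, 1200 and the theoretical ceiling of 1999 W), not confirmed specifications, and the function name is purely illustrative.

```python
# Rough comparison of rumoured Hopper TDP values with the current A100.
# All Hopper numbers are speculative values taken from the article text.

A100_TDP_W = 400  # TDP of the current A100 accelerator, as quoted in the article


def ratio_to_a100(tdp_w: float) -> float:
    """Return how many times higher a given TDP is than the A100's 400 W."""
    return tdp_w / A100_TDP_W


if __name__ == "__main__":
    for tdp in (1000, 1100, 1200, 1999):
        print(f"{tdp} W is {ratio_to_a100(tdp):.2f}x the A100 TDP")
```

Even the lowest value in the leaked range would mean two and a half times the power of today's A100, while 1200 W would mean three times.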

This might look scary, but one should remember that we are dealing with a chiplet design. Until now, Nvidia’s GPUs have been monolithic, which meant that the chip could have a die size of about 800-820 mm² at maximum (due to reticle limits). Despite this, power consumption has reached 350 W (current GeForce RTX 3090) to 400 W (Nvidia A100). It is likely that the Hopper accelerator will use chiplets that are quite a bit larger than, for example, the Zen 3 CPU complex in AMD processors. They could instead have a size approaching that of regular high-performance GPUs, about 400-800 mm². Multiple such chiplets could be integrated into the resulting package.

It therefore might be that Hopper will have an insane power consumption of e.g. 1200 W, but that this would be the power consumption of a device that is equal to e.g. four separate 300 W GPUs of the previous generation. When viewed in this way, the power-guzzling of the design would actually not be that bad. It is just that the compute power and power consumption are being concentrated into a smaller area.

An older Nvidia study proposed a four-chip MCM GPU with coherent interconnects and a NUMA model (Source: Nvidia)

48V power supply in servers

But this kind of integration will place tough requirements on cooling (which will likely have to be liquid-based, if not immersion-based, which has already occasionally been used in HPC) and on power supply. According to Kopite7kimi, Nvidia is changing its power supply system from 12V ATX PSUs or related server technology to 48V PSUs. Hopper will therefore require a new type of power supply unit. But since such GPUs will likely have a special mezzanine design, not the standard form of PCI Express cards, this isn’t actually a huge hurdle, because a completely new server mainboard and chassis design is needed anyway. A 48V power supply is already being prepared for servers (being featured in plans leaked from Intel), so it will not be something that has to be introduced just for Nvidia.

The quadrupled voltage allows components with high power draw to be supplied more efficiently. Theoretically, four times more watts can be transferred with the same current over the same wires, whereas sticking with 12V would require adding significantly more wires.
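The reasoning follows from P = V × I: for a fixed power, quadrupling the voltage cuts the current to a quarter, and resistive losses in the wiring (I² × R) drop to one sixteenth. The Python sketch below illustrates this for a hypothetical 1200 W load. Both the load and the 1 mΩ wiring resistance are assumed values chosen only for illustration, and the function names are not taken from any specification.

```python
# Illustrative comparison of 12V and 48V power delivery for the same load.
# The 1200 W load and the 0.001 ohm wiring resistance are assumptions.


def current_amps(power_w: float, voltage_v: float) -> float:
    """Current needed to deliver power_w at voltage_v (I = P / V)."""
    return power_w / voltage_v


def wiring_loss_watts(power_w: float, voltage_v: float, resistance_ohm: float) -> float:
    """Resistive loss in the wiring for the given load (P_loss = I^2 * R)."""
    current = current_amps(power_w, voltage_v)
    return current * current * resistance_ohm


if __name__ == "__main__":
    load_w = 1200            # hypothetical accelerator power draw
    wire_resistance = 0.001  # hypothetical wiring resistance in ohms

    for volts in (12, 48):
        amps = current_amps(load_w, volts)
        loss = wiring_loss_watts(load_w, volts, wire_resistance)
        print(f"{volts} V: {amps:.0f} A, wiring loss about {loss:.3f} W")
```

With these assumed numbers, the 12V rail has to carry 100 A while the 48V rail carries only 25 A, which is why far fewer or thinner conductors are sufficient at the higher voltage.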

At this point it is worth mentioning that we are talking about GPUs designed for compute use in servers, not graphics cards for gaming PCs. With GeForce RTX 4000 or 5000, we do not need to fear a 1000W power consumption. However, it remains an open possibility that these cards, too, could one day migrate to a completely custom design instead of the currently used PCI Express ×16.

Nvidia is preparing an analogous, though smaller, increase in power consumption for gaming graphics cards as well. A model with a 450 W consumption is said to be coming soon (GeForce RTX 3090 Ti), and the next Lovelace generation might even include 550W cards, although these GPUs should still be monolithic. It is questionable whether Nvidia plans on releasing some very big and powerful chiplet GPU configurations for the gaming sector after that as well (naturally, this would likely require the cards to sell at quite high prices). Such designs might reach even higher power consumption. In that case, manufacturers might want to introduce a 48V power supply as well as some completely new custom form factor and means of connecting such GPUs to a desktop PC’s motherboard. After all, 300 W has long been considered the limit for a PCI Express card’s consumption, and higher numbers have been more or less out of spec. That notion has obviously been abandoned by GPU makers, but at some point the PCI Express card form factor might stop scaling to higher and higher power consumption.

Source: Kopite7kimi

Jan Olšan, editor for Cnews.cz

English translation and edit by Karol Démuth, original text by Jan Olšan, editor for Cnews.cz
