Nvidia unveils DLSS 3.5: Better ray tracing not only for RTX 4000

AI Frame Reconstruction: How does it work?

Nvidia has now announced a new iteration of its DLSS AI upscaling technology, following on from the third generation, DLSS 3, from last year. However, the new DLSS 3.5 is somewhat confusingly named, as it is to some extent more of a continuation of DLSS 2.x – this improvement does not depend on DLSS 3 (also referred to as Frame Generation). That means it also works on older GeForce RTX 2000 and RTX 3000 generation graphics cards.

The new DLSS 3.5 has one central new feature, which Nvidia has named Ray Reconstruction (we’ll see, maybe in the future this name will be used more than the DLSS 3.5 designation). The goal is to improve the image quality of ray tracing, and what it boils down to is replacing the denoisers that are used during ray tracing.

Denoiser usage during ray tracing

As you probably know, to render a scene using ray tracing, you need to analyze a large number of light rays hitting and getting reflected from objects. The problem in games (but basically also in offline rendering of static scenes and movies computed over a much longer period of time than a game frame) is that there is simply not enough performance to compute as many rays as would be needed.

Therefore, only a relatively small number of rays are analysed. The way you can think of it is that instead of a nice final picture, you get not a continuous image, but just individual points forming a kind of noisy picture with gaps between them.

A schematic of traditional raytracing rendering
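To get a feel for why a low ray count produces this kind of noise, here is a minimal, purely illustrative Python sketch (a toy random distribution stands in for real radiance samples; nothing here is scene- or hardware-specific): the fewer rays averaged per pixel, the further the estimate tends to land from the true value.

```python
import numpy as np

rng = np.random.default_rng(0)

def shade_pixel(num_rays):
    """Toy Monte Carlo estimate of one pixel's brightness.

    Each "ray" returns a random radiance sample; the pixel value is the
    average. A real renderer traces rays against scene geometry, but the
    statistics are the same: few samples -> noisy, grainy estimate.
    """
    samples = rng.exponential(scale=0.5, size=num_rays)  # stand-in radiance
    return samples.mean()

true_value = 0.5  # expected value of the stand-in distribution

for rays_per_pixel in (1, 4, 64, 1024):
    estimates = np.array([shade_pixel(rays_per_pixel) for _ in range(1000)])
    error = np.abs(estimates - true_value).mean()
    print(f"{rays_per_pixel:5d} rays/pixel -> average error {error:.3f}")
```

With one ray per pixel the estimate lands all over the place, which across a whole frame shows up as exactly the grainy, gappy picture described above; games can only afford a few rays per pixel, so this is the starting point the denoiser has to work with.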

The real-time game implementation of raytracing via DXR (DirectX Raytracing) has used denoiser filters from the beginning to smooth and fill out this image, suppressing those discontinuities and making the result usable in a game. There are different types of denoisers, and they can use both temporal (processing multiple frames) and spatial (i.e. smoothing only based on data within a single frame) techniques.

How denoisers work in ray tracing
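As a rough illustration of the two approaches (a sketch only, not any particular game’s denoiser): a spatial pass simply averages neighbouring pixels within one frame, while a temporal pass blends the current noisy frame into a history accumulated over previous frames.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def spatial_denoise(noisy_frame: np.ndarray, radius: int = 2) -> np.ndarray:
    """Spatial: average each pixel with its neighbourhood inside one frame."""
    return uniform_filter(noisy_frame, size=2 * radius + 1)

def temporal_denoise(noisy_frame: np.ndarray, history: np.ndarray,
                     alpha: float = 0.1) -> np.ndarray:
    """Temporal: exponential blend of the current frame into the history
    accumulated over past frames. This naive version assumes a static
    camera; real denoisers reproject the history using motion vectors."""
    return alpha * noisy_frame + (1.0 - alpha) * history
```

Both trade noise for something else: the spatial filter smears fine detail, and the temporal one lags and can ghost when the scene moves – the very artifacts Ray Reconstruction is supposed to handle better.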

To be honest, I was actually under the impression from previous presentations that Nvidia was already implementing these denoisers for ray tracing with a neural network or AI, but now the company says that these denoisers are still implemented in games as traditional “hand-designed” algorithms, or sometimes combinations of several such algorithms.

Ray Reconstruction using AI

From that starting point, the DLSS 3.5 or Ray Reconstruction technology does a simple thing. Because this task is one of those for which the “black box” nature of artificial intelligence is well suited, Nvidia has applied it here: DLSS 3.5 provides a special neural network to use at this point in the game’s rendering, replacing the work of the traditional denoisers. The neural network is trained on a corpus of clean and noisy images for this purpose, similar to how it is trained on pairs of original and downscaled images for upscaling. Once trained, it should perform better than traditional denoisers, according to Nvidia.
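Nvidia hasn’t published the model itself, but training on pairs of noisy and clean reference images is standard supervised image-to-image learning. A minimal PyTorch sketch of that idea (the tiny network and the random stand-in data are made up for illustration) could look like this:

```python
import torch
import torch.nn as nn

# Toy stand-in for the denoising network; the real model is not public.
model = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 3, kernel_size=3, padding=1),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Stand-in training pairs: "clean" reference renders and their noisy,
# low-sample-count counterparts (random tensors here, just for shape).
clean = torch.rand(8, 3, 64, 64)
noisy = clean + 0.3 * torch.randn_like(clean)

for step in range(200):
    optimizer.zero_grad()
    predicted = model(noisy)            # the network's guess at the clean image
    loss = loss_fn(predicted, clean)    # penalise distance from the reference
    loss.backward()
    optimizer.step()
```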

This AI de-noising filter is similar in operation to DLSS 2.x – it performs both de-noising and upscaling, but it works on the raytracing lighting image data instead of the final scene’s frames. It uses various data from the game engine to perform enhancement processing on the input rendered frames. According to Nvidia’s presentation, the filter is temporal (it uses and combines data from multiple frames, like 3D denoisers) and uses motion vectors – it puts together several consecutive past frames for temporal filtering, and by doing this it can also restore some detail that would otherwise be lost in the low-resolution processing games use for raytracing effects.

Schematic of the raytracing pipeline with Nvidia DLSS 3.5 (the diagram also includes frame generation, a.k.a. DLSS 3, which is not part of the DLSS 3.5 process)
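The motion-vector part can be sketched roughly like this (an assumption about the general mechanism, not Nvidia’s implementation): each pixel’s motion vector says where that surface point was in the previous frame, so the history buffer is sampled at that offset before being blended with the current frame.

```python
import numpy as np

def reproject_history(history: np.ndarray, motion_vectors: np.ndarray) -> np.ndarray:
    """Fetch each pixel's accumulated value from where that surface point
    was located in the previous frame.

    history:        (H, W) buffer accumulated over past frames
    motion_vectors: (H, W, 2) per-pixel offsets in pixels (dy, dx)
    """
    h, w = history.shape
    ys, xs = np.indices((h, w))
    src_y = np.clip(ys - motion_vectors[..., 0], 0, h - 1).astype(int)
    src_x = np.clip(xs - motion_vectors[..., 1], 0, w - 1).astype(int)
    return history[src_y, src_x]

def temporal_accumulate(current: np.ndarray, history: np.ndarray,
                        motion_vectors: np.ndarray, alpha: float = 0.1) -> np.ndarray:
    """Blend the current noisy frame with the reprojected history."""
    return alpha * current + (1.0 - alpha) * reproject_history(history, motion_vectors)
```

Sub-pixel offsets and disocclusions are ignored here; a real implementation has to interpolate between pixels and reject history that no longer matches the visible surface.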

Integration with DLSS 2.x

An important detail is that this de-noising AI seems to form one common unit with the upscaling AI used for DLSS 2.x – a single model appears to perform both functions. The benefit should be that the AI has more information available to do its work. If the two steps operated separately, the DLSS 2.x upscaling step could end up doing a worse job of upscaling lighting effects and detail, because the denoiser running before it would have deleted (smoothed out) some detail and information from the input. An AI integrated in this way can, however, hold on to such information from earlier steps and still apply it as input to its decision making in later processing stages.
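In other words, instead of two models wired back to back, the combined approach is closer to a single network that sees the noisy lighting and produces the upscaled result directly. A purely hypothetical PyTorch sketch of the difference in data flow (the module names and layers are invented for illustration, not taken from DLSS):

```python
import torch.nn as nn

class SeparateSteps(nn.Module):
    """Denoise first, upscale second: the upscaler only ever sees the
    already-smoothed signal, so detail removed by the denoiser is gone."""
    def __init__(self):
        super().__init__()
        self.denoiser = nn.Conv2d(3, 3, kernel_size=3, padding=1)
        self.upscaler = nn.Sequential(
            nn.Conv2d(3, 12, kernel_size=3, padding=1),
            nn.PixelShuffle(2),  # 12 channels -> 3 channels at 2x resolution
        )

    def forward(self, noisy_lighting):
        return self.upscaler(self.denoiser(noisy_lighting))

class JointModel(nn.Module):
    """One network does both jobs, so every internal layer still has access
    to the raw noisy input (and, in the real thing, extra engine buffers)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 12, kernel_size=3, padding=1),
            nn.PixelShuffle(2),
        )

    def forward(self, noisy_lighting):
        return self.net(noisy_lighting)
```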

Nvidia claims that using this AI within DLSS 3.5 will improve image quality, as the denoiser and its temporal function will be able to preserve some extra detail while preventing some of the artifacts (temporal ghosting, or detail blurring) that current denoisers cause or are unable to prevent.

However, as with other such AI techniques, it should be remembered that we are talking about a rendering technique that works on the principle of approximation, and all these AI techniques are largely about “guessing and making up” visual data from limited and missing image information, so it cannot magically deliver a perfect result. The goal of DLSS 3.5, as with other DLSS iterations, is to achieve a better visual result within given performance constraints. Various artifacts and imperfections can (or rather will) still occur in the output. After all, all versions of DLSS have undergone and continue to undergo evolution, which is precisely about incremental improvement and mitigation of various flaws and artifacts.

DLSS 3.5 does not imply DLSS 3

According to Nvidia, DLSS 3.5 as we have just described it should work on all GeForce graphics cards with tensor cores, i.e. on GeForce RTX 2000, 3000 and 4000. Unlike DLSS 3, it doesn’t require the new dedicated special-purpose units of the Ada Lovelace generation GPUs. Nevertheless, beware that “DLSS 3.5” in this sense doesn’t mean a full replacement (or superset) of DLSS 3, even though the naming convention implies this.

The Frame Generation technology (which inserts interpolated frames between those actually rendered by the game), until now referred to as DLSS 3, will still require the dedicated hardware of GeForce RTX 4000 graphics cards. The announcement of DLSS 3.5 does not mean that you will now get Frame Generation on GeForce RTX 2000 and 3000 generation graphics cards. In that regard, the designation chosen by Nvidia isn’t a very fortunate one.

More: Nvidia DLSS 3 arrives with RTX 4000. The new generation of AI upscaling generates frames and bypasses the CPU limit

In fact, DLSS 3.5 in the sense of Ray Reconstruction does not even require DLSS 3 frame generation to be used at the same time, even though Nvidia’s diagram of how it works lists both of these techniques in the overall DLSS 3.5 “flowchart”. However, Ray Reconstruction does need DLSS 2.x to be active at the same time, in order to be able to do the joint AI denoising and upscaling.

In games this autumn

According to Nvidia, the technology should appear in Cyberpunk 2077, of which the company showed a demo. It should also be seen in the Portal remake with raytracing effects and in Alan Wake 2.

For games, that’s it for now; the other software stated to be getting the feature is related to graphics rendering outside of games – Chaos Vantage, D5 Render and the Nvidia Omniverse framework. These titles should get the feature in the autumn, so we won’t have to wait too long for a chance to test it.

Source: Nvidia

English translation and edit by Jozef Dudáš


