What does GeForce RTX accelerate in DaVinci Resolve Studio 18?

Ľubomír Samák

7 months ago

What is it about?

DaVinci Resolve (Studio) video editing applications are highly optimized for hardware acceleration by GeForce RTX graphics cards. These can dramatically reduce the time some tasks take, turning hours into minutes or, for larger projects, days into hours. We’ll take a look at what exactly this is about in a two-part miniseries dedicated to streamlining work in Blackmagic Design’s video editors with NVIDIA Ada Lovelace GPUs.

Disclaimer: This article was commissioned by NVIDIA and is paid. However, the company did not interfere with its content in any way, and the only requirement was to introduce users to the GPU acceleration capabilities in DaVinci Resolve (Studio) using GeForce RTX graphics cards.

NVIDIA Studio?

Computer configurations that meet certain computing performance requirements are included by NVIDIA in the so-called “Studio” family (NVIDIA Studio). There are a number of things required to obtain such status, including the requirements for the processor used, but that is beside the point for the purposes of this article. For now, suffice it to say that as long as the other requirements meet the minimum criteria, the last piece of the puzzle is currently a graphics card at least at the RTX 3050 level. There is no upper limit and the more powerful the GPU, the more computing performance you will naturally get.

Instead of “gaming” GeForce graphics cards, “professional” models (from A4000 upwards) can also be used. The presence of tensor (AI) cores is always key.

A complete list of graphics cards for “NVIDIA Studio” can be found in a table at nvidia.com, where individual models are categorized by computing performance into three groups – Creative Dabbler, Creative Maestro, and Creative Powerhouse.

Don’t overlook: A wide range of graphics cards with NVIDIA Studio support, as well as pre-built PCs and laptops, is also in stock at the partner of this article, the e-shop smarty.cz/smarty.sk.

Resolve vs. Resolve Studio…

Blackmagic Design offers two variants of DaVinci Resolve: a free one (DaVinci Resolve) and one that comes with a paid license and carries the “Studio” moniker (DaVinci Resolve Studio). Although the basic DaVinci Resolve is already a feature-rich application, even for more advanced video editors, DaVinci Resolve Studio has a few useful extras. These mostly relate to advanced hardware acceleration and to the use of the AI cores of NVIDIA graphics cards.

In addition to supporting a higher output resolution (up to 32K versus 4K in the free Resolve) combined with a higher frame rate (120 fps versus 60 fps, for example), the Studio version adds a number of useful tools built on an engine that uses neural networks. There are other differences between the Studio and “ordinary” builds, but for the purposes of this article we’re mainly interested in the tasks that are noteworthy in terms of acceleration with modern GeForce RTX graphics cards.

… and neural engine in newer versions

New since DaVinci Resolve Studio 18.5 are hardware optimizations for the neural engine. These run on first startup, and while they can be skipped, don’t do so on supported hardware, as that would deprive you of computing performance that is available to you. In the settings (opened with the keyboard shortcut Ctrl + ,), on the GPU and Memory tab, make sure that the graphics card has been detected correctly.

Of the tasks that utilize AI units, we have selected five popular ones (Magic Mask, Smart Reframe, Face Refinement, Optical Flow and Super Scale), which we will introduce in more detail in the following chapters of this article. Besides these features, there are others that can be accelerated by the GPU, and the good news is that the team at Puget Systems is including them in its tests in build 0.98.0 Alpha. This is, at least for now, not publicly available (perhaps because it’s a really new, untweaked thing from late September 2023), but if that changes (and Puget Systems publishes this benchmark), we’ll be happy to include these tests in our standard graphics card testing methodology.

AV1 (and HEVC) encoding

Finally, we’ll take a look at the AV1 encoding capabilities of NVIDIA’s NVENC encoder (4K@30 and 8K@30), which is one of the multimedia innovations for the current generation of GeForce RTX 4000 graphics cards (RTX 3000s with the Ampere architecture don’t yet support AV1 encoding).

For comparison, we’ll later also use HEVC to compare the encoding speeds of the RTX 4060 and RTX 4090 with those of the integrated GPU (UHD 770) of the Intel Core i7-14700K, which lacks AV1 encoding support (Raptor Lake Refresh still only handles AV1 decoding).

Please note: At the end of this article, we discuss the background of the AV1 video format in more detail. If you’re interested, don’t overlook this chapter.

Magic Mask

A tool for quickly selecting the subject in the scene that you want to “mask” (for a color mask, but also for changing the depth of field if necessary). For our purposes, we will work with a short, 28-second clip at a resolution of 4096 × 2160 px and a bitrate of 91.5 Mbps. This sample is freely available for download, so you can easily compare the results that we will publish as part of the tests with the situation on your build.

After importing the project, select the Magic Mask filter in the panel below the preview. Then, in the advanced settings, enable the Person Mask, Add Stroke, Invert Mask and Toggle Mask Overlay options and set Quality to Better, as highlighted (in green) in the screenshot below.

Then, for example, you draw a line on the model’s neck that applies the mask to the background…

… and after pressing the Track bi-directional button in the navigation above the timeline you measure the time it takes to compute all the frames of the video.

Smart Reframe

With Smart Reframe, a suitable video can be reformatted from a traditional (landscape) view to a vertical one. This is typically useful for playback on a smartphone without the need to rotate the display.

The conversion (typically suitable for selected social networks) works by marking a moving object for the camera to focus on in a traditional horizontal video. The rest is removed during processing. We use an 8K video (8192 × 4096 px@24, 89s, 139.8 Mbps) for testing. You can find it at this link.

In the video, we’re going to watch a water bubble. You put the object you want the camera to be interested in into the Smart Reframe selection (its settings will appear by expanding the Inspector option in the upper right corner of the GUI)…

… start the operation by pressing the “Reframe” button.
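
To get a feel for how much of the picture a vertical reframe discards, the crop geometry can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not of how Resolve’s neural engine tracks the subject. The function name, the 9:16 target aspect ratio and the subject position are our own assumptions.

```python
def vertical_crop(src_w, src_h, subject_x, aspect_w=9, aspect_h=16):
    """Return (left, top, width, height) of a full-height crop window.

    The window has the requested aspect ratio (9:16 is an assumption here)
    and is centered on subject_x as far as the frame edges allow.
    """
    crop_h = src_h
    crop_w = round(src_h * aspect_w / aspect_h)
    # Clamp the window so that it never leaves the source frame.
    left = min(max(subject_x - crop_w // 2, 0), src_w - crop_w)
    return left, 0, crop_w, crop_h


# The 8K test video with a (hypothetical) subject in the middle of the frame.
left, top, width, height = vertical_crop(8192, 4096, subject_x=4096)
kept = width / 8192
print(f"crop: {width} x {height} px at x={left}, keeps {kept:.0%} of the width")
```

With the 8192 × 4096 px source, a 9:16 window is 2304 px wide, so roughly 28 % of the original width survives and the rest is removed.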

Face Refinement

Face Refinement is essentially augmented reality make-up. It can modify the face of the person in front of the camera in a variety of ways without the need to stand in front of a mirror for a long time. How authentically, we’ll leave that to the judgement of those who understand it more than we do.

For our purposes, we won’t go into the artistic side of things, and we haven’t edited anything on Ms. Amber, from whose video Fall Sweater Plans and Projects, Vacation Knitting, New Yarn and Plans To Use It on the YT channel A Lovely Yarn we’ve borrowed the opening 148 seconds.

Computational performance is required for face analysis, based on which the selected filters are then applied. As the face part of the image changes its position, recomputation occurs continuously throughout the recording. This is necessary to ensure that cosmetic adjustments to the face are rendered correctly, without obvious flaws.

The face mapping begins with an instruction to analyze.

With the overlay layer view enabled, you can see exactly how this feature replicates the main features of the face.

Optical Flow

Slowing down video with low fps? It can be done without lowering the smoothness below the usability threshold. That is, as long as the missing frames are computed. For this experiment, we reached for a 4K video at 24 fps – the LG Snowboarding Demo (downloadable here).

After the speed is reduced four times, in this case to 6 fps, the image is naturally jerky. However, this can be eliminated by deploying Optical Flow, which returns the video back to the original 24 fps while keeping the video slowed down as if it were recorded by a high-speed camera. For the best result, which is also the most computationally intensive, we set Motion Estimation to Enhanced Better.

What we’ll be looking at for this test is the frame rate of the live playback.
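
The amount of work this creates is easy to estimate. The short Python sketch below only does the frame arithmetic for the scenario described above (24 fps source, slowed down four times, played back at 24 fps). It does not perform any interpolation itself, and the function names are ours.

```python
from fractions import Fraction


def retime_stats(source_fps, slowdown, output_fps):
    """Return (real frames per second of output, share of synthesized frames)."""
    effective_fps = Fraction(source_fps) / slowdown
    synthesized_share = 1 - effective_fps / output_fps
    return effective_fps, synthesized_share


def source_positions(n_output_frames, slowdown):
    """Position of each output frame on the source timeline, in source frames.

    Whole numbers are original frames, fractions have to be computed.
    """
    return [Fraction(i, slowdown) for i in range(n_output_frames)]


effective, synthesized = retime_stats(24, 4, 24)
print(f"real frames per second: {effective}, synthesized share: {synthesized}")
print([str(p) for p in source_positions(8, 4)])
```

In this test, only every fourth displayed frame comes from the camera. The other three quarters have to be computed, which is why the playback frame rate is a meaningful measure of GPU performance here.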

Super Scale

The image quality of the videos used so far has always been relatively high, but this time we work with a source whose bitrate is only 2.33 Mbps, and there is a lot to improve in the image of the I Jumped From Space (World Record Supersonic Freefall) video. So it’s a good fit for DaVinci Resolve Studio’s “Super Scale”, which, in addition to upscaling the image, also sharpens it and reduces noise.

We set the effect of both of these tasks to high, and in a week’s time you’ll see how the differently performing GPUs hold up in a performance comparison.

If you would like to test the performance yourself and compare your setup with our results, here are the output settings. For the QuickSync format, the H.265/HEVC codec and an encoder according to the GPU are used. For NVIDIA graphics cards, it is naturally NVENC.

AV1 and HEVC encoding

The new GeForce RTX 4000 cards introduce the capability to perform hardware video compression to the AV1 format, whereas previous generations of cards support at most the older HEVC format. This is significant because the NVENC encoder is expected to achieve higher compression quality with AV1 than with HEVC.

Higher compression quality in this sense means that you achieve a certain level of visual quality at a lower bitrate than you would need when outputting in HEVC format, so the file will take up less space or need a lower bitrate when streaming.

Alternatively, you can also achieve better detail reproduction and fewer compression artifacts at the same bitrate. With two equally sized video files, the one in AV1 format should look better.

Meanwhile, NVENC’s performance when compressing to AV1 should be comparable to HEVC, so these benefits are not at the expense of processing speed.

Overview of video format support for NVDEC/NVENC decoding and encoding on GeForce RTX 4000 graphics (Ada)

Dual compression engines for faster compression speeds

GeForce RTX 4000 generation graphics cards have one more advantage when compressing to the AV1 format. The higher-end GPUs of this generation provide two parallel NVENC compression engines (lower-end models have a single one). If the video has a higher resolution (4K or higher), these engines can be combined to work on the same video, roughly doubling the compression speed compared to a situation where only one engine is used.

This should make use of the tile feature of the AV1 format, where the video is processed in two (or at least two) tiles that together form a single frame. Similarly, on the GeForce RTX 4000 it should also be possible to speed up encoding 2× when using the HEVC format, where the dual encoder can also be used at 4K and higher resolutions (with HEVC, the frame is not divided into tiles but into multiple “slices”, which still allows parallel encoding). The use of this technique does not degrade the compatibility of the video with decoders in devices or video players in any way. Everything stays within the standards, which allow for multiple tiles or slices per frame.
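
The idea of dividing a frame between two engines can be illustrated with a minimal Python sketch. How NVENC actually partitions the frame is not documented here, so everything below is an assumption made for illustration: the split into vertical tile columns, the alignment to 64 px (one of the superblock sizes AV1 uses) and the function name.

```python
def split_into_tile_columns(frame_width, engines=2, superblock=64):
    """Split a frame into vertical tile columns, one per encoder engine.

    Column boundaries are aligned to the superblock size. Returns a list of
    (x_offset, width) pairs that together cover the whole frame width.
    """
    total_sb = -(-frame_width // superblock)  # ceiling division
    base, extra = divmod(total_sb, engines)
    columns = []
    x = 0
    for i in range(engines):
        sb_count = base + (1 if i < extra else 0)
        # The last column may be narrower if the width is not a multiple of 64.
        width = min(sb_count * superblock, frame_width - x)
        columns.append((x, width))
        x += width
    return columns


for name, width in (("4K", 3840), ("8K", 7680)):
    print(name, split_into_tile_columns(width))
```

For both test resolutions, the frame divides into two equal halves, so in this simplified picture each engine would get the same amount of work.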

For the performance tests, we have two 30 fps videos, in 4K and 8K, in ProRes 422 HQ with very high bitrates (805 and 3056 Mb/s). You can have them too, but beware, this is a large, 60-gigabyte archive.

The output settings for each codec (AV1 and HEVC) are adjusted so that the same bitrate is achieved across the different encoders (NVENC and QuickSync). The ICQ of QuickSync is lowered from the default value (26) to 12, and the NVENC bitrate is then limited manually to the bitrate of the resulting video, 67 Mbps.

In addition to 3840×2160 px resolution, performance will also be measured at 7680×4320 px (with a bitrate of 158 Mbps).
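
To put these bitrates into perspective, here is a small Python sketch that relates them to the source bitrates mentioned above. The bitrates come from this article, while the one-minute duration is just an assumed example length and the helper names are ours. Audio and container overhead are ignored.

```python
def compression_ratio(source_mbps, target_mbps):
    """How many times lower the target bitrate is than the source bitrate."""
    return source_mbps / target_mbps


def output_size_gb(bitrate_mbps, duration_s):
    """Approximate video stream size in decimal gigabytes."""
    megabits = bitrate_mbps * duration_s
    return megabits / 8 / 1000  # megabits -> megabytes -> gigabytes


# (source ProRes bitrate, target bitrate) in Mbps, as stated in the article.
tests = {"4K": (805, 67), "8K": (3056, 158)}

for name, (source, target) in tests.items():
    ratio = compression_ratio(source, target)
    size = output_size_gb(target, duration_s=60)  # assumed 60-second clip
    print(f"{name}: about {ratio:.1f}x lower bitrate, "
          f"roughly {size:.2f} GB per minute of output")
```

The 4K output therefore runs at about a twelfth of the source bitrate and the 8K output at about a nineteenth.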

If you haven’t been following the field of video compression formats much, you may not know where AV1 came from. This format is an alternative to the lineage of formats developed by the ISO MPEG and ITU JVT consortia, which pretty much established the mainstream of modern video formats based on block DCT transformation, I, P and B frames with motion prediction, in-loop filters and other tools as we know them. The formats that have emerged from this circle are MPEG-1, MPEG-2, the so-called DivX/XviD (MPEG-4 ASP or also H.263), H.264 (MPEG-4 AVC), H.265 or HEVC, and most recently H.266 or VVC.

However, these formats have traditionally come with usage fees (typically owed by the manufacturer of the chip or program that uses them), which are paid to the various companies involved in the development of these technologies and their standardization. This model has long been opposed by the open-source software camp on the one hand, and by large Internet companies on the other, both of which want video formats with royalty-free usage. AV1 is a product (essentially a competitor) from this part of the video and software industry.

Google first produced the VP8 (acquired as a whole with On2) and VP9 (which it uses on YouTube) formats. These formats are essentially patent-circumventing counterparts to H.264 and H.265, to which they are technologically very similar but are considered somewhat inferior.

After VP9, however, Google in particular again initiated the development of a new format, this time aimed at bringing not only an alternative to HEVC, but potentially a technology that is more advanced or more powerful in terms of compression. In addition to Google, Cisco and other entities (including Mozilla and Xiph from the open-source sphere) were also involved this time, forming the Alliance for Open Media (AOM). The resulting format, like VP8 and VP9, is covered by patents but is not associated with any royalty fees, which is advantageous for use by Internet video services. The reference encoders and specifications are open source.

Genetically, AV1 is mainly based on VP9, but with added “next-gen” compression techniques. While VP9 had fewer tools compared to HEVC and was less complex, the reverse is true for AV1, which purely technologically has more various techniques for prediction (in particular intra prediction has been enhanced, but also inter prediction using motion vectors), more complex in-loop filtering and transformations than HEVC. In AV1, one can see either a next-gen successor to HEVC (i.e. a competitor to VVC) or at least a format half a generation ahead, possibly belonging somewhere between the HEVC and VVC generations (but VVC hasn’t really caught on yet, and Google is also at least prospectively planning a next-generation AV2).

AV1 format presentation at the launch of version 1.0 (2018)

It has to be said that video quality is not just about the format specifications and its compression techniques. Equally important is the ability of the compression software (encoder) to make optimal use of these techniques. Indeed, all the possible prediction and quantization modes give a huge space of different combinations among which the encoder has to choose, and this choice is non-trivial. It has to be guided not only by metrics of similarity between the result and the source, but also by the influence of non-local optimization of the whole image and by psychovisual factors. At the same time, not all possibilities can be explored, so shortcuts (early termination of the search, skipping some analysis) and approximations must be used. H.264 and HEVC used to be ahead in encoder quality, so even today HEVC can sometimes compete with AV1.
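
A common textbook way to formalize such a choice is Lagrangian rate-distortion optimization, where each candidate is scored as J = D + λ·R (distortion plus a weighted bit cost) and the lowest score wins. The toy Python sketch below illustrates only this principle. The candidate modes and all the numbers are made up, and it is not a description of how NVENC, QuickSync or any particular software encoder decides.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    mode: str
    distortion: float  # e.g. sum of squared errors against the source block
    bits: int          # bits needed to signal the mode and code the residual


def pick_mode(candidates, lam):
    """Return the candidate with the lowest cost J = distortion + lam * bits."""
    return min(candidates, key=lambda c: c.distortion + lam * c.bits)


# Purely hypothetical numbers for a single block.
candidates = [
    Candidate("intra_dc", distortion=900.0, bits=40),
    Candidate("intra_directional", distortion=500.0, bits=90),
    Candidate("inter_mv", distortion=200.0, bits=260),
]

for lam in (0.5, 5.0, 20.0):
    print(f"lambda={lam}: {pick_mode(candidates, lam).mode}")
```

A small λ (bits are cheap, as at high bitrates) favors the most accurate but expensive mode, while a large λ pushes the choice toward cheaper modes. A real encoder faces far more candidates per block than it can score exhaustively, hence the shortcuts mentioned above.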

However, when we only consider the area of hardware encoding, it is no longer to be expected that an older format (such as HEVC) will be favored by a better, more mature encoder against a newer format with less mature software encoders (AV1 with libaom, SVT-AV1). In the area of GPU hardware encoding, the new format should therefore have the potential to immediately improve compression efficiency and thus image quality. This is also exactly what it should look like when encoding on graphics cards, where the NVENC encoder should achieve a better result when using AV1 (versus HEVC): better quality at the same bitrate, or a lower bitrate at the same quality.

English translation and edit by Jozef Dudáš
