RDNA 3 the same as RDNA 2? Wrong, computational tests show

The difference between RDNA 3 and RDNA 2: Unjustly underestimated?

We’re still finishing our review of the Radeon RX 7600 (the Pulse model by Sapphire), but we’ve put together a short preview of a subset of tests that might go unnoticed in the final review: compute application benchmarks. Why? The card seems to show better-than-typical performance gains in them. We’ve seen opinions stating that the Navi 33 GPU brings nothing new versus the Navi 23 chip, but these results say otherwise.

Going through the results of the hundreds of tests in our charts, we noticed that, compared to how the Radeon RX 7600 typically performs in games (more on that in the full review), the card achieves surprisingly high performance gains in some – though not all – compute applications. That’s not to say that this cheap card is suddenly crushing the GeForce RTX 4090 or anything like that, but the improvements over the Navi 23 based cards (Radeon RX 6650 XT, RX 6600 XT…) simply paint a more favorable picture than games do, percentage-wise.

We don’t have any exact analysis that would explain this with certainty, but the most plausible explanation is that this is actually the new RDNA 3 architecture at play, along with the changes it makes against the RDNA 2 architecture found in the cards with the Navi 23 chip. We’ve covered those architectural updates here.

The Compute Units (CUs) of the RDNA 3 architecture have been significantly redesigned. Since the first GCN GPUs nearly 11 years ago, this GPU building block has always had 64 shaders (“ALUs”). RDNA 3’s new feature is that the SIMD units providing those “shaders” can process two instructions per cycle. As in RDNA 2, each CU uses two 32-wide SIMD units, but they now have the ability to “dual issue”, i.e. process two instructions simultaneously. A CU thus has a theoretical compute performance equivalent to up to 128 shaders instead of the previous 64.
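To make the “paper” doubling concrete, here is a minimal sketch of how theoretical FP32 throughput is usually counted. The function name, the 32-CU configuration and the 2.5 GHz clock are illustrative assumptions, not figures from our tests, and each lane is assumed to perform one fused multiply-add (counted as 2 FLOPs) per instruction.

```python
def peak_fp32_tflops(compute_units, clock_ghz, lanes_per_cu=64, issue_width=1):
    """Theoretical peak FP32 throughput in TFLOPS.

    Assumption: every lane retires one fused multiply-add (2 FLOPs) per
    issued instruction. issue_width=2 models the ideal dual-issue case,
    where two instructions are issued every single cycle.
    """
    flops_per_cycle = compute_units * lanes_per_cu * 2 * issue_width
    # flops_per_cycle * GHz gives GFLOPS; divide by 1000 for TFLOPS.
    return flops_per_cycle * clock_ghz / 1000.0


# Illustrative inputs only (assumed, not measured): 32 CUs at 2.5 GHz.
single = peak_fp32_tflops(32, 2.5)                # RDNA 2 style counting
dual = peak_fp32_tflops(32, 2.5, issue_width=2)   # RDNA 3 paper figure
print(f"single-issue: {single:.2f} TFLOPS, dual-issue: {dual:.2f} TFLOPS")
```

The dual-issue figure is exactly twice the single-issue one by construction; it says nothing about how often two instructions can actually be paired.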

However, this duplication of “ALUs” or shaders is done within a CU structure inherited from previous generations, so 64 of these dual-issue shaders, which could theoretically do the work of 128, share some of the control and compute structures that served 64 shaders in RDNA 2. Also important: the dual-issue capability is still quite inflexible, with various complicated constraints on which instructions can be executed simultaneously. Therefore, in practice, 64 RDNA 3 dual-issue shaders deliver much less performance than you would get from 128 RDNA 2 shaders. So while this doubles the theoretical TFLOPS, the performance actually extracted will usually be much less than 2×.
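A rough way to see why the real gain falls short of 2× is a toy model in which only a fraction of instructions can be paired. This is a deliberately simplified sketch and not a description of the actual hardware scheduler: it assumes every unpaired instruction costs one cycle, every paired instruction costs half a cycle, and nothing else (memory, shared control logic) limits performance. The function name is hypothetical.

```python
def dual_issue_speedup(paired_fraction):
    """Ideal speedup over single issue when `paired_fraction` of all
    instructions can be issued together with a partner instruction.

    Toy model: an unpaired instruction takes one cycle, a paired one
    effectively takes half a cycle. Other bottlenecks are ignored.
    """
    if not 0.0 <= paired_fraction <= 1.0:
        raise ValueError("paired_fraction must be between 0 and 1")
    cycles_per_instruction = (1.0 - paired_fraction) + paired_fraction / 2.0
    return 1.0 / cycles_per_instruction


for fraction in (0.0, 0.2, 0.5, 1.0):
    print(f"{fraction:.0%} paired -> {dual_issue_speedup(fraction):.2f}x")
```

Even in this optimistic model, half of all instructions would have to pair up to reach a gain of about a third, and the full 2× only appears when every instruction finds a partner.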

In games, the benefits of this architectural change have indeed been quite limited so far, as game tests show: the “IPC” (by which we mean GPU performance per shader ALU and per 1 MHz) does not seem to have increased much with RDNA 3. AMD’s drivers clearly have trouble making consistent use of the dual-issue capability in games, with their large pool of different shaders that need optimization.
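The “IPC” figure as defined above can be sketched as a simple normalization. The scores and clocks below are made-up placeholders chosen purely to show the arithmetic, not our measurements; the 2048-shader count is an assumption that treats each dual-issue shader as one ALU, and the function names are hypothetical.

```python
def perf_per_shader_mhz(score, shaders, clock_mhz):
    """Benchmark score normalized per shader ALU and per 1 MHz."""
    return score / (shaders * clock_mhz)


def ipc_gain(new, old):
    """Relative change in normalized performance, e.g. 0.10 means +10 %.

    `new` and `old` are (score, shaders, clock_mhz) tuples.
    """
    return perf_per_shader_mhz(*new) / perf_per_shader_mhz(*old) - 1.0


# Placeholder inputs (NOT measured results): score, shader count, clock in MHz.
older_card = (100.0, 2048, 2600)
newer_card = (115.0, 2048, 2650)
print(f"per-shader, per-MHz gain: {ipc_gain(newer_card, older_card):+.1%}")
```

Normalizing this way removes differences in shader count and clock speed, so whatever gain remains can be attributed to the architecture and its drivers rather than to a bigger or faster chip.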

However, the relatively larger performance boost we observe with the Radeon RX 7600 in compute applications could indicate that such programs are able to take advantage of dual-issue more often. This may be because their code loops are fewer and less diverse, making them easier targets for the driver’s compiler or even for hand-tuned driver optimizations. That may make code such as Blender’s main rendering codepaths and loops easier to optimize so that some of their instructions execute in dual-issue mode.

The above-average performance gains of the RDNA 3 architecture in our tests are probably evidence that, at least in some cases, dual-issue is already being used in compute applications. One thing supports this idea: an analogous situation occurred with Nvidia’s Ampere architecture. It also doubled the number of FP32 shaders, though the technical details were different there. Even then, the real benefit in games was quite limited and far from doubled performance. And, perhaps similarly to today’s situation, the benefit of the FP32 doubling was much more pronounced in some compute applications than in games. You may still remember the big performance gains in various CUDA and OpenCL applications shown in 2020 reviews. This suggests that compute applications differ somewhat in nature from games, which may be behind the atypical performance gains in compute applications on the RDNA 3 architecture in the Radeon RX 7600 and its Navi 33 chip.

So think of it this way: The RDNA 3 / Navi 33 architecture may have some extra potential over RDNA 2 and Navi 23; thus it certainly can’t be said that it doesn’t matter which one you’ll get between the older and newer chip. The results of the compute applications may also point to some as yet untapped potential that could eventually trickle down into game performance after all.

So it’s possible we’ll see the oft-quoted concept of “fine wine”, the beneficial aging of drivers, actually come true to some degree with RDNA 3 GPUs. Don’t take it to mean that the GPU will make some extreme performance leap in a year or two. But a gradual improvement in RDNA 3 GPU performance over time relative to RDNA 2 (let’s conservatively imagine 5–10%, for example) could be quite realistic, although of course we can’t guarantee it.

So now, without further ado, here are some of the charts from the upcoming full review that we found interesting.

English translation and edit by Jozef Dudáš




