KitGuru was recently given the chance to fire some questions over to AMD’s Scott Wasson on the topic of frame pacing for multi-GPU (mGPU) configurations using DirectX 12.
Frame pacing has long been an issue for enthusiasts wanting smooth gameplay from their multi-GPU configurations. High frame rates are one aspect of an enjoyable gaming experience, but it’s important that those frames are delivered in a smooth and consistent manner. This is why frame times have come under the spotlight in recent years, with the industry demanding that frames be delivered at consistent intervals to create a smooth gameplay experience.
This interview discusses the frame pacing work that AMD is putting into its multi-GPU solutions in a DX12 gaming environment.
A Short Introduction to DirectX 12 from AMD’s Sasa Marinkovic:
DX12’s key benefit is probably how it allows developers to talk to modern graphics hardware on its own terms. GPU hardware has matured in recent generations, and the older APIs were built in part around older chips that required different assumptions.
DirectX 12 should have less inherent CPU overhead, which could allow games to run smoothly on less expensive CPUs. DX12 also offers improved threading, so games can potentially take better advantage of multiple CPU cores. On the graphics side of things, DX12 supports a number of new GPU hardware capabilities that can speed up pixel processing and enable new rendering techniques. For instance, a feature called async compute shaders allows lightweight tasks to run concurrently with the main graphics thread, keeping those big shader arrays on today’s GPUs more fully occupied.
One area where DX12 offers more control is when a game uses multiple GPUs together to produce its images. If they wish, developers can explicitly assign work to GPUs of different brands and sizes in the same system, and they can combine the output from all of these GPUs into a final image. Developers also have the option of using more traditional GPU pairings where two same-sized GPUs share the load. In that case, AMD can offer an assist with a feature called frame pacing, which helps to ensure smoother animation with multiple GPUs. Today, AMD is announcing that it’s extending its support for frame pacing to DirectX 12 applications.
In the following YouTube video, Scott Wasson discusses DX12 multi-GPU frame pacing with visual aids: https://www.youtube.com/watch?v=voCapB43F0k&feature=youtu.be
From this point onwards, our questions were directed to AMD’s Scott Wasson.
– In both simple terminology and more technical terminology, tell our readers what DX12 Frame Pacing is and why it is so important for mGPU usage scenarios.
Scott Wasson – AMD:
You may have heard that DirectX 12 offers developers some new choices for dealing with load balancing between multiple GPUs. The very nifty MDA [Ed. Multi-Display Adapter] mode allows developers to take explicit control of different sizes and brands of GPUs and use them to render a frame.
The frame-pacing feature we’re introducing for DirectX 12 deals with a more conventional scenario with LDA [Ed. Linked Display Adapter] mode, where two evenly matched GPUs divide up the work using a load-balancing method called alternate-frame rendering, or AFR for short. In essence, frame-pacing works to ensure smoother animation with AFR load-balancing.
To go a bit deeper, with AFR, the workload is divided across two GPUs in interleaved fashion. The first GPU renders the first complete frame, and the second GPU renders the next frame. Then we go back to the first GPU for the third frame, and so on.
AFR is by far the most widely used load-balancing method in DirectX 11 and earlier because it offers up to 2X performance scaling. But AFR has a potential drawback: the two GPUs can go out of sync. When that happens, frames from GPU 1 and GPU 2 can arrive at the display at almost the exact same time.
Without frame pacing, what you might see from AFR is a pattern of frame-to-frame intervals that looks something like this: 30 ms, 2 ms, 30 ms, 2 ms, 30 ms, 2 ms.
With this sort of elliptical display pattern, the second GPU isn’t adding much to the smoothness of the animation. Our frame-pacing feature corrects the timing of displayed frames, so the pattern would look more like this: 16 ms, 16 ms, 16 ms, 16 ms, 16 ms, 16 ms.
That corrected pattern can result in much smoother perceived animation. We’ve supported frame-pacing for older graphics APIs for a number of years. Now, we’re extending this support to DirectX 12 applications that choose to use AFR.
Desktop users with two matched Radeon GPUs, like dual Radeon RX 480s, should see the benefits of this feature in DX12 games that use AFR.
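[Ed. The pacing idea described above can be sketched in a few lines of Python. This is a toy model written for illustration, not AMD's driver logic: the function names, the two-GPU default, and the rule for estimating the target gap are all assumptions. It takes the times at which frames finish rendering and holds back any frame that would otherwise be shown too soon after the previous one.]

```python
from itertools import accumulate


def frame_intervals(timestamps_ms):
    """Return the gaps between consecutive timestamps, in milliseconds."""
    return [later - earlier
            for earlier, later in zip(timestamps_ms, timestamps_ms[1:])]


def pace_frames(ready_ms, gpu_count=2):
    """Toy frame pacer (hypothetical, not AMD's algorithm).

    With AFR, frame i is rendered by GPU i % gpu_count, so frames i and
    i - gpu_count come from the same GPU. The gap between those two frames,
    divided by gpu_count, estimates the ideal spacing between displayed
    frames. Each frame is held until at least that long after the previous
    frame was shown.
    """
    shown_ms = []
    for i, ready in enumerate(ready_ms):
        if i < gpu_count:
            # No history for this GPU yet: show the frame as soon as it is ready.
            shown_ms.append(ready)
            continue
        target_gap = (ready - ready_ms[i - gpu_count]) / gpu_count
        # A frame can only be delayed, never shown before it is ready.
        shown_ms.append(max(ready, shown_ms[-1] + target_gap))
    return shown_ms


# Frame completion times built from the unpaced 30 ms / 2 ms pattern.
unpaced_gaps = [30, 2, 30, 2, 30, 2]
ready_ms = list(accumulate(unpaced_gaps, initial=0))

paced_gaps = frame_intervals(pace_frames(ready_ms))
print(paced_gaps)
```

[Ed. In this model the first gap stays at 30 ms because there is no history to work from, and every gap after that settles at 16 ms. The cost is that every second frame is held for a few milliseconds before it is shown, which is the trade a pacer of this kind makes for even spacing.]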
– How does AMD’s DX12 Frame Pacing implementation compare to what ‘the competition’ offers?
Our position is to support robust AFR load-balancing in DirectX 12 and to enable developers to use it across our product stack, including in the very affordable Polaris products, from the Radeon RX 460 to the RX 480. From what I’ve read, our competition has removed SLI support from this class of products, so comparisons are hard to make. I believe their more expensive GPUs have a frame-metering capability.
– AMD looked to be performing well in DX12 scenarios. Is frame time variance an issue that you have spotted internally and are now fixing at an early stage in the product lifecycle? Or does your Frame Pacing implementation bring other benefits with it?
Delivering frames quickly and consistently is key to good gaming experiences, so it drives a lot of our priorities.
I believe the work the industry is doing to convert to low-level APIs like DirectX 12 and Vulkan is the single biggest way we can reduce frame-time variance and ensure good experiences. We helped get the ball rolling with Mantle, and we’ve collaborated extensively with the industry since then. AMD has also invested a lot of effort into drivers and software for Vulkan and DX12, and that work has been paying off, as you’ve noted.
Our GCN family of Radeon GPUs is especially well-suited for low-level APIs like DX12. Our advanced scheduling hardware allows async compute shaders to execute concurrently with the main graphics thread. Our support for shader intrinsic functions lets developers take hand-optimized GCN machine code from the game consoles and run it directly on our PC graphics chips. These measures can reduce overhead, improve throughput, and cut frame latencies. The work we did with id Software on Doom Vulkan is a nice example of what’s possible. I think we’re on the right path.
In fact, the frame-time results I’ve seen from our Polaris GPUs look very good across all APIs. Our software guys deserve a lot of credit for that, as does Microsoft for the good work it has done with Windows 10.
To be clear, though, the frame-pacing feature for DX12 only applies to a specific case involving multiple GPUs with AFR, so its benefits are limited to those scenarios.
– Will the DX12 Frame Pacing with mGPU have a positive effect on other important parameters such as raw FPS, GPU power usage, dynamic GPU clock speeds?
I think the best way to say it is that DX12 frame pacing should allow us to deliver better experiences at a given FPS rate and at a given GPU power level. Frame-pacing won’t improve your FPS average, but it can help ensure that 60 FPS feels as smooth as you’d expect it to feel.
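[Ed. A quick calculation shows why the FPS average cannot tell paced and unpaced output apart. The sketch below reuses the two interval patterns quoted earlier in the interview; the function name and the choice of spread metrics are ours, not AMD's.]

```python
from statistics import mean, pstdev


def average_fps(gaps_ms):
    """Average frames per second for a list of frame-to-frame gaps in ms."""
    return 1000.0 / mean(gaps_ms)


unpaced_gaps = [30, 2, 30, 2, 30, 2]
paced_gaps = [16, 16, 16, 16, 16, 16]

for label, gaps in (("unpaced", unpaced_gaps), ("paced", paced_gaps)):
    # Columns: label, average FPS, longest gap (ms), spread of gaps (ms).
    print(label, average_fps(gaps), max(gaps), pstdev(gaps))
```

[Ed. Both patterns average 62.5 FPS, but the longest wait between frames is 30 ms in the unpaced case and 16 ms in the paced one. The difference only shows up once you look at individual frame times.]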
– Has your experience and learning from FreeSync helped you with the DX12 Frame Pacing feature? And how does this now impact your future strategies with FreeSync?
The biggest thing that I learned from FreeSync was what a huge factor the interaction between the display timing loop and the GPU’s timing can be. A lot of the bad experiences we as gamers blame on insufficient GPU performance are actually poor interactions with the display timing loop. You can see that because when you enable FreeSync, the issues go away. Sometimes, even FPS averages in the 40-50 FPS range can feel very acceptable with FreeSync. I didn’t expect the effect to be that dramatic.
Our DX12 frame pacing feature inhabits a similar domain. It can improve in-game animation without changing the FPS average. If you try it out, toggle it on and off in the Radeon Settings control panel, you can see and feel the improvement subjectively.
The folks working on FreeSync and frame-pacing are separate teams, but both teams are working toward the common goal of perceptibly smoother animation and better experiences. These are complementary technologies, so they should coexist quite comfortably going forward.
– Can the knowledge gained from implementing DX12 Frame Pacing with mGPU be used in your VR activities in order to help to smooth the frame time even further, thus reducing the feeling of nausea for some users?
VR has a similar set of goals involving consistent frame delivery, but its latency requirements are even tighter than with traditional 3D games. Also, VR is unique in that it requires rendering a scene from two perspectives, one for each eye.
As a result, we’ve pursued a different load-balancing method for VR in which one GPU renders the image for the right eye while the other GPU renders the image for the left eye. This method is a natural fit for VR. It uses less buffering than AFR load-balancing and ensures even lower latency.
That said, DirectX 12 should allow developers to use all sorts of creative solutions for multi-GPU operation, including in VR applications. I doubt AFR load-balancing will be part of the mix for VR, but one never knows.
– When AMD taped out Tahiti in 2011, the focus was on peak frame rates. No one seemed to care about stuttering, as long as the cards could post a top FPS that was as competitive as possible. Did it really take until Fiji for AMD’s engineers and software teams to prioritise a ‘consistently smooth experience’?
I think the goal has always been to deliver good gaming experiences. AMD has had a lot of smart people working on that problem for many years.
Most folks in the industry used to believe FPS averages were a good proxy for smooth animation, but as games and graphics APIs grew more complex, gaps between high FPS averages and good user experiences became more apparent. We didn’t always use the right quantitative language to describe these performance issues precisely. I think with the move to thinking about frame times rather than FPS averages, we’ve solved that problem, so we can more easily pinpoint problems like stuttering. This shift in thinking has slowly diffused across the industry in recent years. The push for VR, with its especially strict latency needs, has helped move things along.
That said, if you look back at the work going on at AMD in the Tahiti era, I think you’ll see incredible innovation that has laid the foundation for where the whole industry is now going. Tahiti was the first GPU with Asynchronous Compute Engines, or ACEs, which is the scheduling hardware that enables async compute shaders with concurrent execution. Our competition still doesn’t have that capability. Variants of this same GCN architecture have driven the majority of consumer game consoles. And during the same era, AMD created Mantle, the first modern, low-level graphics API that served as an inspiration for Vulkan and DirectX 12.
All of those pieces are geared toward enabling creamy smooth in-game animation. Fortunately, a lot of them have come together seemingly at once, with the move to new APIs and the renewed focus in the Radeon Technologies Group. Still, the payoff we’re seeing now is the result of many years of good work.
– How much stronger is AMD’s position now, than when you joined?
As you know, the Radeon Technologies Group just recently had its first birthday. Since its beginnings, RTG has brought a renewed focus to our graphics business and, in my view, the right priorities under the leadership of Raja Koduri. I’m happy to be able to play a small part in that much larger picture. Raja has recruited some top talent from outside of the company and elevated smart people within AMD, as well. With the new Polaris products and our DX12 leadership, you’re seeing just a few benefits of these changes.
I’d put the improvements to RTG’s position into two big buckets, hardware and software.
On the software front, we’ve introduced a sharp new user interface in Radeon Software Crimson, and we’ve raised the frequency of new driver releases, so we have fresh drivers ready on day one for major games. We’ve staked out a leading position with DirectX 12 and Vulkan, and we’ve made incremental improvements like the DX12 frame-pacing feature we’re talking about today. There’s a lot more coming on that front that I can’t talk about yet, too.
Meanwhile, we’ve started up our GPUOpen effort and released a bunch of open-source software components that serve as alternatives to the proprietary software our competition uses to lock people into its chips. I don’t know that people entirely understand the power of the open approach yet, but I think it will become apparent as developers adopt and modify our solutions for in-game visual effects, for VR content creation, and for high-performance computing. These things will transform the way industries use our processors.
On the hardware front, we have a new line of Radeon RX products based on the Polaris 10 and 11 chips. The RX-series Radeons offer major improvements in terms of power efficiency and performance, and they hit the sweet spot in price for most PC gamers. If I were building a gaming rig today, I’d be looking very hard at a Radeon RX 480 8GB and a 2560×1440 FreeSync display. I think that’s the best way to get more gaming smoothness for your money. Our competition wants to charge you a huge premium for a variable-refresh display or a bigger GPU to get a similar effect.
Finally, Raja has talked about our roadmap, which includes insights into new products coming in the future. Meanwhile, our friends on the CPU side of the company have already demonstrated working silicon for Zen. That’s a huge milestone reached. All in all, I think we have much to celebrate on RTG’s first birthday.
KitGuru says: Thanks to Scott for taking the time to answer our questions. What do you think of AMD’s DX12 multi-GPU frame pacing work? Will you be testing out the new feature in your own system? Discuss on our Facebook page.