Ashes of the Singularity Revisited: A Beta Look at DirectX 12 & Asynchronous Shading
by Daniel Williams & Ryan Smith on February 24, 2016 1:00 PM EST
DirectX 12 Multi-GPU Performance
Shifting gears, let’s take a look at multi-GPU performance on the latest Ashes beta. The focus of our previous article, Ashes’ support for DX12 explicit multi-GPU, makes it the first game to support pairing up RTG and NVIDIA GPUs in an AFR setup. Like traditional same-vendor AFR configurations, Ashes’ AFR setup works best when both GPUs are similar in performance, so although this technology does allow for some unusual cross-vendor comparisons, it does not (yet) benefit from pairing up GPUs that differ widely in performance, such as a last-generation video card with a current-generation video card. Nonetheless, running a Radeon and a GeForce card together is an interesting sight, if only for the sheer audacity of it.
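To illustrate why matched cards matter for AFR (alternate frame rendering), here is a minimal sketch of an idealized model, not Ashes' actual scheduler. It assumes strict frame alternation with in-order presentation and ignores synchronization and transfer overhead. The function names and frame times are hypothetical.

```python
def afr_fps(frame_ms_a: float, frame_ms_b: float) -> float:
    """Steady-state fps for strict two-GPU AFR with in-order presentation.

    Assumption: each GPU renders every other frame, so the pair completes
    two frames per cycle and the cycle lasts as long as the slower GPU's
    frame time. Synchronization and copy costs are ignored.
    """
    cycle_ms = max(frame_ms_a, frame_ms_b)
    return 2 * 1000.0 / cycle_ms


def single_fps(frame_ms: float) -> float:
    """Frame rate of one GPU working alone."""
    return 1000.0 / frame_ms


# Hypothetical frame times in milliseconds, not measured data.
matched = afr_fps(20.0, 20.0)      # two equally fast cards
mismatched = afr_fps(20.0, 50.0)   # fast card paired with a much slower one
fast_alone = single_fps(20.0)      # the fast card by itself

print(matched, mismatched, fast_alone)
```

Under these assumptions a badly mismatched pair can end up slower than the faster card on its own, since the pair only finishes two frames per slow-card frame time.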
Meanwhile, the significant performance optimizations between the last beta build and this latest build have had an equally significant knock-on effect on multi-GPU performance compared to the last time we looked at the game.
Even at 4K a pair of GPUs ends up being almost too much at Ashes’ High quality setting. All four multi-GPU configurations are over 60fps, with the fastest Fury X + 980 Ti configuration nudging past 70fps. Meanwhile the lead over our two fastest single-GPU configurations is not especially great, particularly compared to the Fury X, with the Fury X + 980 Ti configuration only coming in 15fps (27%) faster than a single GPU. The all-NVIDIA comparison does fare better in this regard, but only because of GTX 980 Ti’s lower initial performance.
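As a rough sanity check on those figures, the single-GPU baseline can be backed out from the quoted 15fps and 27% gain. This is a sketch only: the published numbers are rounded, so the results are approximate, and `implied_baseline` is a hypothetical helper name used for illustration.

```python
def implied_baseline(delta_fps: float, gain_fraction: float) -> float:
    """Baseline fps such that baseline * gain_fraction equals delta_fps."""
    return delta_fps / gain_fraction


# Figures quoted in the text: 15fps faster, which is a 27% gain.
baseline = implied_baseline(15.0, 0.27)  # implied single-GPU frame rate
multi = baseline + 15.0                  # implied multi-GPU frame rate

print(f"single GPU: about {baseline:.1f} fps")
print(f"multi-GPU:  about {multi:.1f} fps")
```

The implied multi-GPU result lands just above 70fps, which lines up with the fastest configuration nudging past that mark.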
Digging deeper, what we find is that even at 4K we’re actually CPU limited according to the benchmark data. Across all four multi-GPU configurations, our hex-core overclocked Core i7-4960X can only set up frames at roughly 70fps, versus 100fps+ for a single-GPU configuration.
Top: Fury X. Bottom: Fury X + 980 Ti
The increased CPU load from utilizing multi-GPU is to be expected, as the CPU now needs to spend time synchronizing the GPUs and waiting on them to transfer data between each other. However, dropping to 70fps means that Ashes has become a surprisingly heavy CPU test as well, and that 4K at High quality alone isn’t enough to max out our dual-GPU configurations.
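One simple way to think about the CPU limit is that the delivered frame rate is capped by the slower of the two stages. The sketch below assumes that idealized min() model. The CPU figures are the approximate ones quoted above, while the GPU throughput values are hypothetical placeholders rather than measurements.

```python
def delivered_fps(cpu_fps: float, gpu_fps: float) -> float:
    """Idealized cap: whichever stage is slower sets the frame rate."""
    return min(cpu_fps, gpu_fps)


def limiter(cpu_fps: float, gpu_fps: float) -> str:
    """Name the stage that is holding the frame rate back."""
    return "CPU" if cpu_fps < gpu_fps else "GPU"


# CPU rates are the rough figures from the text (about 100fps with one GPU,
# about 70fps with two). GPU rates are hypothetical, chosen for illustration.
single = {"cpu_fps": 100.0, "gpu_fps": 55.0}
dual = {"cpu_fps": 70.0, "gpu_fps": 90.0}

for name, cfg in (("single GPU", single), ("dual GPU", dual)):
    print(name, delivered_fps(**cfg), "fps, limited by", limiter(**cfg))
```

In this model, adding a second GPU raises the GPU-side ceiling but lowers the CPU-side one, so the bottleneck moves from the GPU to the CPU.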
Cranking up the quality setting to Extreme finally gives our dual-GPU configurations enough of a workload to back off from the CPU performance cap. Once again the fastest configuration is the Fury X + 980 Ti, which lands just short of 60fps, followed by the Fury X + Fury configuration at 55.1fps. In our first look at Ashes multi-GPU scaling we found that having a Fury X card as the lead card resulted in better performance, and this has not changed for the newest beta. The Fury continues to be faster at reading data off of other cards. Still, the gap between the Fury X + 980 Ti configuration and the 980 Ti + Fury X configuration has closed some as compared to last time, and now stands at 11%.
Backing off from the CPU limit has also put the multi-GPU configurations well ahead of the single-GPU configurations. We’re now looking at upwards of a 65% performance boost versus a single GTX 980 Ti, and a smaller 31% performance boost versus a single Fury X. These are smaller gains for multi-GPU configurations than we first saw last year, but it’s also very much a consequence of Ashes’ improved performance across the board. Though we didn’t have time to test it, Ashes does have one higher quality setting – Crazy – which may drive a somewhat larger wedge between the multi-GPU configurations and the Fury X, though the overhead of synchronization will always present a roadblock.
153 Comments
Beany2013 - Wednesday, February 24, 2016 - link
You are aware that Mantle and DX12 are actually different APIs, yeah?
zheega - Wednesday, February 24, 2016 - link
AMD just released new drivers that they say are made for this benchmark. Can we get a quick follow-up to see if their performance improves even more?
http://support.amd.com/en-us/kb-articles/Pages/AMD...
AMD has partnered with Stardock in association with Oxide to bring gamers Ashes of the Singularity – Benchmark 2.0 the first benchmark to release with DirectX® 12 benchmarking capabilities such as Asynchronous Compute, multi-GPU and multi-threaded command buffer Re-ordering. Radeon Software Crimson Edition 16.2 is optimized to support this exciting new release.
revanchrist - Wednesday, February 24, 2016 - link
See? Every time there's a pro-AMD game tested, there'll be plenty of butt-hurt fanboy comments. And I guess everyone knows why. Because when you've bought something, you'll always want to justify your purchase, and you know who's got the lion's share of the dGPU market now. Guess nowadays people are just too sensitive or have a heart of glass, which makes them judge things ever so subjectively and personally.
Socius - Wednesday, February 24, 2016 - link
For anyone who missed it:
"Update 02/24: NVIDIA sent a note over this afternoon letting us know that asynchronous shading is not enabled in their current drivers, hence the performance we are seeing here. Unfortunately they are not providing an ETA for when this feature will be enabled."
ToTTenTranz - Wednesday, February 24, 2016 - link
"Unfortunately they are not providing an ETA for when this feature will be enabled."
If ever...
andrewaggb - Wednesday, February 24, 2016 - link
Makes sense why it would be slightly slower. Also makes through benchmarks less meaningful
Ext3h - Wednesday, February 24, 2016 - link
"not enabled" is a strange and misleading wording, since it obviously is both available and working correctly according to the specification.
Should be read as "not being made full use of", as it is only lacking any clever way of profiting from asynchronous compute in hardware.
barn25 - Thursday, February 25, 2016 - link
If you google around you will find out NVIDIA does not have asynchronous shading on its DX"12" cards. This was actually first found out in WDDM 1.3 back in Windows 8.1, when they would not support the optional features which AMD does.
Ext3h - Thursday, February 25, 2016 - link
I know that the wrong terminology kept being used for years now, especially driven by major tech review websites like this one. But that's still not making it any less wrong.
The API is fully functional. So the driver does support it. Whether it does so efficiently is an entirely different matter, you don't NEED hardware "support" to provide that feature. Hardware support is only required to provide parallel execution, as opposed to the default sequential fallback. The latter one is perfectly within the bounds in the specification, and counts as fully functional. It's just not providing any additional benefits, but it's neither broken nor deactivated.
barn25 - Thursday, February 25, 2016 - link
Don't try to change it. I am referring to HW Async compute, which AMD supports and NVIDIA does not. Using a shim will impact performance even more.