Ashes of the Singularity Revisited: A Beta Look at DirectX 12 & Asynchronous Shading
by Daniel Williams & Ryan Smith on February 24, 2016 1:00 PM ESTDirectX 12 Multi-GPU Performance
Shifting gears, let’s take a look at multi-GPU performance on the latest Ashes beta. The focus of our previous article, Ashes’ support for DX12 explicit multi-GPU makes it the first game to support the ability to pair up RTG and NVIDIA GPUs in an AFR setup. Like traditional same-vendor AFR configurations, Ashes’ AFR setup works best when both GPUs are similar in performance, so although this technology does allow for some unusual cross-vendor comparisons, it does not (yet) benefit from pairing up GPUs that widely differ in performance, such as a last-generation video card with a current-generation video card. None the less, running a Radeon and a GeForce card together is an interesting sight, if only for the sheer audacity of it.
Meanwhile as a result of the significant performance optimizations between the last beta build and this latest build, this has also had an equally significant knock-on effect on mutli-GPU performance as compared to the last time we looked at the game.
Even at 4K a pair of GPUs ends up being almost too much at Ashes’ High quality setting. All four multi-GPU configurations are over 60fps, with the fastest Fury X + 980 Ti configuration nudging past 70fps. Meanwhile the lead over our two fastest single-GPU configurations is not especially great, particularly compared to the Fury X, with the Fury X + 980 Ti configuration only coming in 15fps (27%) faster than a single GPU. The all-NVIDIA comparison does fare better in this regard, but only because of GTX 980 Ti’s lower initial performance.
Digging deeper, what we find is that even at 4K we’re actually CPU limited according to the benchmark data. Across all four multi-GPU configurations, our hex-core overclocked Core i7-4960X can only setup frames at roughly 70fps, versus 100fps+ for a single-GPU configuration.
Top: Fury X. Bottom: Fury X + 980 Ti
The increased CPU load from utilizing multi-GPU is to be expected, as the CPU now needs to spend time synchronizing the GPUs and waiting on them to transfer data between each other. However dropping to 70fps means that Ashes has become a surprisingly heavy CPU test as well, and that 4K at high quality alone isn’t enough to max out our dual GPU configurations.
Cranking up the quality setting to Extreme finally gives our dual-GPU configurations enough of a workload to back off from the CPU performance cap. Once again the fastest configuration is the Fury X + 980 Ti, which lands just short of 60fps, followed by the Fury X + Fury configuration at 55.1fps. In our first look at Ashes multi-GPU scaling we found that having a Fury X card as the lead card resulted in better performance, and this has not changed for the newest beta. The Fury continues to be faster at reading data off of other cards. Still, the gap between the Fury X + 980 Ti configuration and the 980 Ti + Fury X configuration has closed some as compared to last time, and now stands at 11%.
Backing off from the CPU limit has also put the multi-GPU configurations well ahead of the single-GPU configurations. We’re now looking at upwards of a 65% performance boost versus a single GTX 980, and a smaller 31% performance boost versus a single Fury X. These are smaller gains for multi-GPU configurations than we first saw last year, but it’s also very much a consequence of Ashes’ improved performance across the board. Though we didn’t have time to test it, Ashes does have one higher quality setting – Crazy – which may drive a bit of a larger wedge between the multi-GPU configurations and the Fury X, though the overhead of synchronization will always present a roadblock.
153 Comments
View All Comments
BurntMyBacon - Thursday, February 25, 2016 - link
@anubis44: "nVidia wasn't expecting AMD to force Microsoft's hand and release DX12 so soon."I do believe you are correct. Given the lack of ability to throw driver optimizations at the DX12 code path and nVidia's proficiency at doing it, I'd say this will be quite damaging. They've lost one clear advantage they held (at least in DX11).
@anubis44: "It's beginning to look like nVidia's been check-mated by AMD here."
I wouldn't go that far. They probably won't have the necessary hardware in Pascal, but you can be sure Volta will have what it needs. Besides, most games will likely have a DX11 code path for the foreseeable future as developers wouldn't want to lock themselves out of an entire market. Also, at the moment, nVidia can still play DX12 fine, they just don't appear to have the advantage at the moment given the small sample set of available data points.
In conclusion, it is more like they have lost a rook or queen. Of course, they've taken a few of ATi's pieces as well, so lets just wait and see who plays their remaining pieces better.
rhysiam - Thursday, February 25, 2016 - link
The other thing I would add to this is that it's not like Nvidia have nowhere to go here. Take the GTX 970 vs the R9 390 for example... they're in a similar price & performance tier. Yet the 970 is smaller with fewer transistors (usually meaning it's cheaper to produce) and generally has a much higher overclocking headroom (because Nvidia wasn't under pressure to clock the card closer to the limit to reach relevant performance). So it's reasonable to expect Nvidia could both lower the price and clock it higher to get a significantly better value card with minimal basically no substantive engineering/architectural changes.I'm not suggesting Nvidia will do that with the 970 specifically. Rather, what I'm saying is that if they find Pascal is similarly behind AMD they've got plenty of room to tweak performance and price before we can start calling them "check-mated". But it's certainly good new for us if DX12 performance like this continues and AMD essentially forces Nvidia to lower its margin.
CiccioB - Sunday, February 28, 2016 - link
They can do exactly as AMD has done with GCN: they just can start using 30 or 50% bigger GPUs to close the performance gap if they really need to.The_Countess - Thursday, February 25, 2016 - link
nvidia's entire performance advantage in DX11 is based on game specific driver optimizations. they have a virtual army of developers slaving away on those (and coming up with way to hurt everyone's performance as long as it hurt AMD the most or makes their own latest gen cards look better... but that's a different matter)with DX12 however the drivers becomes MUCH thinner and doesn't have nearly as much influence. so basically nvidia's main competitive advantage is gone with dx12 and vulkan.
as for being relevant: this year pretty much every game where performance matters will have either a DX12 or Vulkan render option. adding in the fact that AMD cards generally age better then nvidia's (those game specific optimizations focus pretty much exclusively only on their latest generation of cards) and i would say that yes it is very relevant.
BurntMyBacon - Thursday, February 25, 2016 - link
@The_Countess: "nvidia's entire performance advantage in DX11 is based on game specific driver optimizations. they have a virtual army of developers slaving away on those ..."True, they have lost a large advantage. Keep in mind, though, that nVidia's developer relations are still in play. What they once achieved through the use of driver optimizations may still be accomplished through code path optimization and design guidance for nVidia architecture. The first beta for Vulkan (The Talos Principle) showed that merely replacing a high level API (OpenGL/DX11) with a low level one (Vulkan/DX12) does not automatically improve the experience. If nVidia can convince developers to avoid certain non-optimal features or program in such a way as to take better advantage of nVidia hardware in their titles (for the sake of performance on the majority of discrete card owners out there of course) then ATi will be in the same position as they are now. Better hardware, worse software support. Then again, low level API cross-platform titles will most assuredly program to take advantage of the console architectures which happens to be ATi's at the moment.
nevcairiel - Wednesday, February 24, 2016 - link
Considering the Fury X just has a tad bit more raw power than a (older) 980Ti, I would say the DX12 numbers are fine, and what is really showing is AMDs lack of performance in DX11?tuxRoller - Wednesday, February 24, 2016 - link
I don't agree with this. I think this is more a case of nvidia not being able to rely so much on the ENORMOUS number of special cases in their driver.IOW, this is about two things: hardware and game design. The drivers are trivial next to d3d11/ogl.
jasonelmore - Wednesday, February 24, 2016 - link
Fury X's Architecture is much newer than Maxwell 2's. Lets see what the true DX12 cards can do this summer.tuxRoller - Wednesday, February 24, 2016 - link
Did you not notice the across the board improvements for all gcn cards?The point I was making, and that others have made for sometime, is that AMD makes really good hardware but this is typically masked by poor drivers.
You can see this by looking at their excellent performance in compute workloads where the code in the driver is more recent and doesn't have the legacy cruft of their d3d/ogl code.
Despoiler - Thursday, February 25, 2016 - link
It's not their drivers. It's purely architectural. GCN moved their schedulers into to hardware. GCN requires the API to be able to feed it enough work. What people have been calling "driver overhead" is nothing of the sort. DX11 is just not capable of fully utilizing AMD hardware. DX12 is and that is why AMD created Mantle. It forced MS to create DX12 and that set off the creation of Vulkan. All of the next gen APIs are tailored to exploit AMDs already being sold hardware.