ARM Announces Mali-G51 Mainstream GPU, Mali-V61 Video Processing Block
by Ryan Smith on October 31, 2016 9:00 PM EST
These days ARM and its customers are in the midst of a major evolution in GPU design. Back in May the company announced their new Bifrost GPU architecture, a new and modern architecture for future GPUs. With Bifrost ARM would be taking a leap that we’ve seen many other GPU vendors follow over the years, replacing an Instruction Level Parallelism (ILP)-centric GPU design with a modern, scalar, thread level parallelism (TLP)-centric design that’s a better fit for modern workloads.
The first of these new Bifrost GPUs was introduced at the same time, and that was Mali-G71. However as our regular readers likely know, ARM doesn’t stop with just a single GPU design; rather they have multiple designs for their partners to use, running the gamut from high performance cores to area efficient cores. Mali-G71 was the former, and now this week ARM is introducing the latter with the release of the Mali-G51 design.
If Mali-G71 was the successor to the Mali-T880, then Mali-G51 is the successor to the Mali-T820 & T830. That is to say, it’s a mainstream part that has been optimized for performance within a given area – when SoC space and/or cost is at a premium – as opposed to G71’s greater total throughput. Broadly speaking, mainstream parts like Mali-G51 end up in equally mainstream SoCs like the Exynos 7870 (Galaxy A-series), as opposed to flagship-level SoCs like the Exynos 8890 (Galaxy S7). And along those lines, somewhat surprisingly, ARM is rather keen on talking about the VR market in conjunction with G51, even though it’s not their high-performance GPU design. Even G51, they’re confident, can offer good VR performance for the kinds of admittedly simpler workloads they have in mind.
Meanwhile at a technical level, rather than just being a cut-down version of Mali-G71, Mali-G51 is an interesting GPU design in its own right. ARM has opted to go with a continuous development cycle for the Mali-G series, which means that each GPU is in essence branched off of the ongoing Mali design process when a new design is needed. That means besides market-specific optimizations, successive GPUs can contain features not found in earlier GPUs under the same brand, and that’s definitely the case for G51.
So what sets G51 apart from G71? From the area efficiency perspective, the big change here is that ARM has reworked the shader cores to offer what they call a “dual pixel” design, as opposed to G71’s “single pixel” design. In brief, per clock a G71 shader core could process 24 FLOPS (12 FMAs) over its three execution engines, while its texture and blending units could process 1 texel and 1 pixel respectively. G51, by contrast, has adjusted the throughput ratio to more heavily favor pixel/texel throughput; a G51 shader core has the same 24 FLOPS throughput, but couples that with 2 texels and 2 pixels per clock. ARM did something similar in previous Mali Midgard generations – varying the number of ALUs – and the reason to do so is fairly straightforward, as advanced graphical effects are traditionally more shader-heavy than pixel-heavy. The end result is that for simpler workloads such as application UIs, the need for shader throughput tends to scale down more rapidly in the mobile space.
ARM Mali G Series
| | Mali-G71 | Mali-G51 |
| Role | High Performance | Area Efficient |
| Core Configurations | 4-32 | N/A |
| ALU Lanes Per Core (Default) | 12 | 12 |
| Texture Units Per Core | 1 | 2 |
| Pixel Units Per Core | 1 | 2 |
| FLOPS:Pixel Ratio | 24:1 | 12:1 |
| APIs | OpenGL ES 3.2, OpenCL 2.0, Vulkan | OpenGL ES 3.2, OpenCL 2.0, Vulkan |
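To make the FLOPS:Pixel ratio in the table concrete, here is a minimal Python sketch of the arithmetic. It uses only the per-core figures quoted above (12 FMA lanes, with each FMA counted as 2 FLOPS, plus the per-clock texel and pixel rates). The `ShaderCore` class and its field names are hypothetical and purely illustrative; they are not part of any ARM tool or API.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class ShaderCore:
    """Per-clock throughput of one shader core (illustrative model only)."""
    name: str
    fma_lanes: int          # ALU lanes per core, each issuing one FMA per clock
    texels_per_clock: int   # texture unit throughput
    pixels_per_clock: int   # blending (pixel) unit throughput

    @property
    def flops_per_clock(self) -> int:
        # A fused multiply-add counts as two floating point operations.
        return self.fma_lanes * 2

    def flops_per_pixel(self) -> Fraction:
        # The FLOPS:Pixel ratio from the table, reduced to "N:1" form.
        return Fraction(self.flops_per_clock, self.pixels_per_clock)


# Figures taken from the article's table.
G71 = ShaderCore("Mali-G71", fma_lanes=12, texels_per_clock=1, pixels_per_clock=1)
G51 = ShaderCore("Mali-G51", fma_lanes=12, texels_per_clock=2, pixels_per_clock=2)

if __name__ == "__main__":
    for core in (G71, G51):
        print(f"{core.name}: {core.flops_per_clock} FLOPS/clock, "
              f"{core.flops_per_pixel()}:1 FLOPS:pixel")
```

Both cores have the same ALU throughput in this model; only the denominator changes, which is why G51's ratio is half that of G71.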
And while the dual pixel core is the biggest change for G51, it’s not the only change. By being based on a newer iteration of Bifrost, it includes a few notable, low-level tweaks to improve performance. Transcendental performance has been significantly improved; it turns out those operations are still used more often than ARM expected, so G51 bakes in better support to maintain higher performance. There are also some outright new instructions on G51, and ARM’s framebuffer compression technology has been improved as well. Version 1.2 of AFBC implements some optimizations for better memory traffic shaping and burst lengths, as well as an improvement for constant color blocks.
Overall, ARM is touting that G51 offers significant improvements to performance, density, and energy efficiency relative to the Mali-T830. On equal processes, G51 offers a mix of being 30% smaller than T830, 60% better performance per mm², and 60% higher performance per watt. I’m told area efficiency was the primary goal in the design, making the latter a pleasant surprise of sorts.
Finally, like ARM’s other GPU IP announcements, this week’s announcement is about making the technology available to the company’s partners for implementation, rather than being a consumer-oriented announcement. ARM’s partners are already looking at early versions of the G51 design, and based on typical product development cycles, G51 should be showing up in devices in 2018.
Mali-V61
Meanwhile on a quick note, alongside the Mali-G51 GPU, ARM is also announcing the Mali-V61 video processor. This is the product formerly known as Egil, which ARM unveiled back in June while it was still under development. Now, along with G51, V61 is being released to ARM’s partners as well.
V61/Egil has not significantly changed since we last saw it. ARM’s fully modernized video encode and decode block follows a who’s who list of codecs and features, supporting 10-bit HEVC encode/decode and 10-bit VP9 encode/decode. Relative to the V550 before it, ARM’s latest video processor supports a wider range of codecs, and now, having a full-feature HEVC encoder implementation, offers much better HEVC compression as well.
Ultimately ARM is looking to sell Mali-V61 alongside Mali-G51 and their DP650 display processor as a complete graphics solution to partners, which they call the Mali Multimedia Suite (though it can be used standalone as well). And like Mali-G51, expect to see Mali-V61 start showing up in devices around a year from now.
23 Comments
Meteor2 - Tuesday, November 1, 2016
'VP9 achieving similar quality to HEVC' -- that's a bold statement. Or perhaps ARM means their implementation of the codecs?!
tuxRoller - Monday, November 7, 2016
If memory serves, at the big codec conference a few months back, Netflix released data indicating that hevc averaged 20% efficiency over vp9 BUT that was mostly in the lower res region. So, since Netflix could just deploy vp9 and accept similar results, at low res, to h264 and close results to h265 at high res. It's all on the YouTubes if you want to watch the presentation.
rrohbeck - Tuesday, November 1, 2016
Will there be open drivers?
karthik.hegde - Tuesday, November 1, 2016
Does it continue with the Full system coherency that G71 offered?
fanofanand - Tuesday, November 1, 2016
I find it interesting how differently some companies view minimum VR specs vs other companies. GTX 970 is the minimum for Rift, right? The GTX 970 should be far superior to the G51 by orders of magnitude, yet the G51 is designed for VR? I'm sure there is a VR consortium, they need to get their act together and get some standards in place here. Between HDR and VR lack of standardization is harming the ability to gain momentum towards mass adoption.
Ariknowsbest - Tuesday, November 1, 2016
They should call it moblieVR for the cheap allinone headsets with PowerVR SGX544 or mali-400/450. The G51 would still be far superior to these solutions, and it would probably work well with Daydream.
zodiacfml - Tuesday, November 1, 2016
Dedicated digital cameras will salivate to this kind of processing power. I wonder why even the most expensive digital cameras can't have SoCs of smartphones.
Ariknowsbest - Tuesday, November 1, 2016
Probably power constraints, a controller and ASIC is more efficient than a gpu.
SydneyBlue120d - Tuesday, November 1, 2016
And still, no smartphone will ever use the HEVC encoder ever.
Ryan Smith - Tuesday, November 1, 2016
And thus VP9...