ARM Announces Mali-G51 Mainstream GPU, Mali-V61 Video Processing Block
by Ryan Smith on October 31, 2016 9:00 PM EST
These days ARM and its customers are in the midst of a major evolution in GPU design. Back in May the company announced Bifrost, its new architecture for future GPUs. With Bifrost, ARM is taking a leap that we’ve seen many other GPU vendors make over the years, replacing an Instruction Level Parallelism (ILP)-centric GPU design with a modern, scalar, thread level parallelism (TLP)-centric design that’s a better fit for modern workloads.
The first of these new Bifrost GPUs was introduced at the same time, and that was Mali-G71. However as our regular readers likely know, ARM doesn’t stop with just a single GPU design; rather they have multiple designs for their partners to use, running the gamut from high performance cores to area efficient cores. Mali-G71 was the former, and now this week ARM is introducing the latter with the release of the Mali-G51 design.
If Mali-G71 was the successor to the Mali-T880, then Mali-G51 is the successor to the Mali-T820 & T830. That is to say, it’s a mainstream part that has been optimized for performance within a given area – when SoC space and/or cost is at a premium – as opposed to G71’s greater total throughput. Broadly speaking, mainstream parts like Mali-G51 end up in equally mainstream SoCs like the Exynos 7870 (Galaxy A-series), as opposed to flagship-level SoCs like the Exynos 8890 (Galaxy S7). And along those lines, somewhat surprisingly, ARM is rather keen on talking about the VR market in conjunction with G51, even though it’s not their high-performance GPU design. Even G51, they’re confident, can offer good VR performance for the kinds of admittedly simpler workloads they have in mind.
Meanwhile at a technical level, rather than just being a cut-down version of Mali-G71, Mali-G51 is an interesting GPU design in its own right. ARM has opted to go with a continuous development cycle for the Mali-G series, which means that each GPU is in essence branched off of the ongoing Mali design process when a new design is needed. That means besides market-specific optimizations, successive GPUs can contain features not found in earlier GPUs under the same brand, and that’s definitely the case for G51.
So what sets G51 apart from G71? From the area efficiency perspective, the big change here is that ARM has reworked the shader cores to offer what they call a “dual pixel” design, as opposed to G71’s “single pixel” design. In brief, per clock a G71 shader core could process 24 FLOPS (12 FMAs) over its three execution engines, while its texture and blending units could process 1 texel and 1 pixel respectively. G51, by contrast, has adjusted the throughput ratio to more heavily favor pixel/texel throughput; a G51 shader core has the same 24 FLOPS throughput, but couples that with 2 texels and 2 pixels per clock. ARM did something similar in previous Mali Midgard generations – varying the number of ALUs – and the reason to do so is fairly straightforward, as advanced graphical effects are traditionally more shader-heavy than pixel-heavy. The end result is that for simpler workloads such as application UIs, the need for shader throughput tends to scale down more rapidly in the mobile space.
ARM Mali G Series
| | Mali-G71 | Mali-G51 |
| Role | High Performance | Area Efficient |
| Core Configurations | 4-32 | N/A |
| ALU Lanes Per Core (Default) | 12 | 12 |
| Texture Units Per Core | 1 | 2 |
| Pixel Units Per Core | 1 | 2 |
| FLOPS:Pixel Ratio | 24:1 | 12:1 |
| APIs | OpenGL ES 3.2, OpenCL 2.0, Vulkan | OpenGL ES 3.2, OpenCL 2.0, Vulkan |
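The FLOPS:Pixel ratios quoted for the two cores follow from simple per-clock arithmetic, which the short Python sketch below reproduces. The `ShaderCore` class and its field names are hypothetical, chosen only for illustration; the inputs are the per-core figures given in this article, counting one FMA as two floating-point operations.

```python
from dataclasses import dataclass

# One fused multiply-add (FMA) is counted as two floating-point operations.
FLOPS_PER_FMA = 2


@dataclass(frozen=True)
class ShaderCore:
    """Hypothetical container for the per-core, per-clock figures in the article."""
    name: str
    fma_lanes: int         # ALU lanes per core
    texels_per_clock: int  # texture unit throughput
    pixels_per_clock: int  # blending (pixel) unit throughput

    @property
    def flops_per_clock(self) -> int:
        return self.fma_lanes * FLOPS_PER_FMA

    @property
    def flops_per_pixel(self) -> float:
        return self.flops_per_clock / self.pixels_per_clock


G71 = ShaderCore("Mali-G71", fma_lanes=12, texels_per_clock=1, pixels_per_clock=1)
G51 = ShaderCore("Mali-G51", fma_lanes=12, texels_per_clock=2, pixels_per_clock=2)

for core in (G71, G51):
    print(f"{core.name}: {core.flops_per_clock} FLOPS/clock, "
          f"{core.flops_per_pixel:.0f}:1 FLOPS:pixel")
```

Both cores land on the same 24 FLOPS per clock; only the pixel and texel side doubles, which is what halves the ratio for G51.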
And while the dual pixel core is the biggest change for G51, it’s not the only change. Because it is based on a newer iteration of Bifrost, it includes a few notable, low-level tweaks to improve performance. Transcendental performance has been significantly improved; it turns out those operations are still used more often than ARM expected, so G51 bakes in better support to maintain higher performance. There are also some outright new instructions on G51, and ARM’s framebuffer compression technology has been improved as well. Version 1.2 of AFBC implements some optimizations for better memory traffic shaping and burst lengths, as well as an improvement for constant color blocks.
Overall, ARM is touting that G51 offers significant improvements to performance, density, and energy efficiency relative to the Mali-T830. On equal processes, G51 offers a mix of 30% smaller area than T830, 60% better performance per mm², and 60% higher performance per watt. I’m told area efficiency was the primary goal in the design, making the latter a pleasant surprise of sorts.
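As a rough back-of-the-envelope reading of those figures, the sketch below multiplies relative area by relative performance per mm² to get an implied relative performance. This is an assumption-laden illustration, not a number ARM has stated: it assumes both figures describe the same G51 and T830 configurations on the same process, which a “mix” of claims does not guarantee. The function name is hypothetical.

```python
def relative_performance(area_ratio: float, perf_per_area_ratio: float) -> float:
    """Performance = area x (performance / area), all expressed relative to T830."""
    return area_ratio * perf_per_area_ratio


# Figures quoted in the article, relative to Mali-T830 = 1.0.
g51_area = 1.0 - 0.30          # 30% smaller
g51_perf_per_mm2 = 1.0 + 0.60  # 60% better performance per mm^2

# ASSUMPTION: both ratios apply to the same configuration at the same time.
g51_perf = relative_performance(g51_area, g51_perf_per_mm2)
print(f"Implied G51 performance relative to T830: {g51_perf:.2f}x")
```

Under that assumption, the headline gain comes mostly from fitting similar performance into less silicon rather than from a large jump in absolute throughput.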
Finally, like ARM’s other GPU IP announcements, this week’s announcement is about making the technology available to the company’s partners for implementation, rather than being a consumer-oriented announcement. ARM’s partners are already looking at early versions of the G51 design, and based on typical product development cycles, G51 should be showing up in devices in 2018.
Mali-V61
Meanwhile on a quick note, alongside the Mali-G51 GPU, ARM is also announcing the Mali-V61 video processor. This is the product formerly known as Egil, which ARM unveiled back in June while it was still under development. Now, along with G51, V61 is being released to ARM’s partners as well.
V61/Egil has not significantly changed since we last saw it. ARM’s fully modernized video encode and decode block supports a who’s who list of codecs and features, including 10-bit HEVC encode/decode and 10-bit VP9 encode/decode. Relative to the VP550 before it, ARM’s latest video processor supports a wider range of codecs, and now, having a full-feature HEVC encoder implementation, offers much better HEVC compression as well.
Ultimately ARM is looking to sell Mali-V61 alongside Mali-G51 and their DP650 display processor as a complete graphics solution to partners, which they call the Mali Multimedia Suite (though it can be used standalone as well). And like Mali-G51, expect to see Mali-V61 start showing up in devices around a year from now.
23 Comments
tuxRoller - Sunday, November 6, 2016
Vp9 is good enough, and vc1 shouldn't be much longer.
webdoctors - Tuesday, November 1, 2016
Can someone translate this chip into English? Will it run Crysis?
darkich - Wednesday, November 2, 2016
Well, it could probably run it on low and in 720p. It's about as powerful as last generation consoles