Mali-T860 GPU headlines new range of integrated media designs from ARM

by: Gary Sims, October 27, 2014

ARM Announces New Suite of Integrated Media IP

In what is turning into an annual event, ARM has announced its next generation of integrated media processor designs, including its new flagship GPU, the Mali-T860. At the end of October 2013 ARM unveiled two new members of its Mali family, the Mali-T720 and the Mali-T760. Now, a year on, the company has announced their successors along with a third GPU design plus two other media-related processors.


The five new media processors are: the Mali-T820 GPU, the Mali-T830 GPU, the Mali-T860 GPU, the Mali-V550 video decoder, and the Mali-DP550 display processor. As you would expect, all these designs are faster and offer more functionality than their predecessors, while remaining within the tight thermal budget needed for smartphones and tablets.

ARM’s Media Processing Division is a big part of the company’s business. It works with over 60 partners, who between them have 100 Mali licences, to integrate Mali GPUs and other Mali processors into System-on-a-Chip (SoC) designs along with ARM-based CPUs. At the moment, Mali is the number one GPU used on Android devices, and ARM’s partners shipped over 400 million chips with Mali GPUs during 2013.

ARM Mali-T860


The Mali-T860 builds on the previous generation of Mali GPUs and contains the same number of shaders as the Mali-T760. However, the T860 (and in fact the T820 and T830) incorporates bandwidth reduction technologies such as transaction elimination, smart composition, ASTC, and pixel local storage. This results in an overall performance increase. According to ARM, the Mali-T860 is 45 percent faster than the Mali-T628 when used in the same configuration and manufactured using the same process.

The Mali-T860 also supports native 10-bit YUV input and output. This is important for devices needing high fidelity content for 4K (and greater) displays. YUV is a system for defining colors and is different from the RGB (Red, Green, Blue) system. YUV is used by broadcast TV and defines colors based on luminance and chrominance, i.e. brightness and color. Y is the luminance (brightness) component and the U and V are the chrominance (color) components. By altering the values of Y, U and V each pixel can be defined in terms of brightness, color, and tint.
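To make the luminance/chrominance split concrete, here is a minimal sketch of an RGB to YUV conversion followed by 10-bit quantization. It assumes the BT.709 luma weights and the usual narrow-range 10-bit code values; the article does not say which standard or range the Mali-T860 uses, and the function names are purely illustrative.

```python
# Assumed BT.709 luma weights; the article does not name a specific standard.
KR, KG, KB = 0.2126, 0.7152, 0.0722


def rgb_to_yuv(r, g, b):
    """Convert normalized RGB (0.0-1.0) to Y in [0, 1] and U, V in [-0.5, 0.5]."""
    y = KR * r + KG * g + KB * b      # luminance: weighted sum of R, G, B
    u = (b - y) / (2 * (1 - KB))      # chrominance: scaled blue difference
    v = (r - y) / (2 * (1 - KR))      # chrominance: scaled red difference
    return y, u, v


def quantize_10bit(y, u, v):
    """Map to assumed narrow-range 10-bit codes (Y: 64-940, U/V: 64-960)."""
    return (
        round(64 + 876 * y),
        round(512 + 896 * u),
        round(512 + 896 * v),
    )


if __name__ == "__main__":
    for name, rgb in [("black", (0, 0, 0)), ("white", (1, 1, 1)), ("red", (1, 0, 0))]:
        print(name, quantize_10bit(*rgb_to_yuv(*rgb)))
```

With 10 bits per component there are 1024 code values instead of the 256 available at 8 bits, which is why 10-bit YUV matters for smooth gradients on high resolution displays.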

The Mali-T860 also supports an impressive range of graphics and compute APIs:

  • OpenGL ES 3.1/3.0/2.0/1.1
  • OpenCL 1.2/1.1
  • Microsoft Windows compliant DirectX 11.1
  • RenderScript Compute

ARM Mali-T820-T830

Mali-T820 and Mali-T830

The next two chips in ARM’s new lineup are the Mali-T820 and the Mali-T830. The two GPUs are very similar but with one important difference. Both offer up to four shaders and include the same bandwidth reduction technologies as the Mali-T860. Both can optionally support 10-bit YUV (at the silicon maker’s discretion) and both support the same set of graphics and compute APIs:

  • OpenGL ES 1.1, 2.0 and 3.1
  • OpenCL 1.1, 1.2
  • DirectX 11 FL9_3
  • RenderScript Compute

Compared to the Mali-T860, the difference in APIs is that the T830 and T820 only support DirectX 11 FL9_3 and not DirectX 11.1. However, this is hardly a problem for Android users!

The difference between the Mali-T820 and the Mali-T830 is that the latter has two ALU cores per shader (like the T860) whereas the T820 only has one. In other words, the T860 can scale up to 32 ALU cores, the T830 can handle up to 8 ALU cores, and the T820 is designed for a maximum of 4 ALU cores. According to ARM, the T830 is ideal for applications that need a cost-effective GPU which includes reasonable GPU compute capabilities.
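The ALU totals above are just shader count multiplied by ALUs per shader. The short sketch below spells that out. The T820 and T830 figures come straight from the text; the T860 maximum of 16 shaders is an inference from its 32-ALU ceiling and two ALUs per shader, not a number quoted in the article.

```python
# (maximum shader cores, ALUs per shader core) for each GPU.
# T820/T830 values are stated in the article; the T860 shader maximum of 16
# is inferred from "up to 32 ALU cores" at two ALUs per shader.
GPUS = {
    "Mali-T820": (4, 1),
    "Mali-T830": (4, 2),
    "Mali-T860": (16, 2),
}


def max_alus(name):
    """Return the maximum ALU count for a fully scaled configuration."""
    shaders, alus_per_shader = GPUS[name]
    return shaders * alus_per_shader


if __name__ == "__main__":
    for name in GPUS:
        print(name, max_alus(name))
```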


The Mali-V550, the Mali-DP550, and the software stack

Alongside the new GPUs, ARM has announced a new video decoder and a new display processor. The Mali-V550 is ARM’s first video decoder to include HEVC (H.265) hardware encoding and decoding in a single core. As well as H.265, the processor can also do hardware decoding and encoding of H.264, MPEG-4, VP8, VC-1, H.263 and Real.

A single core on this little beast can handle full HD (1080p) at 60 frames per second. When built in an eight-core configuration, the processor can handle 4K at 120 frames per second. All this comes with full 10-bit YUV support and AFBC bandwidth saving. ARM has also built in some clever tech that can handle bus latency without dropping frames. What that means is that OEMs can use slower (i.e. cheaper) memory subsystems and the video decoder will continue working even when the data isn’t presented to the decoder at the optimum moment.
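The single-core and eight-core figures are consistent with throughput that scales linearly with core count, which a quick pixel-rate calculation shows. This sketch assumes "4K" means 3840x2160 (UHD) and "full HD" means 1920x1080; the article gives neither resolution explicitly.

```python
def pixel_rate(width, height, fps):
    """Pixels per second for a given resolution and frame rate."""
    return width * height * fps


# Assumed resolutions: full HD = 1920x1080, 4K = 3840x2160 (UHD).
FULL_HD_60 = pixel_rate(1920, 1080, 60)
UHD_4K_120 = pixel_rate(3840, 2160, 120)

# How many single-core workloads fit into the 4K at 120 fps workload.
cores_needed, remainder = divmod(UHD_4K_120, FULL_HD_60)

if __name__ == "__main__":
    print(FULL_HD_60, UHD_4K_120, cores_needed, remainder)
```

4K has four times the pixels of 1080p and 120 fps is twice 60 fps, so the workload is eight times larger, matching the eight-core configuration.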

The new display processor, the Mali-DP550, brings energy efficient processing all the way to the glass! It can handle composition, rotation, scaling, post-processing and display output in a single pass. There is also support for seven-layer composition, and the processor can be scaled up to handle 4K displays. The new display processor also allows OEMs to work directly with the internal display pipeline via a co-processor interface. This allows designers to add new third-party enhancements like noise reduction or backlight adjustments without needing to bypass or circumvent the display processor.


A large part of what ARM offers its partners is not actually hardware designs, but software. It is all very nice having a powerful new SoC with the latest Mali GPU, but if it doesn’t play well with Android then it is just as much use as a vacuum cleaner in an operating room. Each SoC needs an optimized driver stack which sits between the high level Android system calls and the hardware. Since this hardware is made up of a GPU, a video processor and a display processor, the driver stack needs to be able to make intelligent decisions about where to offload certain tasks. There is also the interaction with the various Linux kernel modules and the memory subsystem.

By providing an integrated software stack, ARM saves OEMs lots of time and money in the development of drivers for their SoCs, plus it ensures that the drivers are fully optimized and offer the best power efficiency.

Who and when?

The designs for the various processors are already with ARM’s partners. These new processors work equally well with ARM’s 32-bit Cortex CPU designs (e.g. Cortex-A15, Cortex-A17, Cortex-A7) and with its 64-bit Cortex CPU designs (e.g. Cortex-A53 and Cortex-A57). ARM anticipates that we will see silicon with the new GPUs sometime around the middle of 2015, and devices should start to appear towards the end of 2015 and the beginning of 2016.

  • Richard

    Note 5 Exynos with Intel LTE and Mali T860 in the USA I can hope.

    • Anonymousfella

      Sounds exciting. Exynos 7 64bit with integrated LTE and either a Mali T-860 or a PowerVR 6 series GPU

    • 1234

      Intel LTE. Dream biatch. Qualcomm is leader in LTE

  • renz

    no full directx 12.0 support?

  • ConCal

    I love these in depth articles about upcoming tech/hardware! Keep it up!

  • Anonymousfella

    Is the T-860 faster than the GPU in Nvidia’s K1? Any idea?

    • Hippotatomus

      The t760mp16 is around 50% more powerful than gx1 in tegra k1.

      So the t860 will definitely be more powerful.

    • renz

      definitely faster. i heard even T760 in 16 core configuration will be faster than K1. but judging from the expected time T860 will hit consumer products, T860 will most likely go head to head with nvidia erista.

  • Roberto Tomás

    “This results in an overall performance increase. According to ARM the Mali-T860 is 45 percent faster than the Mali-T628 when used in the same configuration and manufactured using the same process.” it sounds like the big improvement is really just as much the APIs then, because the T760 was already very much faster ( about ~20% per core) than the T628.