One of the key CPU core designs for 2016 (and beyond) is the Cortex-A72. Designed by ARM, it was announced at the beginning of 2015, and during the summer I got a chance to talk with the lead designer, Mike Filippo. Robert Triggs also wrote a deeper analysis of the A72’s core architecture. The Cortex-A72 is ARM’s second-generation 64-bit core design, and ARM wanted to achieve three main goals with it:
- Push the performance up for the next generation of phones and mobile products.
- Pull the power down significantly so that it can sustain maximum frequency performance for longer.
- Reduce the area of the design, which contributes to the reduction in power and also enables low-cost designs.
As with many industries, going from design to product is a long process, and now at the beginning of 2016 we are starting to see the first smartphones with Systems-on-a-Chip (SoCs) using the Cortex-A72. One of the first is the Huawei Mate 8 with its Kirin 950 processor.
The Kirin 950 is an octa-core processor that includes four Cortex-A72 cores clocked at 2.3GHz, four Cortex-A53 cores clocked at 1.8GHz, an ARM Mali-T880 GPU and Huawei’s i5 co-processor. It is built on a 16nm FinFET+ process node and is said to be 30% more efficient than the Kirin 930. According to Huawei this means that the CPU uses at least 20% less power and has 11% higher performance than ARM’s previous generation of core design.
As for the GPU, the Mali T880 is ARM’s latest generation of GPU which offers up to 1.8x the performance of the 2014 Mali T760 GPU, while boasting up to a 40% energy reduction. As well as the CPU and GPU, the Kirin 950 also includes the i5 co-processor. It supports all the functions of a sensor hub as well as speech recognition, MP3 playback, and Fused Location Provider (FLP) navigation.
So this is all great in theory: ARM designed a faster, more efficient CPU core and Huawei turned that design into a faster, more power-efficient chip. But what about the real world? How does it perform?
I recently got my hands on a Huawei Mate 8 and I have been running a large variety of tests on the phone to see what kind of performance levels this latest generation of SoC can deliver.
The standard benchmarks
Here is a table of the CPU focused benchmarks, alongside the scores for the Exynos 7420 (as found in the Note 5) and the Snapdragon 810 (as found in the Sony Z5 Compact):
| Device | AnTuTu | CPU Prime Benchmark | Geekbench |
|---|---|---|---|
| Huawei Mate 8 (Kirin 950) | | 31108 | |
| Note 5 (Exynos 7420) | | 22862 | |
| Sony Z5 Compact (Snapdragon 810) | | 20771 | |
As we can see the Cortex-A72 in the Kirin 950 performs excellently. The AnTuTu, CPU Prime Benchmark and Geekbench scores are all higher than the Exynos 7420 and the Snapdragon 810, both of which have Cortex-A57 cores. Of particular interest is the increase in the single-core performance scores from Geekbench.
But what about the GPU? Do we see similar gains? Here is a table of the GPU test results, along with the comparison results:
| Device | Epic Citadel (Ultra High Quality mode) | 3DMark - Sling Shot (using ES 3.1) | 3DMark - Ice Storm Unlimited (ES 2.0) |
|---|---|---|---|
| Huawei Mate 8 (Kirin 950) | 59 fps at 1800 x 1080 | 923 | 19026 |
| Note 5 (Exynos 7420) | 49.2 fps at 2560 x 1440 | 1278 | 25073 |
| Sony Z5 Compact (Snapdragon 810) | 58.5 fps at 1200 x 720 | 1168 | 27160 |
So while the CPU part of the Kirin 950 is clearly leading the way, it seems that the GPU is actually slightly behind. I don’t know if this is a software optimization issue or an implementation issue that is particular to the Kirin 950, but I was expecting more from the Mali-T880.
More like the real world
| Device | Kraken (lower is better) | Google Octane |
|---|---|---|
| Huawei Mate 8 (Kirin 950) | 3524 | |
| Note 5 (Exynos 7420) | 3753 | |
| Sony Z5 Compact (Snapdragon 810) | 4253 | |
Like the CPU tests earlier, here we can yet again see the improvements that the Cortex-A72 brings when compared to the Cortex-A57. The Mate 8 is faster for both Kraken and Octane when compared to the Cortex-A57 based processors.
To make sure that everything is fair, I have also written my own benchmarks. I use these mainly to check that the results I am getting from the popular testing apps are genuine. The first of my custom benchmarks tests the CPU without using the GPU. It is a four stage test that first calculates 100 SHA1 hashes on 4K of data, then it performs a large bubble sort on an array of 9000 items. Thirdly, it shuffles a large table one million times, and lastly it calculates the first 10 million primes. The total time needed to do all those things is displayed at the end of the test run. The results are below in the “Hashes, bubble sorts, tables and primes” column. Note that lower is better for this test.
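As a rough illustration only, a four-stage test along these lines could be sketched in Python. This is not the code of the benchmark app itself: the function names, the table size, the fixed seeds and the reading of "shuffles a table one million times" as one million random swaps are all assumptions, and the prime stage defaults to a much smaller count than 10 million so that the sketch finishes in reasonable time in an interpreter.

```python
import hashlib
import random
import time


def sha1_stage(rounds=100, size=4096):
    """Compute `rounds` SHA1 hashes over a 4K buffer; return the last digest."""
    data = bytes(size)
    digest = b""
    for i in range(rounds):
        # Append the round number so every hash sees slightly different input.
        digest = hashlib.sha1(data + i.to_bytes(4, "big")).digest()
    return digest


def bubble_sort_stage(n=9000, seed=1):
    """Bubble sort a pseudo-random list of n integers and return it."""
    rng = random.Random(seed)
    items = [rng.randrange(n) for _ in range(n)]
    for end in range(len(items) - 1, 0, -1):
        swapped = False
        for i in range(end):
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
                swapped = True
        if not swapped:  # already sorted, stop early
            break
    return items


def shuffle_stage(swaps=1_000_000, table_size=1024, seed=1):
    """Swap two randomly chosen table entries `swaps` times (assumed meaning)."""
    rng = random.Random(seed)
    table = list(range(table_size))
    for _ in range(swaps):
        a = rng.randrange(table_size)
        b = rng.randrange(table_size)
        table[a], table[b] = table[b], table[a]
    return table


def primes_stage(count=10_000):
    """Return the first `count` primes using trial division by known primes."""
    primes = []
    candidate = 2
    while len(primes) < count:
        is_prime = True
        for p in primes:
            if p * p > candidate:
                break
            if candidate % p == 0:
                is_prime = False
                break
        if is_prime:
            primes.append(candidate)
        candidate += 1
    return primes


def run_benchmark(prime_count=10_000):
    """Run all four stages back to back and return the elapsed seconds."""
    start = time.perf_counter()
    sha1_stage()
    bubble_sort_stage()
    shuffle_stage()
    primes_stage(prime_count)
    return time.perf_counter() - start
```

Calling `run_benchmark()` returns a single total time, which mirrors the "lower is better" figure reported for this test. Timings from an interpreted sketch like this are not comparable with the scores in the results table.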
The second of my three custom benchmarks uses a 2D physics engine to simulate water being poured into a container. The idea here is that while the GPU will be used slightly for the 2D graphics, most of the work will be carried out by the CPU. The complexity of so many droplets of water will exercise the CPU. One drop of water is added every frame and the app is designed to run at 60 frames per second. The benchmark measures how many droplets are actually processed and how many are missed. The maximum score is 5400.
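The scoring arithmetic can be sketched as follows. With one droplet per frame at 60 frames per second, a maximum score of 5400 implies a run of about 90 seconds; that run length is inferred from the numbers above rather than stated, and the helper names are hypothetical.

```python
TARGET_FPS = 60    # frames per second the app is designed to run at
MAX_SCORE = 5400   # one droplet per frame over the whole run


def run_length_seconds(max_score=MAX_SCORE, fps=TARGET_FPS):
    """Run length implied by one droplet per frame (an inference)."""
    return max_score / fps


def missed_droplets(score, max_score=MAX_SCORE):
    """Droplets that were not processed during the run."""
    return max_score - score


def score_percent(score, max_score=MAX_SCORE):
    """Score as a percentage of the maximum."""
    return 100.0 * score / max_score


print(run_length_seconds())   # 90.0
print(missed_droplets(5222))  # 178
```

A score of 5222, for example, means 178 droplets were missed, which is still roughly 96.7% of the maximum.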
My third benchmark is written in Unity3D. It is a terrain flyover that yields a frames-per-second score for a pre-programmed pass over the rendered world.
| Device | Hashes, bubble sorts, tables and primes (lower is better) | Water simulation (best score is 5400) | Terrain 4 |
|---|---|---|---|
| Huawei Mate 8 (Kirin 950) | 19074 | 5400 | 3543 total frames, 22.83 fps |
| Note 5 (Exynos 7420) | 30370 | 5349 | 3432 total frames, 21.48 fps |
| Sony Z5 Compact (Snapdragon 810) | 22937 | 5222 | 4800 total frames, 42.22 fps |
As we can see, the Kirin 950 performs better than the other two devices in the hashes, bubble sorts, tables and primes test. In fact the Kirin 950 completes this particular test in 37% less time than the Exynos 7420. The Note 5 held the record for my water simulation benchmark until the Mate 8 came along. The Exynos 7420 scores 5359, just slightly shy of the maximum score, whereas the Mate 8 hits the jackpot. This is great news for Huawei, but terrible news for me, as it means I will need to rewrite the benchmark for 2016’s flagship devices!
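For reference, the 37% figure can be reproduced from the table values for this lower-is-better test. The sketch below uses hypothetical helper names and also expresses the same gap as a speed-up ratio, which is a different way of stating the same result.

```python
def time_reduction_percent(fast, slow):
    """How much less time the faster device needs, as a percentage."""
    return 100.0 * (slow - fast) / slow


def speedup(fast, slow):
    """How many times faster the faster device is."""
    return slow / fast


KIRIN_950 = 19074
EXYNOS_7420 = 30370
SNAPDRAGON_810 = 22937

for name, score in [("Exynos 7420", EXYNOS_7420), ("Snapdragon 810", SNAPDRAGON_810)]:
    print(f"vs {name}: {time_reduction_percent(KIRIN_950, score):.1f}% less time, "
          f"{speedup(KIRIN_950, score):.2f}x")
# vs Exynos 7420: 37.2% less time, 1.59x
# vs Snapdragon 810: 16.8% less time, 1.20x
```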
As for the Unity3D test, the Sony Z5 Compact comes out on top due to its 720p screen resolution. It is followed by the Mate 8 and then the Note 5. However it is worth noting that the Mate 8 has a screen resolution of 1920 x 1080, which is lower than the Note 5’s 2560 x 1440. This means that if the Kirin 950 were driving a display akin to the Note 5’s, it would likely be slower than the Note 5 overall.
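One crude way to put numbers on that is to normalize the frame rates by pixel count. The sketch below assumes that the test renders at the full panel resolution, that 720p means 1280 x 720, and that frame rate scales inversely with the number of pixels drawn. None of these is guaranteed in practice, so treat the output as a back-of-the-envelope estimate rather than a measurement.

```python
# Panel resolutions and Terrain 4 frame rates as quoted in the article.
DEVICES = {
    "Mate 8": {"width": 1920, "height": 1080, "fps": 22.83},
    "Note 5": {"width": 2560, "height": 1440, "fps": 21.48},
    "Z5 Compact": {"width": 1280, "height": 720, "fps": 42.22},  # 720p assumed
}


def pixels(device):
    """Total pixels per frame at the panel resolution."""
    return device["width"] * device["height"]


def pixels_per_second(device):
    """Pixels drawn per second, a rough resolution-independent figure."""
    return pixels(device) * device["fps"]


def scaled_fps(device, target):
    """Estimated fps if `device` drove `target`'s resolution (inverse scaling assumed)."""
    return device["fps"] * pixels(device) / pixels(target)


for name, dev in DEVICES.items():
    print(f"{name}: {pixels_per_second(dev) / 1e6:.1f} Mpixels/s")

estimate = scaled_fps(DEVICES["Mate 8"], DEVICES["Note 5"])
print(f"Mate 8 at 2560 x 1440 (estimate): {estimate:.2f} fps")
```

Under these assumptions the Mate 8 would manage roughly 12.8 fps at the Note 5’s resolution, compared with the Note 5’s measured 21.48 fps.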
So what does this all mean? Firstly we can see that the CPU part of the Kirin 950 has pushed the performance envelope to new heights and clearly the Cortex-A72 is a significant improvement over the Cortex-A57. However the Kirin 950 seems to be weaker than expected on the GPU side. We won’t know if this is a software optimization issue, or an implementation issue until either Huawei releases some software updates for the Mate 8, or we see other SoCs using the Mali-T880 but with better performance.
Overall it is safe to say that the next generation of mobile SoCs is upon us and that these chips are faster, leaner and more efficient!