The Cortex-A72 was announced back in February, promising another boost to performance and substantial energy savings to boot. At ARM’s TechDay 2015 in London this week, we were fortunate enough to be given some deeper insight into the inner workings of ARM’s latest application processor.
Although the baseline architecture is very similar to the Cortex-A57, the A72 is much more than a typical revision. A team of some 65 to 70 engineers has gone back through the design, optimizing almost every logical block for power efficiency, helping the processor sustain maximum frequencies during heavy workloads, and squeezing the design into a smaller area to keep costs down.
Architecturally, the Cortex-A72 features a new branch predictor, increased effective decode and dispatch bandwidths, and changes to the execution units, to name just a few alterations. ARM's new branch predictor reduces mispredictions with a new algorithm and can suppress superfluous branch predictor accesses, which helps to reduce wasted energy. The rebuild offers up to a 20 percent improvement in prediction over the A57.
The design still features a 3-wide decode, but the dispatch unit has gone from 3- to 5-wide, so that it can more effectively break operations down into micro-ops, which helps keep the 8-wide issue machine well fed. The execution stage sees the introduction of next-gen floating-point SIMD units with a variety of latency reductions, multiple zero-cycle forwarding datapaths to reduce wasted cycles, and substantial bandwidth increases in the two integer units. The load and store units have a more sophisticated combined L1/L2 data prefetcher, offering a bandwidth improvement of 30 percent. All of these changes, among others, are designed to help reduce power consumption and to improve performance in certain areas over the A57.
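To see why a wider dispatch stage matters when decode stays at 3-wide, here is a minimal toy model. It is only a sketch under loud assumptions: the micro-op counts per instruction are invented for illustration, the function name `cycles_to_dispatch` is hypothetical, and the model ignores everything else in a real pipeline (issue queues, dependencies, stalls). It is not a description of ARM's actual design.

```python
def cycles_to_dispatch(uops_per_instr, decode_width, dispatch_width):
    """Count cycles until every micro-op has left the dispatch stage.

    uops_per_instr: list of ints, micro-ops produced by each instruction
                    (made-up values, purely illustrative).
    decode_width:   instructions decoded per cycle.
    dispatch_width: micro-ops dispatched per cycle.
    """
    pending = 0      # micro-ops decoded but not yet dispatched
    next_instr = 0   # index of the next instruction to decode
    cycles = 0
    while next_instr < len(uops_per_instr) or pending > 0:
        # Dispatch stage: send out micro-ops decoded in earlier cycles.
        pending -= min(pending, dispatch_width)
        # Decode stage: accept up to decode_width instructions this cycle.
        group = uops_per_instr[next_instr:next_instr + decode_width]
        pending += sum(group)
        next_instr += len(group)
        cycles += 1
    return cycles


# Hypothetical workload: 12 instructions that expand to 20 micro-ops.
workload = [1, 2, 2] * 4
narrow = cycles_to_dispatch(workload, decode_width=3, dispatch_width=3)
wide = cycles_to_dispatch(workload, decode_width=3, dispatch_width=5)
print(narrow, wide)
```

With this invented workload, the 3-wide dispatch needs 8 cycles and the 5-wide dispatch needs 5, because a 3-wide decode can produce more than three micro-ops per cycle once instructions are split up. The exact numbers mean nothing for real silicon; the point is only the direction of the effect.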
In terms of what this means for silicon designers and end users, the Cortex-A72 is still a high-end processor, but it will use energy more efficiently. In other words, the CPU will be able to do more within the limited power budgets available on mobile, which should result in cooler devices as well. Even at 28nm, the Cortex-A72 boasts up to a 50 percent energy reduction when compared with the Cortex-A15 and a 20 percent saving compared with the A57, at the same clock speeds. Power consumption per core has also dropped compared with the A57, to around 700mW at 2.5GHz. The design takes up 10 percent less area than the A57, which will also help save on costs.
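As a rough way to read the 700mW at 2.5GHz figure, power divided by clock frequency gives the average energy spent per clock cycle. The helper below is a back-of-the-envelope sketch; the function name is hypothetical, and treating the quoted figure as a steady average per core is an assumption.

```python
def energy_per_cycle_pj(power_mw, freq_ghz):
    """Average energy per clock cycle in picojoules.

    1 mW / 1 GHz = 1e-3 W / 1e9 Hz = 1e-12 J, so mW divided by GHz
    gives picojoules per cycle directly.
    """
    return power_mw / freq_ghz


# Figures quoted in the article: roughly 700mW per core at 2.5GHz.
a72_pj = energy_per_cycle_pj(700, 2.5)
print(a72_pj)  # 280.0 pJ, i.e. about 0.28 nJ per cycle
```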
- Branch predictor – designed to speed up processing and avoid stalls by predicting which branch of instructions will be executed.
- Decode – determines which instruction is being performed and breaks it up into operations for other parts of the CPU. The width refers to the number of instructions that can be handled at once.
- Dispatch – sends operations to the correct logic (execution) unit, such as the integer or floating-point unit.
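To make the branch predictor idea concrete, here is a minimal sketch of a classic textbook scheme, a table of 2-bit saturating counters. This is not ARM's algorithm, which the article does not describe; the class name, table size, and the loop pattern used below are all assumptions chosen for illustration.

```python
class TwoBitPredictor:
    """Textbook 2-bit saturating-counter predictor (illustrative only)."""

    def __init__(self, table_bits=4):
        self.mask = (1 << table_bits) - 1
        # Counter values 0-1 predict "not taken", 2-3 predict "taken".
        # Every entry starts at 1 (weakly not taken).
        self.counters = [1] * (1 << table_bits)

    def predict(self, pc):
        """Return True if the branch at address pc is predicted taken."""
        return self.counters[pc & self.mask] >= 2

    def update(self, pc, taken):
        """Nudge the counter towards the actual outcome, saturating at 0 and 3."""
        i = pc & self.mask
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def count_mispredictions(predictor, pc, outcomes):
    """Run a sequence of branch outcomes through the predictor and count misses."""
    misses = 0
    for taken in outcomes:
        if predictor.predict(pc) != taken:
            misses += 1
        predictor.update(pc, taken)
    return misses


# Hypothetical loop branch: taken 9 times, then falls through, repeated 3 times.
pattern = ([True] * 9 + [False]) * 3
misses = count_mispredictions(TwoBitPredictor(), pc=0x40, outcomes=pattern)
print(misses, "mispredictions out of", len(pattern))
```

On this made-up pattern the predictor misses 4 of 30 branches: once while warming up and once at each loop exit. Each miss costs a pipeline flush on real hardware, which is why improvements to prediction accuracy translate into both performance and energy savings.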
ARM is also increasingly focused on its POP IP, so you’ll see quite a few references to TSMC’s 16nm FinFET Plus manufacturing node in the examples. As well as substantial energy savings, ARM reckons that the A72 will be able to sustain 2.5GHz clocks on the new 16nm process, whilst keeping within the limited smartphone power budget. It’s the additional power efficiency and resulting lower heat profile that will really help the A72 achieve higher clock speeds than a 16nm A57.
We’re also a little wiser about the change to the naming convention. ARM is looking to differentiate its high-performing designs from their lower-energy counterparts. The A53 and A57 are quite different in their design and intended applications, so switching the more powerful cores over to the A7x naming scheme should help avoid any confusion in the future.
The key point to take away is that ARM has focused heavily on improving power and area efficiency with the A72, which is always welcome in mobile products. This also has the added benefit of allowing the chip to run cooler and to be clocked slightly higher than its predecessor. MediaTek and Qualcomm have already announced Cortex-A72 based mobile SoCs, which are expected to land on the market towards the end of 2015. We should also see Cortex-A72 powered high-end mobile products in early 2016.