Android is capable of running on three different types of processor architecture: Arm, Intel, and MIPS. The first is today’s ubiquitous architecture after Intel abandoned its handset CPUs, while MIPS processors for phones haven’t been seen for years. Arm is the CPU architecture used by all modern smartphones in both the Android and Apple ecosystems. Arm processors are also making their way into the PC market via Windows on Arm and Apple’s custom M1 CPU for Macs. With the Arm vs Intel CPU war about to heat up big time, here’s everything you need to know about Arm vs x86.
CPU architecture explained
The Central Processing Unit (CPU) is the “brains” of your device, but it’s not exactly smart. A CPU only works when given very specific instructions, collectively called the instruction set, which tell the processor to move data between registers and memory or to perform a calculation using a specific execution unit (such as multiplication or subtraction). Unique CPU hardware blocks require different instructions, so the instruction count tends to scale up with more complex and powerful CPUs. Desired instructions can also inform hardware design, as we’ll see in a moment.
Applications that run on your phone aren’t written in CPU instructions; that would be madness with today’s large cross-platform apps that run on a variety of chips. Instead, apps written in various higher-level programming languages (like Java or C++) are compiled for specific instruction sets so that they run correctly on Arm or x86 CPUs. These instructions are further decoded into micro-ops within the CPU, which requires silicon space and power. If you want the lowest power CPU, keeping the instruction set simple is paramount. However, higher performance can be obtained from more complex hardware and instructions at the expense of power. This is a fundamental difference between Arm’s and Intel’s approaches to CPU design.
x86 traditionally targets peak performance, Arm energy efficiency
Arm is RISC (Reduced Instruction Set Computing) based while Intel (x86) is CISC (Complex Instruction Set Computing). Arm’s CPU instructions are reasonably atomic, with a very close correlation between the number of instructions and micro-ops. CISC, by comparison, offers many more instructions, many of which execute multiple operations (such as optimized math and data movement). This can lead to better performance, but decoding these complex instructions consumes more power.
This link between instructions and processor hardware design is what makes a CPU architecture. This way, CPU architectures can be designed for different purposes, such as extreme number crunching, low energy consumption, or minimal silicon area. This is a key difference when looking at Arm vs x86 in terms of CPUs, as the former is built around a lower-power instruction set and hardware.
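To make the relationship between instructions and micro-ops more concrete, here is a toy model in Python. The instruction names and their micro-op breakdowns are illustrative assumptions only; they are not real Arm or x86 encodings, and real decoders are far more involved.

```python
# Toy model: how many micro-ops a decoder emits per instruction.
# The mnemonics and breakdowns below are hypothetical, chosen only to
# contrast a load/store (RISC-style) design with a memory-operand
# (CISC-style) design.

RISC_STYLE = {
    "load": ["load"],
    "add": ["add"],
    "store": ["store"],
}

CISC_STYLE = {
    # One instruction that reads memory, adds, and writes the result back.
    "add_mem": ["load", "add", "store"],
    "add": ["add"],
}


def decode(program, table):
    """Expand a list of instruction names into a flat list of micro-ops."""
    micro_ops = []
    for instruction in program:
        micro_ops.extend(table[instruction])
    return micro_ops


# The same task in both styles: increment a value held in memory.
risc_program = ["load", "add", "store"]
cisc_program = ["add_mem"]

risc_ops = decode(risc_program, RISC_STYLE)
cisc_ops = decode(cisc_program, CISC_STYLE)

print(len(risc_program), "instructions ->", len(risc_ops), "micro-ops")
print(len(cisc_program), "instruction  ->", len(cisc_ops), "micro-ops")
```

In this sketch both programs end up doing the same three pieces of work. The difference is where the breakdown happens: ahead of time in the compiler for the RISC-style program, or at run time in the decoder hardware for the CISC-style one.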
Modern 64-bit CPU architectures
Today, 64-bit architectures are mainstream across smartphones and PCs, but this wasn’t always the case. Phones didn’t make the switch until 2013, around a decade after PCs. In a nutshell, 64-bit computing uses registers and memory addresses that are 64 bits (1s and 0s) wide, so the CPU can handle 64-bit data types directly. As well as compatible hardware and instructions, you need a 64-bit operating system, such as Android.
Industry veterans may remember the hoopla when Apple introduced its first 64-bit processor ahead of its Android rivals. The move to 64-bit didn’t transform day-to-day computing. However, it matters for running math efficiently when using high-accuracy floating-point numbers. 64-bit registers also improve 3D rendering accuracy and encryption speed, and simplify addressing more than 4GB of RAM.
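The 4GB figure falls out of simple arithmetic on the address width. The short Python sketch below works it through, assuming one byte per address; the helper names are made up for this illustration.

```python
GIB = 2 ** 30  # bytes in one gibibyte


def addressable_bytes(pointer_bits):
    """Bytes reachable with addresses of the given width (one byte each)."""
    return 2 ** pointer_bits


def max_unsigned(register_bits):
    """Largest unsigned integer that fits in a register of the given width."""
    return 2 ** register_bits - 1


print(addressable_bytes(32) // GIB, "GiB with 32-bit addresses")
print(addressable_bytes(64) // GIB, "GiB with 64-bit addresses")
print("Largest 32-bit value:", max_unsigned(32))
print("Largest 64-bit value:", max_unsigned(64))
```

A 32-bit address tops out at 4 GiB, which is why going beyond that on 32-bit hardware needs workarounds. The 64-bit figure is a theoretical ceiling; real chips typically implement fewer address bits than the full 64.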
Today, both architectures support 64-bit, but it's more recent in mobile
PCs moved to 64-bit well before smartphones, but it wasn’t Intel that devised the modern x86-64 architecture (also known as x64). That accolade belongs to AMD, which announced it in 1999 as an extension of Intel’s existing x86 architecture. Intel’s alternative IA64 Itanium architecture fell by the wayside.
Arm introduced its ARMv8 64-bit architecture in 2011. Rather than extend its 32-bit instruction set, Arm offers a clean 64-bit implementation. To accomplish this, the ARMv8 architecture uses two execution states, AArch32 and AArch64. As the names imply, one is for running 32-bit code and one for 64-bit. The beauty of the Arm design is that the processor can seamlessly swap from one state to the other during its normal execution. This means that the decoder for the 64-bit instructions is a new design that doesn’t need to maintain compatibility with the 32-bit era, yet the processor as a whole remains backward compatible.
Arm’s Heterogeneous Compute won over mobile
The architectural differences discussed above partly explain the current successes and issues faced by the two chip behemoths. Arm’s low power approach is perfectly suited to the 3.5W Thermal Design Power (TDP) requirements of mobile, yet performance scales up to match Intel’s laptop chips too. Meanwhile, Intel’s typical 100W TDP Core i7 wins big in servers and high-performance desktops, but the company has historically struggled to scale down below 5W, as the dubious Atom lineup shows.
Of course, we mustn’t forget the role that silicon manufacturing processes have played in vastly improving power efficiency over the past decade either. Broadly speaking, smaller CPU transistors consume less power. Intel has been stuck trying to move past its 2014 in-house 14nm process. In that time, smartphone chipsets have moved from 20nm to 14, 10, and now 7nm designs, with 5nm expected in 2021. This has been achieved simply by leveraging competition between Samsung and TSMC foundries.
However, one unique feature of Arm’s architecture has been particularly instrumental in keeping TDP low for mobile applications: heterogeneous compute. The idea is simple enough. Build an architecture that allows different CPU parts (in terms of performance and power) to work together for improved efficiency.
Arm's ability to share workloads across high- and low-performance CPU cores is a boon for energy efficiency
Arm’s first stab at this idea was big.LITTLE back in 2011 with the big Cortex-A15 and little Cortex-A7 core. The idea of using bigger out-of-order CPU cores for demanding applications and power-efficient in-order CPU designs for background tasks is something smartphone users take for granted today, but it took a few attempts to iron out the formula. Arm built on this idea with DynamIQ and the ARMv8.2 architecture in 2017, allowing different CPUs to sit in the same cluster, sharing memory resources for far more efficient processing. DynamIQ also enables the 2+6 CPU design that’s increasingly common in mid-range chips.
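As a rough illustration of the big-and-little split, the Python sketch below routes tasks to one of two clusters based on an estimated load. The task names, load values, and the 0.5 threshold are all hypothetical; real operating system schedulers weigh many more factors, such as thermal headroom, and can migrate tasks between cores as demand changes.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    load: float  # hypothetical demand estimate: 0.0 (idle) to 1.0 (saturating)


def place(tasks, threshold=0.5):
    """Split tasks between 'big' and 'little' clusters by estimated load."""
    placement = {"big": [], "little": []}
    for task in tasks:
        cluster = "big" if task.load >= threshold else "little"
        placement[cluster].append(task.name)
    return placement


tasks = [
    Task("game_render", 0.9),
    Task("email_sync", 0.1),
    Task("music_playback", 0.2),
    Task("photo_export", 0.7),
]

placement = place(tasks)
print(placement)
```

Here the demanding render and export jobs land on the big cores, while background sync and playback stay on the little ones, which is the basic trade the article describes.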
Intel’s rival Atom chips, sans heterogeneous compute, couldn’t match Arm’s balance of performance and efficiency. It’s taken until 2020 for Intel’s Foveros, Embedded Multi-die Interconnect Bridge (EMIB), and Hybrid Technology projects to yield a competing chip design: the 10nm Lakefield. Lakefield combines a single, high-performance Sunny Cove core with four power-efficient Tremont cores, along with graphics and connectivity features. However, even this package is targeted at connected laptops with a 7W TDP, which is still too high for smartphones.
Today, Arm vs x86 is increasingly fought in the sub-10W TDP laptop market segment, where Intel is scaling down and Arm is scaling up with growing success. Apple’s news that it will switch to its own custom Arm chips for Mac is a prime example of the growing performance reach of the Arm architecture, thanks in part to heterogeneous computing along with custom optimizations made by Apple.
Custom Arm cores and instruction sets
Another important distinction between Arm and Intel is that the latter controls its whole process from start to finish and sells its chips directly. Arm simply sells licenses. Intel keeps its architecture, CPU design, and even manufacturing entirely in-house. Arm, by comparison, offers a variety of products to partners like Apple, Samsung, and Qualcomm. These range from off-the-shelf CPU core designs like the Cortex-A78, through designs built in partnership via its Arm CXC program, to custom architecture licenses that allow companies like Apple and Samsung to build custom CPU cores and even make adjustments to the instruction set.
Building custom CPUs is an expensive and involved process, but done correctly it can clearly lead to powerful results. Apple’s CPUs showcase how bespoke hardware and instructions push Arm’s performance much closer to mainstream x86 and even beyond, although Samsung’s Mongoose cores have been more contentious.
Apple intends to gradually replace Intel CPUs inside its Mac products with its own Arm-based silicon. The Apple M1 is the first chip in this effort, powering the latest MacBook Air, Pro, and the Mac Mini. The M1 boasts some impressive performance improvements, suggesting that high-performance Arm cores are capable of taking on x86 in more demanding compute scenarios. Remember though, Apple’s comparisons are for laptop-class CPUs, rather than desktops.
At the time of writing, the world's most powerful supercomputer, Fugaku, runs on Arm
Intel’s architecture remains out in front in terms of raw performance in the consumer hardware space. But Arm is now very competitive in product segments where high performance and energy efficiency remain key, which includes the server market. At the time of writing, the world’s most powerful supercomputer is running on Arm CPU cores for the first time ever. Its A64FX SoC is Fujitsu-designed and the first running the Armv8-A SVE architecture.
As we mentioned earlier, applications and software have to be compiled for the CPU architecture they run on. The historical marriage between CPUs and ecosystems (such as Android on Arm and Windows on x86) meant that compatibility was never really a concern, as apps didn’t need to run across multiple platforms and architectures. However, the growth in cross-platform apps and operating systems running on multiple CPU architectures is changing this landscape.
Apple’s Arm-based Macs, Google’s Chrome OS, and Microsoft’s Windows on Arm are all modern examples where software needs to run on both Arm and Intel architectures. Compiling native software for both is an option for new apps and developers willing to invest in recompilation. To fill in the gaps, these platforms also rely on code emulation, which means translating code compiled for one CPU architecture to run on another. This is less efficient and degrades performance compared to native apps, but emulation is now good enough to ensure that apps work.
After years of development, Windows on Arm emulation is in a pretty good state for most applications. Android apps run on Intel Chromebooks decently for the most part too. Apple has its own translation tool dubbed Rosetta 2 to support legacy Mac applications as well. But all three suffer performance penalties compared to natively compiled apps.
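The native-first, translate-as-fallback approach described above can be sketched in a few lines of Python. This is only an illustration: `isa_family`, `choose_build`, and the lookup table are hypothetical helpers, the table is not exhaustive, and the fallback assumes the platform actually ships a translation layer for the chosen build. Only `platform.machine()` is a real standard-library call.

```python
import platform

# Common spellings reported by platform.machine(), grouped by ISA family.
# This table is an illustrative assumption and is not exhaustive.
ISA_FAMILIES = {
    "x86_64": "x86-64",
    "amd64": "x86-64",
    "arm64": "arm64",
    "aarch64": "arm64",
}


def isa_family(machine):
    """Normalise a machine string to an ISA family, or 'unknown'."""
    return ISA_FAMILIES.get(machine.lower(), "unknown")


def choose_build(host_machine, available_builds):
    """Prefer a native build; otherwise fall back to a translated one."""
    host = isa_family(host_machine)
    if host in available_builds:
        return host, "native"
    if available_builds:
        # Assumes a translation layer exists for this build on the host.
        return sorted(available_builds)[0], "translated"
    return None, "unavailable"


print(choose_build("arm64", {"arm64", "x86-64"}))
print(choose_build("arm64", {"x86-64"}))
print(choose_build(platform.machine(), {"arm64", "x86-64"}))
```

An app shipped for both architectures always takes the native path, while an x86-only app on an Arm machine is flagged as translated, which is where the performance penalty mentioned above comes in.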
Arm vs x86: The final word
Over the past decade of the Arm vs x86 rivalry, Arm has won out as the choice for low power devices like smartphones. The architecture is now also making strides into laptops and other devices where enhanced power efficiency is in demand. Despite losing out on phones, Intel’s low power efforts have improved over the years too, with Lakefield now sharing much more in common with traditional Arm processors found in phones.
That said, Arm and x86 remain distinctly different from an engineering standpoint and they continue to have individual strengths and weaknesses. However, consumer use cases across the two are becoming blurred as ecosystems increasingly support both architectures. Yet, while there’s crossover in the Arm vs x86 comparison, it’s Arm that is certain to remain the architecture of choice for the smartphone industry for the foreseeable future. The architecture is showing major promise for laptop-class compute and efficiency too.