SoC chip processor macro shot

Modern smartphone SoCs are filled with lots of different processing components, but the two most commonly used are the Central Processing Unit (CPU) and Graphics Processing Unit (GPU). While the acronyms may be similar and both crunch through plenty of data, there are major differences between GPU and CPU technologies, leaving little room for overlap.

Before diving into the key GPU vs CPU differences, let’s start with some of the core shared concepts.


Individual CPU and GPU cores are built from a selection of sub-blocks, each of which handles certain tasks the processor will need to do. These blocks change in size and scope depending on the design’s micro-architecture. One common shared unit type is an Arithmetic Logic Unit (ALU) which crunches through mathematical operations like addition and multiplication. Other common feature units include memory access handlers (load/store), and instruction decoders and caches. However, the similarities end here. Let’s dive further into GPU vs CPU core concepts.

What is the CPU?

The best way to think about a CPU is like the brain of the machine. It’s highly flexible, keeps the show on the road, and is capable of handling a wide range of tasks. The CPU inside your phone is responsible for running all the logic and operations required by the Android operating system as well as your apps.

CPUs are often found in multi-core configurations, typically four to eight cores in mobile and 16 or more in desktop and server environments. Multi-core CPU designs allow multiple applications and task threads to run simultaneously, improving energy efficiency and performance. Each CPU core commonly runs at a clock speed between 2 and 3 GHz in mobile, and up to 5 GHz in desktops. CPUs can also be configured with various amounts of cache: high-speed memory close to the core that stores the instructions and data currently in use. Cache can be private to each CPU core or shared between cores, and it is essential for speeding up execution and switching between tasks.

CPUs handle a wide variety of task types and are built for the common functionality used by the OS and apps.
Arm Cortex-A77 CPU core overview
As an example, each Cortex-A77 CPU core features a NEON math engine, a floating point unit, and three caches, alongside standard ALUs and its branch predictor.

Inside most modern CPUs, you’ll find several ALUs designed for crunching numbers. This makes up the bulk of the transistor count. CPUs also handle and rearrange virtual memory for all the apps you’re running, making them essential tools for running an OS. CPUs also include branch predictors, which look ahead to predict the data and instructions that will be needed in the very near future. This saves time that would otherwise be spent fetching from slower RAM, and it is useful because CPU workloads often include loops and “if” statements that can quickly divert off into a new piece of code. You won’t find branch predictors inside many modern GPU designs because their workloads are much more deterministic.
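To make the branch prediction idea a little more concrete, here is a minimal Python sketch of a textbook two-bit saturating counter. This is an illustration only: it is not how the Cortex-A77 or any other shipping CPU implements prediction, and the function name and test patterns are made up for this example. It simply shows why a predictable loop is easy to guess while random branching is not.

```python
import random


def simulate_two_bit_predictor(outcomes):
    """Return the fraction of branches a two-bit counter predicts correctly.

    outcomes: sequence of booleans, True meaning the branch was taken.
    This is a classroom model, not a real CPU's predictor.
    """
    state = 2  # 0-1 predict "not taken", 2-3 predict "taken"
    correct = 0
    total = 0
    for taken in outcomes:
        predicted_taken = state >= 2
        if predicted_taken == taken:
            correct += 1
        # Nudge the counter toward what actually happened, saturating at 0 and 3.
        state = min(state + 1, 3) if taken else max(state - 1, 0)
        total += 1
    return correct / total if total else 0.0


# Hypothetical workload 1: a loop branch taken 9 times, then exiting, 100 times over.
loop_pattern = ([True] * 9 + [False]) * 100
loop_accuracy = simulate_two_bit_predictor(loop_pattern)

# Hypothetical workload 2: branches that behave like coin flips (fixed seed).
rng = random.Random(0)
random_pattern = [rng.random() < 0.5 for _ in range(1000)]
random_accuracy = simulate_two_bit_predictor(random_pattern)

print(f"loop branch accuracy:   {loop_accuracy:.2f}")
print(f"random branch accuracy: {random_accuracy:.2f}")
```

In this toy model the loop branch is mispredicted only on each loop exit, while the coin-flip branches hover around chance. Real predictors are far more elaborate, but the trade-off is the same: the more regular the control flow, the less the hardware has to guess.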

What is the GPU?

As we just mentioned, you typically won’t find a branch predictor inside a GPU because the nature of the workload is different. This is the key to understanding GPU vs CPU differences. While CPUs are designed to handle a bit of everything, GPUs are built with a very specific purpose in mind: parallel data crunching for 3D graphics processing. They’re designed to be much faster and more power-efficient at this task, but as a trade-off, they aren’t as flexible in their range of workloads.

GPU cores feature one or more ALUs, but they are designed quite differently to the basic CPU ALU. Instead of handling one or two numbers at a time, GPU ALUs crunch through 8, 16, or even 32 operations at once. Furthermore, GPU cores can consist of tens or hundreds of individual ALU cores, allowing them to crunch thousands of numbers simultaneously. This is hugely beneficial when you have millions of pixels to shade on a high-resolution display.
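A quick back-of-the-envelope calculation shows how those widths add up. The Python sketch below multiplies cores, ALUs per core, and operations per ALU to get a rough "operations per clock cycle" figure. Every number in it is a hypothetical chosen for easy arithmetic, not a spec for any real chip, and it assumes one operation per pixel while ignoring clock speed, memory bandwidth, and scheduling.

```python
import math


def ops_per_cycle(cores, alus_per_core, ops_per_alu):
    """Rough peak math operations per clock cycle (idealized, no stalls)."""
    return cores * alus_per_core * ops_per_alu


# Hypothetical GPU: 6 cores, 32 ALUs per core, 16 operations per ALU.
gpu_ops = ops_per_cycle(cores=6, alus_per_core=32, ops_per_alu=16)

# Hypothetical CPU: 8 cores, 4 ALUs per core, 1 operation per ALU.
cpu_ops = ops_per_cycle(cores=8, alus_per_core=4, ops_per_alu=1)

# A 2560 x 1440 display, assuming a single operation per pixel.
pixels = 2560 * 1440

gpu_cycles = math.ceil(pixels / gpu_ops)
cpu_cycles = math.ceil(pixels / cpu_ops)

print(f"GPU: {gpu_ops} ops/cycle, {gpu_cycles} cycles for {pixels} pixels")
print(f"CPU: {cpu_ops} ops/cycle, {cpu_cycles} cycles for {pixels} pixels")
```

With these made-up figures the GPU handles 3,072 operations per cycle against the CPU's 32, so one pass over the display takes 1,200 cycles instead of 115,200. Real shading needs many operations per pixel and real CPUs have vector units too, so treat this as a sense of scale rather than a benchmark.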

GPUs are specialized processors designed for the parallel number crunching required by 3D rendering.
PowerVR GX6650
For example, this PowerVR GPU contains six execution cores, each with numerous number crunching ALU cores inside. Note how these cores all share the same memory and scheduler units.

These parallel computations are often grouped together in what’s known as a warp. Here, a block of data and instructions passes through this wide number crunching path together, rather than lots of separate instructions taking place at once, which would be more CPU-like. In other words, GPU architectures are designed to push through lots of similar data at once, using single instructions to refer to large amounts of data. Meanwhile, CPU instructions will mostly only ever reference a couple of data points at a time.
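The warp idea can be sketched in a few lines of Python. This is only a model that counts how many times an instruction has to be issued. It runs sequentially on your CPU and does not touch a GPU. The warp width of 32, the brighten operation, and the function names are all assumptions made for illustration, and real GPUs vary in how wide their groups are.

```python
WARP_SIZE = 32  # assumed width for this sketch; real hardware varies


def brighten(value):
    """Example per-pixel operation: add 40 and clamp to the 8-bit maximum."""
    return min(value + 40, 255)


def run_scalar(pixels, op):
    """CPU-like model: one instruction issue per data element."""
    issued = 0
    out = []
    for p in pixels:
        out.append(op(p))
        issued += 1
    return out, issued


def run_warps(pixels, op, warp_size=WARP_SIZE):
    """GPU-like model: one instruction issue covers a whole warp of elements."""
    issued = 0
    out = []
    for start in range(0, len(pixels), warp_size):
        warp = pixels[start:start + warp_size]
        # In hardware every lane would do this in lockstep; here we just loop.
        out.extend(op(p) for p in warp)
        issued += 1
    return out, issued


pixels = [i % 256 for i in range(1024)]
scalar_out, scalar_issued = run_scalar(pixels, brighten)
warp_out, warp_issued = run_warps(pixels, brighten)

print(f"scalar issues: {scalar_issued}, warp issues: {warp_issued}")
print(f"same results: {scalar_out == warp_out}")
```

Both paths produce identical pixels, but the warp model issues 32 instructions where the scalar model issues 1,024. That saving only holds when every element wants the same operation, which is exactly the kind of workload graphics tends to provide.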

GPU clock speeds are typically lower than CPU speeds, often in the hundreds of MHz or low GHz. This is due to heat and power limitations, as massively parallel processing requires many more transistors than you’ll find in a CPU ALU. We should also note that this kind of mass number crunching can be used for more than just graphics rendering. Video rendering, machine learning algorithms like object detection, and cryptographic algorithms can also run much faster on a parallel GPU than on more limited CPU hardware.

The Huawei HiSilicon Kirin 980 chipset.

GPU vs CPU in a nutshell

As a final analogy, think of the CPU as the Swiss army knife and the GPU as a machete. The army knife is helpful for a ton of different tasks, from cutting a rope to opening a can of beans. You probably won’t want to attempt the latter with the machete. However, you’ll want the brute force machete when you need to power through thick jungle, not the tiny little army knife.

To cater to the wide range of available tasks, CPUs tend to have large instruction sets. Their cores are also more flexible, allowing multiple apps and tasks to swap in and out and run in parallel. Meanwhile, GPUs have much smaller instruction sets and can only focus on one task at once, but can execute many more math operations in a single clock cycle. It’s all about specialization.

The bottom line is that although both CPUs and GPUs are built from transistors and process data and numbers, each is optimized for a different purpose. Fortunately, SoCs get the best of both worlds by integrating these, and many more, processing units together.