Recently it was revealed that Google’s new Pixel 2 and Pixel 2 XL come packing a hidden and not yet activated Pixel Visual Core chip. According to Google, the secondary chip is designed to compile HDR+ images 5 times faster than an application processor — with 1/10th the power consumption. The Pixel Visual Core will also handle complex imaging and machine learning tasks related to the camera, which could come to include auto image adjustments based on the scene, among other uses.

Pixel Visual Core was enabled with the arrival of the Android 8.1 developer preview. This is a notable development for Google. The Pixel Visual Core is the company’s first piece of custom-designed silicon to make its way into a smartphone, giving the company tighter control over its phones’ capabilities than ever before.

Two SoCs in one phone

Machine learning and a heterogeneous approach to computing, in which dedicated hardware performs certain tasks more efficiently, aren’t new concepts in the smartphone space. SoC manufacturers like Qualcomm have been pushing processing in this direction for a couple of generations, and Qualcomm already includes dedicated image signal processor (ISP) and digital signal processor (DSP) components inside its Snapdragon 835, the chip that powers the new Pixel phones. Qualcomm is already targeting these components at power-efficient machine learning, image processing, and data crunching tasks. Clearly, Google wants to augment or surpass these capabilities.

Google opting for an additional, stand-alone Image Processing Unit (IPU) is an unusual choice. Ideally these components should be closely integrated with the CPU and GPU to avoid any latency issues transferring data in and out of the processor. However, Google can’t build any custom silicon into Qualcomm’s design, so the only option for custom hardware is to design a secondary stand-alone SoC that communicates with the main application processor, and that’s exactly what the Visual Core does.

A look inside the Pixel Visual Core

Before even looking at the processing capabilities of the new core, there are a few telltale signs of its standalone design. There’s on-board LPDDR4 RAM to quickly read and write data without having to go to main memory, along with a PCIe bus connection for talking to an external processor. A single Cortex-A53 CPU handles incoming and outgoing communications with the main application processor.

Magnified image of Pixel Visual Core

On the image processing side, the chip consists of eight IPU cores. Google states that each of these cores packs in 512 arithmetic logic units (ALUs), granting the ability to perform more than 3 trillion operations per second in a mobile power budget. For comparison, a Cortex-A73 CPU core inside a high-end mobile application processor only contains two basic integer units, along with load/store and FPUs.
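To put those figures in perspective, here is a quick back-of-the-envelope calculation in Python. It uses only the numbers quoted above. Google doesn't say what counts as an "operation" or what clock speed the chip runs at, so the per-ALU figure is a rough illustration that treats "more than 3 trillion" as a lower bound, not a published spec.

```python
# Figures quoted in the article.
IPU_CORES = 8
ALUS_PER_CORE = 512
OPS_PER_SECOND = 3e12  # "more than 3 trillion", treated here as a lower bound


def total_alus(cores: int = IPU_CORES, alus_per_core: int = ALUS_PER_CORE) -> int:
    """Total number of ALUs across all IPU cores."""
    return cores * alus_per_core


def ops_per_alu_per_second(ops: float = OPS_PER_SECOND) -> float:
    """Average operations each ALU would need to perform per second."""
    return ops / total_alus()


print(total_alus())              # 4096
print(ops_per_alu_per_second())  # 732421875.0
```

In other words, the headline number works out to roughly 0.73 billion operations per second for each of the 4,096 ALUs, assuming the work is spread evenly across them.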

Even with heavily optimized SIMD extensions you would be lucky to maximize all of those capabilities at once on a CPU. A dedicated mass math processor will simply be faster at specific operations. The Visual Core appears to have been specifically designed for performing mass math operations across the millions of pixels in a picture, so this type of setup can be well utilized for imaging tasks. A CPU has to deal with a wider range of possible operations, so a 512 ALU design wouldn’t be practical or useful for general applications.

Google also states that a key ingredient to the IPU’s efficiency is the tight coupling of hardware and software. Google’s software for the Pixel Visual Core can apparently control many more details of the hardware than in a typical processor, making it quite flexible and efficient. This flexibility comes at the cost of greater programming complexity. To assist developers, a custom Google-made compiler is used for optimization, and developers can make use of Halide for image processing and TensorFlow for machine learning.

In summary, Google’s Visual Core can crunch a lot more numbers and perform many more mathematical operations in parallel than your typical CPU. Camera imaging data arriving as 10-, 12-, or 14-bit tone data spread across the Pixel 2’s 12.2 megapixel camera resolution requires wide, parallel processing for color, noise reduction, sharpening, and other data processing, not to mention newer and more advanced HDR+ and other algorithms. This very wide ALU-heavy design is also well suited to machine learning and neural networking tasks, which also require the crunching of lots of small numbers.
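For a rough sense of how much data that is, the sketch below estimates the size of a single raw frame at each bit depth. It assumes one sample per pixel, tightly packed with no padding or compression, which is a simplification chosen for illustration and not a description of the Pixel 2's actual sensor format.

```python
PIXELS = 12_200_000  # the Pixel 2's 12.2 megapixel resolution


def raw_frame_megabytes(bits_per_pixel: int, pixels: int = PIXELS) -> float:
    """Size of one frame in megabytes (10**6 bytes).

    Assumes one tightly packed sample per pixel (hypothetical layout).
    """
    return pixels * bits_per_pixel / 8 / 1_000_000


for bits in (10, 12, 14):
    print(f"{bits}-bit frame: {raw_frame_megabytes(bits):.2f} MB")
```

Under these assumptions a single frame lands somewhere between about 15 MB and 21 MB, and a burst-based mode like HDR+ has to work through several such frames per shot.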

Google’s image processing capabilities

Although the Pixel Visual Core isn’t yet enabled inside the Google Pixel 2 or Pixel 2 XL, Google is already using intensive image processing algorithms for a number of photography features inside these phones. These algorithms should run faster and more efficiently once Google switches its custom SoC on, giving us a good starting point for the chip’s capabilities.

In a recent blog post, Google outlined its use of aligning and averaging multiple image frames to construct high dynamic range pictures from a short burst of images. This technique is used on all recent Nexus and Pixel phones that offer an HDR+ shooting mode. We already know that the Pixel Visual Core SoC will be used to speed up this capability and do it while consuming just 1/10th of the power currently used.
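The averaging half of that idea is simple enough to sketch in a few lines of Python. This is a minimal illustration, not Google's HDR+ pipeline: it assumes the frames are grayscale, the same size, and already aligned, and the function name `average_frames` is made up for this example.

```python
def average_frames(frames):
    """Per-pixel mean of equally sized, already aligned grayscale frames.

    Each frame is a list of rows, and each row is a list of pixel values.
    Alignment is assumed to have happened beforehand.
    """
    if not frames:
        raise ValueError("need at least one frame")
    height, width = len(frames[0]), len(frames[0][0])
    for frame in frames:
        if len(frame) != height or any(len(row) != width for row in frame):
            raise ValueError("all frames must have the same dimensions")
    count = len(frames)
    return [
        [sum(frame[y][x] for frame in frames) / count for x in range(width)]
        for y in range(height)
    ]


# Two tiny 2x2 "frames" of the same scene with slightly different readings.
burst = [
    [[1, 2], [3, 4]],
    [[3, 4], [5, 6]],
]
print(average_frames(burst))  # [[2.0, 3.0], [4.0, 5.0]]
```

Every output pixel depends only on the matching pixel in each frame, so the work can be split across as many arithmetic units as are available, which is the kind of job a wide, ALU-heavy chip is built for.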

Google is already using machine learning and neural network algorithms in its cameras too. When creating a depth of field effect from a single image sensor, a convolutional neural network, trained on almost a million pictures of faces and bodies, produces a mask of foreground and background content. This is combined with depth map data calculated from the Phase-Detect Auto-Focus (PDAF) dual-pixels located in the image sensor and a stereo algorithm to further detect areas of the background and how much blur to apply based on distance from the foreground. This step is actually the computationally intensive part. Once this has all been brought together and calculated, a disk-shaped bokeh blur is applied at each depth level to finalize the image.
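To make the final step more concrete, here is a small Python sketch of a disk-shaped blur applied to one grayscale image at a single radius. It is an illustration under stated assumptions and not Google's implementation: the names `disk_kernel` and `disk_blur` are invented for this example, edges are handled by clamping, and a real pipeline would pick a different radius for each depth level.

```python
def disk_kernel(radius: int):
    """Square grid of weights that are equal inside a disk and zero outside."""
    size = 2 * radius + 1
    mask = [
        [1.0 if (x - radius) ** 2 + (y - radius) ** 2 <= radius ** 2 else 0.0
         for x in range(size)]
        for y in range(size)
    ]
    total = sum(sum(row) for row in mask)
    # Normalize so the weights sum to 1 and overall brightness is preserved.
    return [[value / total for value in row] for row in mask]


def disk_blur(image, radius: int):
    """Blur a 2D grayscale image with a disk kernel, clamping at the edges."""
    kernel = disk_kernel(radius)
    height, width = len(image), len(image[0])
    output = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0.0
            for ky, row in enumerate(kernel):
                for kx, weight in enumerate(row):
                    # Clamp sample coordinates so border pixels reuse the edge.
                    sy = min(max(y + ky - radius, 0), height - 1)
                    sx = min(max(x + kx - radius, 0), width - 1)
                    acc += weight * image[sy][sx]
            output[y][x] = acc
    return output
```

A larger radius spreads each point of light over a bigger disk, which is why content further from the foreground would get a larger radius and look more out of focus.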

Wrap up

Google’s impressive photography results in its Pixel smartphones are a major selling point for the company. It’s apparent that the company has made significant investments not only in software algorithms for improving image quality, but also in hardware solutions. Not only will the Pixel Visual Core tucked inside the new Pixels improve the performance and power efficiency of Google’s existing photography algorithms, but it could also enable entirely new features in time.

With access to huge amounts of cloud data and content for neural network training, Google has been able to offer image enhancement software unmatched by other smartphone OEMs. The introduction of its own hardware suggests that Google may already be pressing up against the limits of the hardware that other companies can offer. A custom hardware solution better allows the company to tailor its products to its software capabilities. Whether or not Google will decide to expand its hardware development into other areas of smartphone processing in the future remains an interesting and potentially industry-shaking prospect.