ARM bringing Energy-Aware Scheduling to Android
Preemptive multitasking was first implemented in the late 1960s. In those days “spirits were brave, the stakes were high, men were real men, women were real women, and small furry creatures from Alpha Centauri were real small furry creatures from Alpha Centauri” (Douglas Adams, The Hitchhiker’s Guide to the Galaxy). Although many dared to do mighty deeds, it was a different era: there was just one CPU, multi-core processors didn’t exist, and the idea that the operating system needed to take into account the amount of energy available was but a twinkle in some clever engineer’s eye.
Zoom forward several decades and we live in an age where mobile phones have more computing power than NASA used to go to the moon, processors are multi-core, and the scheduler now needs to consider battery life.
In 2014 ARM released a new piece of tech for the Linux kernel called Intelligent Power Allocation (IPA). Keeping a SoC at its nominal operating temperatures is essential for fanless designs (like your smartphone or tablet). The busier a processor gets, the more heat it generates. In a hexa-core or octa-core processor you will find high performance “big” cores (like the Cortex-A72 or the Cortex-A57), and energy efficient “LITTLE” cores (like the Cortex-A53 or soon the Cortex-A35), plus there is the GPU. These three different parts of the SoC can be controlled independently and by controlling them in unison a better power allocation scheme can be created.
This means that if the CPU is generating too much heat then the scheduler might favor the LITTLE cores to reduce the power used. If the GPU is being used a lot, but the CPU isn’t, then there is room in the thermal budget to let the GPU run at full speed, since the CPU isn’t using all of its allocation.
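To make the idea of a shared thermal budget concrete, here is a minimal sketch in Python. It is not ARM's IPA code: the function name `allocate_power`, the milliwatt figures, and the simple proportional-scaling rule are all assumptions chosen for illustration. It shows only the principle that an actor can use headroom the others leave unused, and that everyone is scaled back when the combined demand exceeds the budget.

```python
def allocate_power(budget_mw, requests_mw):
    """Split a fixed power budget among the parts of a SoC (hypothetical model).

    budget_mw   -- total sustainable power for the SoC, in milliwatts
    requests_mw -- dict mapping each part (e.g. "cpu", "gpu") to the power
                   it would like to draw right now

    If the combined requests fit within the budget, every part gets what it
    asked for. Otherwise each request is scaled down by the same factor so
    that the grants add up to the budget.
    """
    total_requested = sum(requests_mw.values())
    if total_requested <= budget_mw:
        return dict(requests_mw)

    scale = budget_mw / total_requested
    return {name: request * scale for name, request in requests_mw.items()}


if __name__ == "__main__":
    # GPU-heavy workload: the CPU is mostly idle, so the GPU can run flat out.
    print(allocate_power(4000, {"cpu": 800, "gpu": 3000}))

    # Both CPU and GPU are busy: neither can have its full request.
    print(allocate_power(4000, {"cpu": 3000, "gpu": 3000}))
```

In the first call the requests total less than the budget, so the GPU is granted its full request. In the second, both grants are trimmed so that together they stay within the budget.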
ARM’s patches were duly accepted into the mainline Linux kernel and the benefits can be quite substantial. According to ARM’s tests, using IPA can increase the performance of a SoC by as much as 36%. Performance goes up because the SoC is tuned dynamically and every bit of the thermal budget is used. This means that the CPU or the GPU is able to run at maximum speed whenever the thermal budget allows.
That was 2014. Through 2015 and on into 2016 ARM has been working with the Linux community on the next phase, called Energy Aware Scheduling (EAS). The EAS project is trying to solve a long-standing design limitation of the two key Linux power-management subsystems (CPUFreq and CPUIdle), namely that they don’t coordinate their decisions with the scheduler.
In fact it is a little worse than that. The scheduler, CPUFreq and CPUIdle subsystems work in isolation, at different time scales and often at cross-purposes with each other. While the scheduler tries to balance the load across all CPUs, the CPUFreq and CPUIdle subsystems are trying to save power by scaling down the frequency of the CPUs or idling them.
The idea behind EAS is to integrate the CPUFreq (i.e. DVFS), CPUIdle and the IPA subsystems. The new design will be based on measurable energy model data rather than magic tunables; will support future CPU topologies (including per core/per cluster Dynamic Voltage & Frequency Scaling); and will be maintained in the main Linux kernel.
What this means in real terms is that if a new task is waiting for some CPU time then EAS might assign the task to a core that is already active rather than to an idle core. The other idle cores then don’t need to be spun up, which takes power. Previously, when the scheduler was working only for maximum performance, it might have opted to place the incoming task onto an idle CPU core, as it is free and unloaded, thus giving the best performance.
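The difference between the two placement policies can be sketched in a few lines of Python. This is a toy model, not the kernel's algorithm: the `Core` fields, the cost figures, and both `pick_core_*` functions are hypothetical names and numbers invented for the example. The energy-aware version picks the core with the smallest estimated extra energy, counting a wake-up cost for idle cores, while the performance-only version picks the core with the most spare capacity.

```python
from dataclasses import dataclass


@dataclass
class Core:
    name: str
    capacity: int       # maximum load the core can carry (arbitrary units)
    load: int           # load currently running on the core
    busy_cost: float    # hypothetical energy per unit of load
    wakeup_cost: float  # hypothetical extra energy to wake the core from idle


def energy_delta(core, task_load):
    """Estimate the extra energy of placing task_load on core.

    Returns None if the task does not fit within the core's capacity.
    """
    if core.load + task_load > core.capacity:
        return None
    delta = task_load * core.busy_cost
    if core.load == 0:
        # An idle core has to be spun up before it can run anything.
        delta += core.wakeup_cost
    return delta


def pick_core_energy_aware(cores, task_load):
    """Choose the core where the task costs the least additional energy."""
    best_core, best_delta = None, None
    for core in cores:
        delta = energy_delta(core, task_load)
        if delta is not None and (best_delta is None or delta < best_delta):
            best_core, best_delta = core, delta
    return best_core


def pick_core_performance(cores, task_load):
    """Baseline: choose the fitting core with the most spare capacity."""
    fitting = [c for c in cores if c.load + task_load <= c.capacity]
    if not fitting:
        return None
    return max(fitting, key=lambda c: c.capacity - c.load)


if __name__ == "__main__":
    cores = [
        Core("little0", capacity=100, load=40, busy_cost=1.0, wakeup_cost=30.0),
        Core("little1", capacity=100, load=0, busy_cost=1.0, wakeup_cost=30.0),
    ]
    print(pick_core_energy_aware(cores, 20).name)  # the already-active core
    print(pick_core_performance(cores, 20).name)   # the idle core
```

With these made-up numbers the energy-aware policy keeps the task on the core that is already running, because waking the idle core would cost more than the task itself, whereas the performance-only policy spreads the work onto the idle core.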
Initial testing has shown that energy usage can be much lower when compared to the current mainline Linux kernel while maintaining the same level of performance. In fact under some loads the performance can actually increase slightly.
This is ongoing work. The first generation of EAS is expected during 2016, but after that ARM and Linaro have plans for a second-generation implementation with even tighter integration.