How GameBench makes its uncheatable benchmark

December 9, 2013

Companies like Samsung, LG, HTC and many others spend millions of dollars on designing and manufacturing Android smartphones. These investments are made based on the projected sales of these devices. For all of the big names this cycle of design, build, and sell is a continuous activity. This development cycle isn’t limited to the smartphone makers; it is also true for the System-on-a-Chip makers, including Qualcomm, Samsung, NVIDIA, MediaTek and so on. All these companies have a vested interest in the performance of their products because performance (along with features and other quantifiables like power consumption) heavily influences sales.

For example, Samsung has a very popular range of Galaxy devices, probably the best known being the ‘S’ range. The Galaxy S3 was hugely popular, the S4 did well, though not quite as well as expected, and there are lots of rumors about the next iteration, the Galaxy S5. The problem is that companies like Samsung aren’t above a little bit of cheating to boost the sales of their devices.

This year Samsung has been accused of tweaking the Android firmware on its devices to detect benchmark apps like AnTuTu or Quadrant and ensure that the devices run at maximum performance (and with the worst battery life) while running these tests. This means that the benchmark results were artificially boosted.

Enter stage left GameBench, a new startup company that wants to change benchmarking with an uncheatable benchmark that measures “real world performance.” Since our initial post about GameBench, we have been in contact with the company to find out more about how they plan to make these results reflect real world performance.

The first big difference about GameBench is that it uses real-world games, not synthetic tests. Since 70% of app revenue comes from games and nearly one third of the time users spend on their devices is actually spent playing games, it makes sense to measure how well a device plays those games, not how well it can run an artificial benchmark. GameBench picks multiple game genres such as first-person shooters (FPS), racing games and running games. The list of games is a secret and changes with the market. The games are downloaded from within the GameBench app, which runs non-invasively in the background, and the devices are tested by several game testers (amateurs, intermediate and power gamers). Just to make sure that none of the big companies is influencing one of the testers, the testers are also changed on a regular basis.

During the benchmarking phase the devices are tested for performance and battery life. If a manufacturer somehow did manage to artificially boost the performance during a test, it would show up in the battery life, which would dent the overall score.

The devices are used as they come out-of-the-box. There is no rooting needed and the tests are even performed at a controlled temperature to ensure that the battery life testing is completely fair.

The results of the testing have two purposes. First, GameBench works with app developers, SoC makers and device manufacturers to help them increase overall performance, but without revealing the specifics of which games are used. The benchmarking collects lots of data which is useful for developers and can help them gain those extra few percentage points of performance or decrease battery consumption.


Second, the company publishes an overall score which is ranked on a curve, taking into account several factors and not just the raw frame rate. The first official score to be released by the company is a comparison of the Snapdragon 600 versions of the Galaxy S4 and the HTC One, along with the Lenovo K900. The S4 won with an overall score of 3696 while the HTC One scored 2840. That means that in the real world, taking into account performance and battery life, the S4 is 30 percent better at playing games than the HTC One!

Interestingly, the Lenovo K900, which uses a dual-core 2GHz Intel chip, scores just 264. Ouch!
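For readers who want to check the arithmetic, here is a minimal sketch in Python that derives the percentages from the three published scores. The helper name `percent_better` is our own invention for illustration and is not part of any GameBench tool. The sketch also assumes the overall score can be compared as a simple ratio, which may not hold exactly since GameBench says the score is ranked on a curve.

```python
def percent_better(score_a: float, score_b: float) -> float:
    """Return how much higher score_a is than score_b, as a percentage."""
    return (score_a / score_b - 1.0) * 100.0


# Overall scores as published by GameBench (higher is better).
scores = {"Galaxy S4": 3696, "HTC One": 2840, "Lenovo K900": 264}

# Relative lead of the S4 over the HTC One, in percent.
s4_vs_one = percent_better(scores["Galaxy S4"], scores["HTC One"])

# How many times higher the S4 scores than the K900.
s4_vs_k900_ratio = scores["Galaxy S4"] / scores["Lenovo K900"]

print(f"S4 vs HTC One: {s4_vs_one:.1f}% higher")
print(f"S4 vs K900: {s4_vs_k900_ratio:.0f}x the score")
```

Under that assumption, the S4 leads the HTC One by roughly 30 percent, and its score is about 14 times that of the K900.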
