It’s a Trap – More on Chip Benchmarks

Last week we wrote about some of the games marketing departments play when publishing benchmark statistics for their products. Then, Intel walked right into the trap that we had apparently set for them. The company released a set of benchmarks showing their latest generation i7 chip in comparison to Apple’s M1. Here is a good overview of those tests from Tom’s Hardware (which is the benchmark for benchmarks) and a good summary from Engadget. It should come as no surprise that Intel’s slides show them blowing away the Apple M1. So everyone must be wrong about Apple’s homegrown chip, right?

Not quite.

Let’s walk through some of the tricks Intel used.

Writing their own test – Intel performed a series of tests largely using their own set of benchmarking tools. On the one hand, it is entirely legitimate for a company to write its own benchmarks. These are important tools for internal use. They require significant resources to compile, and so there are often no comparable benchmarks available from third parties. On the other hand, no one outside Intel knows exactly what is inside those tools. So maybe, and we are just speculating here, it is possible that some aspect of those tools favors Intel.
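To make that concrete, here is a deliberately simplified sketch (entirely our own hypothetical illustration, nothing from Intel’s actual tools): whoever writes the benchmark picks the workloads and the weights in the final score, and those choices can quietly decide which chip wins.

```cpp
// Hypothetical benchmark harness (our illustration, not any vendor's tool).
// The point: the author chooses the workloads and the score weights, and
// those choices -- not the silicon alone -- determine the winner.
#include <chrono>
#include <cstdio>
#include <numeric>
#include <vector>

using Clock = std::chrono::steady_clock;

// Kernel A: compute-bound; favors chips with wide, fast SIMD units.
double time_compute(std::vector<float>& v) {
    auto t0 = Clock::now();
    for (int iter = 0; iter < 100; ++iter)
        for (auto& x : v) x = x * 1.000001f + 0.5f;
    return std::chrono::duration<double>(Clock::now() - t0).count();
}

// Kernel B: memory-bound; favors chips with better caches and memory.
double time_memory(const std::vector<float>& v) {
    auto t0 = Clock::now();
    volatile float sink = 0.f;  // keeps the compiler from deleting the loop
    for (int iter = 0; iter < 100; ++iter)
        sink = sink + std::accumulate(v.begin(), v.end(), 0.f);
    return std::chrono::duration<double>(Clock::now() - t0).count();
}

int main() {
    std::vector<float> data(1 << 22, 1.0f);  // ~16 MB working set
    double a = time_compute(data);
    double b = time_memory(data);
    // These weights are arbitrary -- shift them and the ranking can flip.
    printf("compute %.3fs, memory %.3fs, 'score' %.1f\n",
           a, b, 100.0 / (0.9 * a + 0.1 * b));
    return 0;
}
```

Every published suite makes choices like these; the difference with third-party suites is that at least the choices are public.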

Thumb on the scale (aka the VW Approach) – Sometimes companies design features into their products to optimize their performance on benchmarks. Automaker VW famously did this with emissions testing for their diesel engines. People went to jail for that, but those benchmarks were set by regulators. Nothing so sinister is happening here. Instead, Intel was testing software applications that make use of special features in their silicon. This would be meaningful if use of those features were widespread, but in this case it is not. So actual results for actual users are likely to be much less impressive. In all fairness, Intel is not the only chip company that does this. Without naming names, we know of one company that writes special code into their chips so the chip senses when it is being benchmarked for wireless communications, and they then made heavy use of those test results in comparisons against a comparable Intel product.
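To illustrate the more benign version of this (a hedged sketch of a common pattern, not any particular vendor’s code), a lot of software ships with an instruction-set-specific fast path that only some chips can take. Benchmark that software and you are largely measuring whether the fast path exists, not how good the chip is overall:

```cpp
// Sketch of ISA-specific dispatch (our illustration). On a chip with the
// special feature, the benchmark exercises the fast path; on any other
// chip, it falls back to generic code and looks far slower.
#include <cstdio>
#if defined(__AVX2__) && defined(__FMA__)
#include <immintrin.h>
#endif

// Portable scalar path: every CPU can run this.
float dot_scalar(const float* a, const float* b, int n) {
    float s = 0.f;
    for (int i = 0; i < n; ++i) s += a[i] * b[i];
    return s;
}

#if defined(__AVX2__) && defined(__FMA__)
// Optimized path: only compiled when the build targets AVX2 + FMA.
float dot_avx2(const float* a, const float* b, int n) {
    __m256 acc = _mm256_setzero_ps();
    int i = 0;
    for (; i + 8 <= n; i += 8)
        acc = _mm256_fmadd_ps(_mm256_loadu_ps(a + i),
                              _mm256_loadu_ps(b + i), acc);
    float tmp[8];
    _mm256_storeu_ps(tmp, acc);
    float s = tmp[0] + tmp[1] + tmp[2] + tmp[3] +
              tmp[4] + tmp[5] + tmp[6] + tmp[7];
    for (; i < n; ++i) s += a[i] * b[i];  // leftover elements
    return s;
}
#endif

float dot(const float* a, const float* b, int n) {
#if defined(__AVX2__) && defined(__FMA__)
    return dot_avx2(a, b, n);   // fast path: only some chips get here
#else
    return dot_scalar(a, b, n); // everyone else runs the generic code
#endif
}

int main() {
    float a[16], b[16];
    for (int i = 0; i < 16; ++i) { a[i] = 1.f; b[i] = 2.f; }
    printf("dot = %.1f\n", dot(a, b, 16));  // prints 32.0 on either path
    return 0;
}
```

A benchmark built around applications like this tells you a lot about the software’s optimization targets and rather less about the chips themselves.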

Apples to Oranges – Intel compared the performance of games, and here the results were mixed, with Apple ahead on some games and behind on others. So Intel then added a section (on the same graphic) showing performance for games that are not available on Mac, with Apple set to ‘0’. This is a fair point: there are a lot of games that never make their way to Mac, and this has been a problem for Apple for years. (Problem in the sense that they do not really care.) That being said, this issue has nothing to do with performance and merely conflates two separate issues. We would add a word of caution to whoever in the Intel marketing department came up with this. We think it is 100% likely that at least one of the titles on this list will come out with an Apple version and, when benchmarked, will show the Apple M1 performing well ahead of Intel, and there is a decent likelihood that title will appear in a future Apple keynote.

Bait and Switch – When Intel tested battery life, they swapped out hardware. They actually used an older MacBook Air, rather than the new MacBook Pro with the M1 chip. Almost every review we have seen for the M1 (including Tom’s Hardware) shows MUCH better performance from the M1. Apparently, Intel switched hardware a few times throughout these tests, and this is a really bad sign. Given all the other things they did in these benchmarks, we are surprised Intel had to actually play these sorts of games. For us, this is a clear sign that the battery life, and likely many other forms of performance, are much better on the M1.

Normally, we would not critique a company to quite this level. But the fact that we published a post a week ago with the words “Lies” and “Benchmark” in the title, and then Intel went and demonstrated that point so soon after, left us with the sense that we could not pass up the timing. And in all fairness, all companies play some version of these games, highlighting their strengths and de-emphasizing their weaknesses. We think it is useful to call out some of the common practices of benchmark marketing. Moreover, Intel has earned a reputation over decades for the FUD (Fear, Uncertainty and Doubt) it has spread. There are no innocents in this story.

For its part, Apple also does a version of this, although they have a reputation for being much more straightforward with their data. Apple has been dealing with these performance games for essentially its entire existence, which someone should take as a lesson in the futility of benchmarking as a marketing tool. And for what it’s worth, bear in mind that Intel is comparing its “11th generation product” to Apple’s first generation product. Given our view that Apple is the best run semiconductor company, it is reasonable to assume their chips’ performance is only going to get better. Perhaps the biggest thing missing in these Intel slides is the fact that Apple can optimize so much of its software that it builds a massive advantage for things like its Safari browser. Intel cannot benchmark against those, and probably would not want to publish the results if it could.

2 responses to “It’s a Trap – More on Chip Benchmarks”

  1. I’m not sure why everyone is so quick to rush to Apple’s defense when they did the exact same thing in their press releases and videos. It was actually so incorrect that Apple needed to change their claims about having “the fastest CPU core around”. Their performance-per-watt comparison is a joke to anyone who uses their computer for something besides web browsing, and quite frankly the benchmarks coming out of the M1 are pretty disappointing for the world’s first 5nm chip. Intel is still able to out-engineer Apple, despite the fact that they’re working on silicon nearly 6 years old. Oh, and the M1 can’t virtualize x86? It’s not even a fair comparison at this point.

  2. Pingback: Heterogeneous Compute | Digits to Dollars
