Opinion: Performance benchmarks are worthless, here’s how to make them better

AMD is getting ready to launch its next-generation architecture (code-named Trinity) and invited a bunch of us out to Austin to see it. I can’t talk about the technology until it is launched, but one of the events at the show was a head-to-head comparison between this AMD technology and Intel’s top-shelf products. In each test (including productivity, video enhancement, and file compression) AMD Trinity technology wasn’t just faster, it was substantially faster.

Though the demonstration was impressive, it also reminded me why benchmarks really aren’t that useful anymore. Not only do they fail to reflect what each of us individually does, they don’t factor in cost, device size, or design, each of which might be more important than any direct performance measure.

For instance, Apple hasn’t led benchmarks in years. Side-by-side with competitors, the iPad and iPhone actually tend to appear relatively slow (they often use older networking, processor, or storage technology). They are also relatively expensive, yet lots of folks still prefer them, suggesting benchmarks as they currently exist are worthless to these buyers. They rank other things higher.

So what would a perfect benchmark look like?

How do you work?

The perfect benchmark would be derived from an ongoing analysis of how you use your hardware. We all change as we age, and even change what we do from day to night, from weekdays to weekends, and on vacation, so the capture should occur over a period of time.

It should also look for critical points, like what annoys us and what thrills us — not only in terms of what we are doing, but what we are talking about. In short, factor in our social-networking activity in things like Facebook and Pinterest.

Finally, it would rank all aspects of our interest and factor in cost: not only the cost of buying the product, but the cost in time of putting the product into service and maintaining it, and our sensitivity to downtime.

Analyzing the device

Since it has proven impractical to go into a store and run a benchmark on a shelved PC, and impossible to do the same thing if we want to buy online, the ideal benchmark would also need to capture the performance of systems on the market. Alongside this objective data, it would also capture subjective data on design, expected reliability, and time to obsolescence. While the latter two could come from historical data (much like Consumer Reports does with its rankings), the design analysis would be based on how someone similar to you in personality type and taste would rank the product.

Finally, given that we live in an online “cloud” world, a major portion of the data captured would need to be on the services the device connected to, the apps it would load, and the overall end-to-end user experience.

In the end, everything would be mathematically rendered.

The result

The result would be accessible on a site where you could go, log in, and specify either the type of product you were looking for, or enter a number of products you were looking at. The system would then give you a set of choices listing the key analytical elements of each. So if you saw something that wasn’t current, or you didn’t agree with, you could change the element and thus change the ranking.

You could see an overall ranking of around 10 products with some specific ones flagged: the lowest priced, the best match to you, and the most balanced (best value for the money as defined by your unique needs and tastes). This is also somewhat similar to what Consumer Reports tries to do, but more advanced.

You would end up with a list of top choices that would be more likely to thrill you. It could also analyze products you already own to flag when performance degraded to a point that would begin to irritate you, or when the extra performance of a new system was great enough to make it worth it for you – specifically based on your needs.
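To make the idea concrete, here is a minimal sketch of how such a ranking might work. Everything in it is an assumption for illustration: the `Product` fields, the element names, the sample scores and prices, and the choice of a weighted average for "best match" and match-per-dollar for "most balanced" are mine, not a description of any existing tool.

```python
from dataclasses import dataclass


@dataclass
class Product:
    name: str
    price: float              # purchase price in dollars (hypothetical)
    scores: dict[str, float]  # per-element scores on a 0-to-1 scale (hypothetical)


def match_score(product, weights):
    """Weighted average of the product's element scores under the buyer's weights."""
    total = sum(weights.values())
    return sum(w * product.scores.get(k, 0.0) for k, w in weights.items()) / total


def rank(products, weights, top_n=10):
    """Return the top products by match score, plus the three flagged picks."""
    ranked = sorted(products, key=lambda p: match_score(p, weights), reverse=True)[:top_n]
    return {
        "ranking": [p.name for p in ranked],
        "lowest_priced": min(ranked, key=lambda p: p.price).name,
        "best_match": ranked[0].name,
        # "Most balanced" is assumed here to mean the highest match score per dollar.
        "most_balanced": max(ranked, key=lambda p: match_score(p, weights) / p.price).name,
    }


# Made-up sample data, purely for illustration.
products = [
    Product("Laptop A", 1200.0, {"performance": 0.9, "design": 0.6, "reliability": 0.7}),
    Product("Laptop B", 800.0, {"performance": 0.6, "design": 0.9, "reliability": 0.8}),
    Product("Laptop C", 600.0, {"performance": 0.4, "design": 0.5, "reliability": 0.6}),
]

# A buyer who cares most about performance.
weights = {"performance": 0.5, "design": 0.3, "reliability": 0.2}
result = rank(products, weights)

# The same buyer after changing an element they disagreed with: design now matters most.
design_first = rank(products, {"performance": 0.2, "design": 0.6, "reliability": 0.2})
```

With the first set of weights, Laptop A comes out as the best match; shifting the weight toward design moves Laptop B to the top. That is the "change the element and thus change the ranking" behavior in miniature.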

Benchmarks don’t have to suck

When I first ran into benchmarks, Intel was complaining that it built systems that were better rounded, while AMD was using benchmarks to drive people to systems they would like less. Intel tried to get the industry to drop the benchmarks, failed, and now largely optimizes for benchmarks.

If you focus on what people want to do, you’ll provide a better experience, but still likely get slammed by benchmarks. At AMD’s event, the company was pointing to the reasons benchmarks suck.

I think the answer here is to create benchmarks that don’t suck. We have online tools that capture a ton of information about us to sell to advertisers, so it doesn’t seem to be such a stretch to use some of this technology to create a tool that makes us happier consumers. Considering all this information is compiled about us and should belong to us, it would be really nice if it were used to make us happier, rather than just milk us for money. This would be a way to do that. What do you think?

[Image credit: kk-artworks/Shutterstock]

Rob Enderle
Former Contributor
Rob is President and Principal Analyst of the Enderle Group, a forward-looking emerging technology advisory firm. Before…