Opinion: Performance benchmarks are worthless, here’s how to make them better

AMD is getting ready to launch its next-generation architecture (code-named Trinity) and invited a bunch of us out to Austin to see it. I can’t talk about the technology until it is launched, but one of the events at the show was a head-to-head comparison between this AMD technology and Intel’s top-shelf products. In each test (including productivity, video enhancement, and file compression), AMD’s Trinity technology wasn’t just faster, it was substantially faster.

Though the demonstration was impressive, it also reminded me why benchmarks really aren’t that useful anymore. Not only do they fail to reflect what each of us individually does, they also don’t factor in cost, device size, or design, any of which might be more important than a direct performance measure.

For instance, Apple hasn’t led benchmarks in years. Side-by-side with competitors, the iPad and iPhone actually tend to appear relatively slow (they often use older networking, processor, or storage technology). They are also relatively expensive, yet lots of folks still prefer them, suggesting benchmarks as they currently exist are worthless to these buyers. They rank other things higher.

So what would a perfect benchmark look like?

How do you work?

The perfect benchmark would be derived from an ongoing analysis of how you use your hardware. We all change as we age, and what we do even changes from day to night, from weekdays to weekends, and on vacation, so the capture should occur over a period of time.

It should also look for critical points, like what annoys us and what thrills us, not only in terms of what we are doing, but what we are talking about. In short, it should factor in our social-networking activity on services like Facebook and Pinterest.

Finally, it would rank all aspects of our interest and factor in cost: not only the cost of buying the product, but the cost in time of putting the product into service and maintaining it, and our sensitivity to downtime.

Analyzing the device

Since it has proven impractical to go into a store and run a benchmark on a shelved PC, and impossible to do the same thing if we want to buy online, the ideal benchmark would also need to capture the performance of systems on the market. Alongside this objective data, it would also capture subjective data on design, expected reliability, and time to obsolescence. While the latter two could come from historic data (much like Consumer Reports does with its rankings), the design analysis would be based on how someone similar to you in personality type and taste would rank the product.

Finally, given that we live in an online “cloud” world, a major portion of the data captured would need to be on the services the device connected to, the apps it would load, and the overall end-to-end user experience.

In the end, everything would be mathematically rendered.

The result

The result would be accessible on a site where you could log in and either specify the type of product you were looking for or enter the specific products you were considering. The system would then give you a set of choices, listing the key analytical elements of each. If you saw something that wasn’t current, or that you didn’t agree with, you could change that element and thus change the ranking.

You could see an overall ranking of around 10 products with some specific ones flagged: the lowest priced, the best match to you, and the most balanced (best value for the money as defined by your unique needs and tastes). This is also somewhat similar to what Consumer Reports tries to do, but more advanced.
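
To make the idea a little more concrete, here is a rough sketch of how such a ranking might be computed. Everything in it is my own assumption for illustration: the `Product` class, the `match_score` and `rank` functions, the aspect names, the weights, and the made-up products and prices. In particular, treating "most balanced" as match score per dollar is just one simple way to define value, not an established method.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Product:
    name: str
    price: float              # purchase price in dollars (illustrative)
    scores: Dict[str, float]  # per-aspect scores from 0.0 to 1.0 (illustrative)


def match_score(product: Product, weights: Dict[str, float]) -> float:
    """Weighted average of the product's aspect scores, using the buyer's weights."""
    total_weight = sum(weights.values())
    weighted = sum(w * product.scores.get(aspect, 0.0) for aspect, w in weights.items())
    return weighted / total_weight


def rank(products: List[Product], weights: Dict[str, float],
         top_n: int = 10) -> Tuple[List[Product], Dict[str, str]]:
    """Return the top products by match score, plus the three flagged picks."""
    ranked = sorted(products, key=lambda p: match_score(p, weights), reverse=True)[:top_n]
    flags = {
        "lowest_priced": min(ranked, key=lambda p: p.price).name,
        "best_match": ranked[0].name,
        # Assumption: "most balanced" means the best match score per dollar.
        "most_balanced": max(ranked, key=lambda p: match_score(p, weights) / p.price).name,
    }
    return ranked, flags


# A hypothetical buyer who cares about design three times as much as raw speed.
weights = {"performance": 1.0, "design": 3.0, "reliability": 2.0}

products = [
    Product("Speedster", 1200.0, {"performance": 0.9, "design": 0.4, "reliability": 0.6}),
    Product("Looker", 1500.0, {"performance": 0.5, "design": 0.9, "reliability": 0.8}),
    Product("Budget", 600.0, {"performance": 0.4, "design": 0.5, "reliability": 0.6}),
]

ranked, flags = rank(products, weights)
for p in ranked:
    print(f"{p.name}: {match_score(p, weights):.2f}")
print(flags)
```

For this buyer, the fastest machine in the sketch does not come out on top, which is the point: change the weights and the ranking changes with them.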

You would end up with a list of top choices that would be more likely to thrill you. It could also analyze products you already own to flag when performance degraded to a point that would begin to irritate you, or when the extra performance of a new system was great enough to make it worth it for you – specifically based on your needs.

Benchmarks don’t have to suck

When I first ran into benchmarks, Intel was complaining that it built systems that were better rounded, while AMD was using benchmarks to drive people to systems they would like less. Intel tried to get the industry to drop the benchmarks, failed, and now largely optimizes for benchmarks.

If you focus on what people want to do, you’ll provide a better experience, but still likely get slammed by benchmarks. At AMD’s event, the company was pointing to the reasons benchmarks suck.

I think the answer here is to create benchmarks that don’t suck. We have online tools that capture a ton of information about us to sell to advertisers, so it doesn’t seem like such a stretch to use some of this technology to create a tool that makes us happier consumers. Considering all this information is compiled about us and should belong to us, it would be really nice if it were used to make us happier, rather than just to milk us for money. This would be a way to do that. What do you think?

[Image credit: kk-artworks/Shutterstock]

Rob Enderle
Former Digital Trends Contributor
Rob is President and Principal Analyst of the Enderle Group, a forward-looking emerging technology advisory firm. Before…