AMD is getting ready to launch its next-generation architecture (code-named Trinity) and invited a bunch of us out to Austin to see it. I can’t talk about the technology until it is launched, but one of the events at the show was a head-to-head comparison between this AMD technology and Intel’s top-shelf products. In each test (including productivity, video enhancement, and file compression) AMD Trinity technology wasn’t just faster, it was substantially faster.
Though the demonstration was impressive, it also reminded me why benchmarks really aren’t that useful anymore. Not only do they fail to reflect what each of us individually does, they don’t factor in cost, device size, or design, each of which might be more important than any direct performance measure.
For instance, Apple hasn’t led benchmarks in years. Side-by-side with competitors, the iPad and iPhone actually tend to appear relatively slow (they often use older networking, processor, or storage technology). They are also relatively expensive, yet lots of folks still prefer them, suggesting benchmarks as they currently exist are worthless to these buyers. They rank other things higher.
So what would a perfect benchmark look like?
How do you work?
The perfect benchmark would be derived from an ongoing analysis of how you use your hardware. We all change as we age, and even change what we do from day to night, from weekdays to weekends, and on vacation, so the capture should occur over a period of time.
It should also look for critical points, like what annoys us and what thrills us, not only in terms of what we are doing but also what we are talking about. In short, factor in our social-networking activity on sites like Facebook and Pinterest.
Finally, it would rank all aspects of our interest and factor in cost: not only the cost of buying the product, but the cost in time of putting the product into service and maintaining it, and our sensitivity to downtime.
Analyzing the device
Since it has proven impractical to go into a store and run a benchmark on a shelved PC, and impossible to do the same thing if we want to buy online, the ideal benchmark would also need to capture the performance of systems on the market. Alongside this objective data, it would also capture subjective data on design, expected reliability, and time to obsolescence. While the latter two could come from historic data (much like Consumer Reports does with its rankings), the design analysis would be based on how someone similar to you in personality type and taste would rank the product.
Finally, given that we live in an online “cloud” world, a major portion of the data captured would need to be on the services the device connected to, the apps it would load, and the overall end-to-end user experience.
In the end, everything would be mathematically rendered.
The result
The result would be accessible on a site where you could go, log in, and specify either the type of product you were looking for, or enter a number of products you were looking at. The system would then give you a set of choices listing the key analytical elements of each. So if you saw something that wasn’t current, or you didn’t agree with, you could change the element and thus change the ranking.
You could see an overall ranking of around 10 products with some specific ones flagged: the lowest priced, the best match to you, and the most balanced (best value for the money as defined by your unique needs and tastes). This is also somewhat similar to what Consumer Reports tries to do, but more advanced.
You would end up with a list of top choices that would be more likely to thrill you. It could also analyze products you already own to flag when performance degraded to a point that would begin to irritate you, or when the extra performance of a new system was great enough to make it worth it for you – specifically based on your needs.
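As a rough illustration of the ranking described above, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `Product` class, the criterion names, the 0-to-1 scores, the weights, and the products themselves are made up, and "best value" is simply taken to mean match score divided by price. A real tool would derive the scores and weights from captured usage data rather than hard-code them.

```python
from dataclasses import dataclass


@dataclass
class Product:
    name: str
    price: float
    # Hypothetical 0-to-1 scores per criterion (higher is better).
    scores: dict


def match_score(product, weights):
    """Weighted average of a product's scores using the buyer's weights.

    Assumes at least one weight is positive.
    """
    total = sum(weights.values())
    return sum(w * product.scores.get(k, 0.0) for k, w in weights.items()) / total


def rank(products, weights):
    """Return product names sorted by match, plus the three flagged picks."""
    ranked = sorted(products, key=lambda p: match_score(p, weights), reverse=True)
    flags = {
        "lowest_priced": min(products, key=lambda p: p.price).name,
        "best_match": ranked[0].name,
        # Assumption: "most balanced" means the most match per dollar.
        "best_value": max(
            products, key=lambda p: match_score(p, weights) / p.price
        ).name,
    }
    return [p.name for p in ranked], flags


# Made-up products and weights, purely for illustration.
products = [
    Product("Laptop A", 1200.0, {"speed": 0.9, "design": 0.8, "reliability": 0.7}),
    Product("Laptop B", 600.0, {"speed": 0.6, "design": 0.5, "reliability": 0.8}),
    Product("Laptop C", 650.0, {"speed": 0.7, "design": 0.9, "reliability": 0.6}),
]
weights = {"speed": 3.0, "design": 1.0, "reliability": 1.0}

order, flags = rank(products, weights)
print(order)
print(flags)
```

With these invented numbers, a buyer who weights speed heavily gets Laptop A as the best match, Laptop C as the best value, and Laptop B as the lowest priced. Editing an entry in `weights` or `scores` and calling `rank` again is the code equivalent of changing an element you disagree with and watching the ranking change.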
Benchmarks don’t have to suck
When I first ran into benchmarks, Intel was complaining that it built systems that were better rounded, while AMD was using benchmarks to drive people to systems they would like less. Intel tried to get the industry to drop the benchmarks, failed, and now largely optimizes for benchmarks.
If you focus on what people want to do, you’ll provide a better experience, but still likely get slammed by benchmarks. At AMD’s event, the company was pointing to the reasons benchmarks suck.
I think the answer here is to create benchmarks that don’t suck. We have online tools that capture a ton of information about us to sell to advertisers, so it doesn’t seem to be such a stretch to use some of this technology to create a tool that makes us happier consumers. Considering all this information is compiled about us and should belong to us, it would be really nice if it were used to make us happier, rather than just milk us for money. This would be a way to do that. What do you think?
[Image credit: kk-artworks/Shutterstock]
For someone who doesn’t know anything about technology, your idea has weight. But for those of us who understand the numbers and build our own systems, each component we select is chosen on its performance benchmarks, because we know how all of them interact and how they will be used in each particular build. They are how we judge what will be most effective for each individual machine. Completely replacing them therefore is out of the question.
I’ll give you that for the top 3% of users who still build their own systems like we do (yes, I build mine as well), they can be useful. However, even we often don’t measure what we really need; we only make assumptions and apply benchmarks to them. In fact, most of us techies tend to favor usage-based benchmarks, like those tied directly to the games and applications we use, over synthetic benchmarks, because we’ve learned that synthetic benchmarks often don’t relate closely to how an application will work or a top game will play. But for the other 97%, benchmarks don’t relate to what they actually do.
You’re missing the point of a benchmark. The ideas you presented are analyses. “For doing x, y is best.” That’s an analysis. “In running eleventy billion iterations of x, y is the fastest.” That’s a benchmark.
Benchmarking is a form of analysis. They aren’t mutually exclusive, you know. How do you think analysts compare products?
Simple math. Which vendor paid the analyst more?
Hard to cheat on a benchmark for an analyst; the losing company tends to call foul pretty quickly, but it does happen. The bigger issue is who had the most input into setting the benchmark in the first place, and which vendor optimized for the benchmark. Either can result in a score that is higher than it should be. But in the end, you need to measure system performance, as a hot processor in an otherwise slow system will bottleneck at memory, magnetic storage, or network speed. Vendors do this when buyers get overly focused on one part.
Benchmarks provide performance metrics – and nothing more. Some benchmarks are more realistic than others and all benchmarks are applicable to only some situations. It’s the job of the reviewer to provide an interpretation of those benchmarks.
What you’re doing is asking benchmarks to write the review by themselves and, on top of that, provide long-term data about reliability and value. Which is a bit like asking your mechanic to give you a haircut while you wait for an oil change.
Actually no, I’m asking that we use current technology to create benchmarks that are customized to how we work. We actually do that for race cars and drivers; I think we can scale that with social tech and the web. Agree it won’t be easy though.
So are you saying that performance benchmarks should have no role and be completely replaced by these new ones that you suggest, or can both be used together to make an informed decision?
Replaced. Performance benchmarks are generic and we humans tend not to be.
Good points. I think another major thing to take into account is the overall ecosystem that a device works in. This is starting to be a lot more relevant with Apple having a full line of devices that all work together. I think Acer is one of the only Android-based companies that even comes close to Apple’s ecosystem.
Honestly though, I care much more about how all of my devices will work together than how powerful my processor is.
I agree. For most it is increasingly about the ecosystem; what the hardware does just isn’t as important as it once was.
Kid: My teacher says beauty is on the inside
Father: That’s just what ugly people say.
AMD… is the ugly people.
Your teacher isn’t a very nice person. But, to the point, this is more about discussing beauty contests that are fixed. The contestants are all pretty; the one who wins is sleeping with the guy who owns the contest.