It’s no secret that the advertising industry is driven by numbers and statistics: page impressions, viewership, audience size, readership estimates, circulation, ROI, CPM, and a passel of other acronyms and metrics. While we in the technology industry are used to tossing off statistics about how many people use particular technologies, play particular games, or own particular devices, the fact is that the numbers cited in the vast majority of market research reports are, at best, well-informed guesses. You’ve probably heard the aphorism "There are lies, damned lies, and statistics," a quick way to sum up the power of a little numerical analysis to bolster an argument, even when the numbers and the analysis are quite questionable. Add to that the pressure-cooker environment of ad sales, tight deadlines, and careers hanging in the balance, and you’ll often find hand-waving, smoke, and mirrors passed off as cold hard facts.
Measuring online audiences is also a pretty hazy thing: unless you can somehow force every Internet user to register with your site, magically force them not to share their accounts, and somehow truthfully profile them for key demographic information like their age, income, and geographic location, you’re never going to have a 100 percent accurate idea of who uses your site. For firms that purport to measure audience share of, say, search engines or video sharing services, the task is even more daunting because it’s spread across millions of users and thousands of high-traffic sites. The firms attempt to take accurate snapshots of Internet usage by profiling a small subset of users and generalizing those results to the larger Internet population. That methodology can work if your sample is large enough and sufficiently random, but on the Internet there’s essentially no way to know whether your sample is big enough or random enough, and there are innumerable invisible factors that might make your sample biased and, therefore, unrepresentative.
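To make the sampling problem concrete, here is a minimal Python sketch of how a panel-based estimate can drift when the panel doesn’t mirror the population. Everything in it is an assumption made up for illustration: the "home" and "work" groups, the user counts, the visit rates, and the function names. It is not how Nielsen or comScore actually compute their figures.

```python
# Hypothetical population split into two groups that visit a site at
# different rates. All numbers are invented for illustration only.
POPULATION = {
    "home": {"users": 60_000_000, "visit_rate": 0.20},
    "work": {"users": 40_000_000, "visit_rate": 0.50},
}


def true_audience(population):
    """Actual number of visitors, which no measurement firm can observe."""
    return sum(g["users"] * g["visit_rate"] for g in population.values())


def panel_estimate(population, panel_counts):
    """Extrapolate an unweighted panel's visit share to the whole population.

    panel_counts maps each group name to the number of panelists drawn
    from that group. Panelists are assumed to visit at their group's rate.
    """
    total_users = sum(g["users"] for g in population.values())
    panel_size = sum(panel_counts.values())
    panel_visitors = sum(
        count * population[name]["visit_rate"]
        for name, count in panel_counts.items()
    )
    return total_users * panel_visitors / panel_size


# A panel that mirrors the 60/40 population split...
representative_panel = {"home": 6_000, "work": 4_000}
# ...and one that badly under-recruits "work" users.
skewed_panel = {"home": 9_000, "work": 1_000}

if __name__ == "__main__":
    print(f"True audience:        {true_audience(POPULATION):,.0f}")
    print(f"Representative panel: {panel_estimate(POPULATION, representative_panel):,.0f}")
    print(f"Skewed panel:         {panel_estimate(POPULATION, skewed_panel):,.0f}")
```

With these made-up inputs, the representative panel recovers the true audience of 32 million, while the skewed panel reports 23 million even though both panels contain the same 10,000 people. Panel size alone doesn’t rescue an estimate if the panel’s makeup is off.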
Last month, Interactive Advertising Bureau president and CEO Randall Rothenberg decided to call two of the largest online media metrics firms, Nielsen NetRatings and comScore Media Metrics, on the carpet: the two firms’ numbers have never lined up very well, and neither set jibes with the server logs maintained by the IAB’s members. In an open letter, Rothenberg demanded that the firms reveal the methodologies behind their Internet audience metrics, and essentially prove that the numbers they’re publishing for Internet audiences, upon which increasing billions of advertising dollars are being spent, are accurate.
"To persist in using panels that potentially undercount or ignore the diverse populations that are the future of consumer marketing is to deny marketers the insights they need to build their businesses," wrote Rothenberg. "And it certainly appears to us as if these audiences are being undercounted or disregarded."
Historically, audience measurement and survey firms (particularly Nielsen) have been notoriously close-lipped about their methodologies.
In response, Nielsen NetRatings has announced it will submit its Internet audience measurement to the Media Rating Council’s accreditation process. "The MRC process is the only audit that certifies to clients and to the industry that we have fully disclosed our methodology and that we are executing against that methodology," said Manish Bhatia, NetRatings’ executive VP. "We are confident that NetRatings’ methods will stand up to review and we are pleased to be supporting transparency and accountability in the industry."
Under the plan, Nielsen will submit its desktop meter (software that its panelists run to log their Web usage) and its page-tagging technology for review by the MRC, an industry group set up in the 1960s and charged with setting minimum standards for audience measurement and ratings for broadcast media.
For its part, comScore has defended the integrity of its numbers, and has even published analyses highlighting possible causes of discrepancies between sites’ server logs and measured audiences; in particular, comScore notes that "cookie deletion" may cause sites to significantly overestimate their audience.
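The cookie-deletion argument is easy to see with a toy example. The Python sketch below assumes a site counts "unique visitors" by counting distinct cookie IDs. The log entries, names, and cookie IDs are all hypothetical, and the person column exists only to expose the gap, since a real server log would see nothing but the cookie.

```python
# Each entry is (person, cookie_id). The data is invented for illustration;
# a real server log would record only the cookie, never the person.
VISITS = [
    ("alice", "c1"),
    ("alice", "c1"),  # same cookie again, still one "visitor"
    ("alice", "c2"),  # Alice cleared her cookies, so she looks brand new
    ("alice", "c3"),  # ...and cleared them once more
    ("bob", "c4"),
    ("bob", "c4"),
]


def unique_cookies(visits):
    """What a cookie-based server log would report as unique visitors."""
    return len({cookie for _, cookie in visits})


def unique_people(visits):
    """The real audience, which the site cannot observe directly."""
    return len({person for person, _ in visits})


if __name__ == "__main__":
    cookies = unique_cookies(VISITS)
    people = unique_people(VISITS)
    print(f"Cookie-based unique visitors: {cookies}")
    print(f"Actual people:                {people}")
    print(f"Overcount factor:             {cookies / people:.1f}x")
```

Here two people show up as four "unique visitors" because one of them cleared her cookies twice. How large the effect is on a real site depends on how often its visitors actually delete cookies, which this sketch makes no attempt to estimate.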
See? "Lies, damned lies, and statistics." In a tangential, ironic illustration of how misleading things can be even when they’re peripheral to statistics, that quote is often misattributed to Mark Twain, but Twain himself said it came from Benjamin Disraeli.