Breaking down ‘Big Data’ and the Internet in the age of variety, volume, and velocity

State of the Web: Boiling down Big Data

Earlier this year, The New York Times famously declared 2012 the dawn of the “Age of Big Data,” an era when previously incomprehensible mountains of the world’s data can be distilled into useful information. You’re already contributing to that mountain when you perform a Google search, buy something on Amazon, or upload a photo to Facebook. You benefit from companies refining it behind the scenes whenever Google finds exactly what you were looking for, a website displays an online ad for something you actually want, or Facebook suggests people you already know as friends.

But the potential for Big Data goes much deeper, to the point where we may be able to calculate nearly all aspects of life.

Despite the countless papers, articles, and blog posts dedicated to the buzzword, Big Data remains a vague concept for most Web users. So for this week’s State of the Web, let’s take a look at a few of the most important aspects of this awesome, terrifying thing called Big Data, and what it could mean for the everyday person.

What is Big Data?


Big Data does not just refer to the amount of information available, but the ability of our computer systems to store and process this information economically. This evolutionary drop in the cost of computing power has taken “lots of data” and turned it into “Big Data,” which has a few important prerequisite qualities: variety, volume, and velocity. These terms were first attributed to Big Data by Gartner researcher Doug Laney in 2001 (PDF).

It’s data, in a general sense (variety): Let’s just get this first part out of the way: When people talk about Big Data, they aren’t necessarily referring to one type of information. In fact, they could be referring to any type of data: Facebook status updates, tweets on Twitter, digital images, closed-circuit camera feeds, medical histories, credit-card transactions, consumer purchasing histories, climate information, GPS location data, on and on.

If it can be stored on a computer, then it can be part of Big Data.

It’s big (volume): The key word, of course, is “big,” as in really big. In 2011 alone, we created or replicated 1.8 zettabytes (1.8 trillion gigabytes) of data, a number that is set to double in 2013, according to EMC (PDF). However, what constitutes “big” for one company is minuscule for another. Facebook, for example, currently stores over 100 petabytes (100 million gigabytes) of images on its servers. The particle physics experiments at CERN pump out 40 terabytes of data every second. But a recent study from data management company Actian Corporation shows that businesses that deal with large amounts of data typically define “big” as between 1 terabyte and 1 petabyte.
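To put those units side by side, here is a minimal Python sketch of the conversions, assuming decimal (SI) units where each step is a factor of 1,000. The `convert` helper and the variable names are illustrative, not taken from any of the companies mentioned.

```python
from fractions import Fraction

# Decimal (SI) storage units in bytes; binary units (GiB, TiB) would differ slightly.
BYTES_PER_UNIT = {
    "GB": 10**9,
    "TB": 10**12,
    "PB": 10**15,
    "ZB": 10**21,
}


def convert(amount, from_unit, to_unit):
    """Convert a storage amount between two decimal units, exactly."""
    # Fraction(str(...)) avoids floating-point rounding for inputs like 1.8.
    return Fraction(str(amount)) * BYTES_PER_UNIT[from_unit] / BYTES_PER_UNIT[to_unit]


# Figures quoted in the article, all expressed in gigabytes.
data_2011_gb = convert(1.8, "ZB", "GB")        # data created or replicated in 2011
facebook_images_gb = convert(100, "PB", "GB")  # Facebook's stored images
business_low_gb = convert(1, "TB", "GB")       # low end of the Actian range
business_high_gb = convert(1, "PB", "GB")      # high end of the Actian range

print(f"1.8 ZB = {int(data_2011_gb):,} GB")
print(f"100 PB = {int(facebook_images_gb):,} GB")
print(f"1 TB to 1 PB = {int(business_low_gb):,} to {int(business_high_gb):,} GB")
```

Seen this way, the range that typical businesses call “big” tops out at one hundredth of Facebook’s image store alone.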

It’s quick, sometimes (velocity): A third aspect of Big Data is the rate at which information flows into a system. Twitter, for example, processes an average of about 5,000 tweets per second, according to the company’s open source manager Chris Aniszczyk. This number can jump significantly during high-profile events, like the Super Bowl or a major natural disaster. Other areas with high-velocity data include financial transactions, weather data, GPS coordinates, and sensor feeds from scientific equipment.
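For a rough sense of what a per-second rate adds up to, the short Python sketch below scales the quoted average to longer windows. It assumes the rate holds constant, which it does not during spikes, and the `scale_rate` helper is purely illustrative.

```python
SECONDS_PER_MINUTE = 60
SECONDS_PER_HOUR = 60 * SECONDS_PER_MINUTE
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR


def scale_rate(events_per_second):
    """Scale a steady per-second rate to per-minute, per-hour, and per-day totals."""
    return {
        "minute": events_per_second * SECONDS_PER_MINUTE,
        "hour": events_per_second * SECONDS_PER_HOUR,
        "day": events_per_second * SECONDS_PER_DAY,
    }


# Average tweet rate quoted in the article.
totals = scale_rate(5_000)
for window, count in totals.items():
    print(f"Tweets per {window}: {count:,}")
```

At that average, a single day works out to 432 million tweets.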

Big challenges

The big question of Big Data is what use all this information holds. At the moment, we don’t really know, and that’s what makes Big Data so exciting for so many industries. Companies already have mountains of information, much of it about you and me, but about 80 percent, by some estimates, is in a form that is difficult (but not impossible) for computers to “understand.” This data is called “unstructured,” and includes things like JPEG images, audio files, video files, and even many text files, including email, text messages, and blog posts. The challenge companies now face is figuring out how to turn their unstructured data into usable information, a challenge they are quickly overcoming thanks to new applications, like Google’s BigQuery and Dremel tools.

What about now?

While we already benefit from Big Data every time we use the Web, the potential applications of Big Data extend far beyond obvious things like online search and ads. The areas where Big Data is expected to have the most immediate, revolutionary effects are business and health care.

Business

Google and Facebook built their entire businesses on Big Data by creating services (search, connecting with friends) that are both derived from, and fueled by, massive amounts of data handed to them by users. The more features they offer, the larger their Big Data collections become, which in turn results in even more online products (to sell advertising around — a service that is itself powered by Big Data).

In other words, Google and Facebook figured out a way to monetize the data they collected. Big Data is their business. But countless other companies are looking to Big Data to provide insights into their business that were never before possible. Companies can use Big Data to tweak their advertising, prices, production operations, shipping activities, and hiring processes. At the moment, it seems, the possible uses of Big Data for business are only limited by the technical abilities of our computers and our imaginations.

Health care

In addition to making people money, Big Data is becoming increasingly useful in the field of medicine. Companies like DNAnexus and Appistry are looking to harness the vast amounts of data created by genome sequencing to help discover cures for disease far faster than has been possible. Startup Apixio is looking to bring medical records into the cloud to allow doctors to better choose treatments for their patients. Even IBM’s “Jeopardy!”-winning supercomputer Watson, which uses Big Data to power its artificial intelligence, is lending a helping hand, thanks to a partnership with WellPoint that will allow patients to access hoards of data to help them make health-care decisions.

Big and getting bigger

The reach of Big Data doesn’t end there. Governments, scientists, militaries, and non-governmental organizations have all begun to tap into the vast power of Big Data. That power only increases as different data sets combine to offer more insight, to solve more problems, to answer deeper questions, to predict the future in ways that are impossible today. 

For average people like you and me, Big Data will provide countless new services and resources. It may even save our lives. But like all great advancements in human history, Big Data comes at a cost. Innumerable aspects of our lives (our habits, our moods, our medical histories, our personalities, our weaknesses and strengths, where we go, who we talk to, what we love and hate and fear) are all being amassed in nameless data centers around the world. This information may one day be used to assess whether or not you are good for a job, or a school, or whether you should have children. Companies and governments will surely know more about you and your future than you do. (In many ways, they already do.) So as Big Data gets even bigger, and the information squeezed out of it becomes more plentiful, profitable, and potent, we need to make sure this quickly moving tsunami of information doesn’t drown us in its wake.

Images via Pavel Ignatov/Carsten Reisinger/Bruce Rolff/Shutterstock

