Breaking down ‘Big Data’ and Internet in the age of variety, volume, and velocity

State of the Web: Boiling down Big Data

Earlier this year, The New York Times famously declared 2012 the dawn of the “Age of Big Data,” an era when previously incomprehensible mountains of the world’s data can be distilled into useful information. You’re already contributing to that mountain when you perform a Google search, buy something on Amazon, or upload a photo to Facebook. You benefit from companies refining it behind the scenes whenever Google finds exactly what you were looking for, a website displays an online ad for something you actually want, or Facebook suggests people you already know as friends.

But the potential for Big Data goes much deeper, to the point where we may be able to calculate nearly all aspects of life.

Despite the countless papers, articles, and blog posts dedicated to the buzzword, Big Data remains a vague concept for most Web users. So for this week’s State of the Web, let’s take a look at a few of the most important aspects of this awesome, terrifying thing called Big Data, and what it could mean for the everyday person.

What is Big Data?


Big Data refers not just to the amount of information available, but to the ability of our computer systems to store and process this information economically. This evolutionary drop in the cost of computing power has taken “lots of data” and turned it into “Big Data,” which has a few important defining qualities: variety, volume, and velocity. These terms were first attributed to Big Data by Gartner researcher Doug Laney in 2001 (PDF).

It’s data, in a general sense (variety): Let’s just get this first part out of the way: When people talk about Big Data, they aren’t necessarily referring to one type of information. In fact, they could be referring to any type of data: Facebook status updates, tweets on Twitter, digital images, closed-circuit camera feeds, medical histories, credit-card transactions, consumer purchasing histories, climate information, GPS location data, on and on.

If it can be stored on a computer, then it can be part of Big Data.

It’s big (volume): The key word, of course, is “big.” Really big. In 2011 alone, we created or replicated 1.8 zettabytes (1.8 trillion gigabytes) of data, a number that is set to double in 2013, according to EMC (PDF). However, what constitutes “big” for one company is minuscule for another. Facebook, for example, currently stores over 100 petabytes (100 million gigabytes) of images on its servers. The particle physics experiments at CERN pump out 40 terabytes of data every second. But a recent study from data management company Actian Corporation shows that businesses that deal with large amounts of data typically define “big” as between 1 terabyte and 1 petabyte.
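To get a feel for these scales, here is a minimal Python sketch that converts the figures above into gigabytes. It assumes decimal (SI) units, where each step up is a factor of 1,000; the `to_gigabytes` helper and the `BYTES_PER_UNIT` table are illustrative names, not part of any tool mentioned in this article.

```python
from decimal import Decimal

# Decimal (SI) storage units expressed in bytes. Assumption: 1 GB = 10**9 bytes.
BYTES_PER_UNIT = {"GB": 10**9, "TB": 10**12, "PB": 10**15, "ZB": 10**21}

def to_gigabytes(amount, unit):
    """Convert an amount in the given unit to gigabytes."""
    # Decimal avoids floating-point rounding on values such as 1.8.
    return Decimal(str(amount)) * BYTES_PER_UNIT[unit] / BYTES_PER_UNIT["GB"]

for amount, unit in [(1.8, "ZB"), (100, "PB"), (40, "TB")]:
    print(f"{amount} {unit} = {to_gigabytes(amount, unit):,.0f} GB")
```

Run as written, this shows that 1.8 zettabytes is 1.8 trillion gigabytes, 100 petabytes is 100 million gigabytes, and 40 terabytes is 40,000 gigabytes.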

It’s quick, sometimes (velocity): A third aspect of Big Data is the rate at which information flows into a system. Twitter, for example, processes an average of about 5,000 tweets per second, according to the company’s open source manager Chris Aniszczyk. This number can jump significantly during high-profile events, like the Super Bowl or a major natural disaster. Other areas with high-velocity data include financial transactions, weather data, GPS coordinates, and sensor feeds from scientific equipment.
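A per-second rate is easier to appreciate as a daily total. The sketch below is a rough back-of-the-envelope calculation that assumes the quoted average of 5,000 tweets per second holds steady for a full day, which real traffic does not; `events_per_day` is a hypothetical helper name.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

def events_per_day(events_per_second):
    """Scale a steady per-second rate up to a full day."""
    return events_per_second * SECONDS_PER_DAY

# Assumption: the 5,000-per-second average applies around the clock.
print(f"{events_per_day(5_000):,} tweets per day")  # 432,000,000 tweets per day
```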

Big challenges

The big question of Big Data is this: What use does all this information hold? At the moment, we don’t really know, and that’s what makes Big Data so exciting for so many industries. Companies already have mountains of information, much of it about you and me, but about 80 percent, by some estimates, is in a form that is difficult (but not impossible) for computers to “understand.” This data is called “unstructured,” and includes things like JPEG images, audio files, video files, and even many text files, including email, text messages, and blog posts. The challenge companies now face is figuring out how to turn their unstructured data into usable information, a challenge they are quickly overcoming thanks to new applications, like Google’s BigQuery and Dremel tools.

What about now?

While we already benefit from Big Data every time we use the Web, the potential applications of Big Data extend far beyond obvious things like online search and ads. The areas where Big Data is expected to have the most immediate, revolutionary effects are business and health care.

Business


Google and Facebook built their entire businesses on Big Data by creating services (search, connecting with friends) that are both derived from, and fueled by, massive amounts of data handed to them by users. The more features they offer, the larger their Big Data collections become, which in turn results in even more online products (to sell advertising around — a service that is itself powered by Big Data).

In other words, Google and Facebook figured out a way to monetize the data they collected. Big Data is their business. But countless other companies are looking to Big Data to provide insights into their business that were never before possible. Companies can use Big Data to tweak their advertising, prices, production operations, shipping activities, and hiring processes. At the moment, it seems, the possible uses of Big Data for business are only limited by the technical abilities of our computers and our imaginations.

Health care

In addition to making people money, Big Data is becoming increasingly useful in the field of medicine. Companies like DNAnexus and Appistry are looking to harness the vast amounts of data created by genome sequencing to help discover cures for disease far faster than has previously been possible. Startup Apixio is looking to bring medical records into the cloud to allow doctors to better choose treatments for their patients. Even IBM’s “Jeopardy!”-winning supercomputer Watson, which uses Big Data to power its artificial intelligence, is lending a helping hand, thanks to a partnership with WellPoint that will allow patients to access hoards of data to help them make health-care decisions.

Big and getting bigger

The reach of Big Data doesn’t end there. Governments, scientists, militaries, and non-governmental organizations have all begun to tap into the vast power of Big Data. That power only increases as different data sets combine to offer more insight, to solve more problems, to answer deeper questions, to predict the future in ways that are impossible today. 

For average people like you and me, Big Data will provide countless new services and resources. It may even save our lives. But like all great advancements in human history, Big Data comes at a cost. Innumerable aspects of our lives (our habits, our moods, our medical histories, our personalities, our weaknesses and strengths, where we go, who we talk to, what we love and hate and fear) are all being amassed in nameless data centers around the world. This information may one day be used to assess whether or not you are good for a job, or a school, or whether you should have children. Companies and governments will surely know more about you and your future than you do. (In many ways, they already do.) So as Big Data gets even bigger, and the information squeezed out of it becomes more plentiful, profitable, and potent, we need to make sure this quickly moving tsunami of information doesn’t drown us in its wake.

Images via Pavel Ignatov/Carsten Reisinger/Bruce Rolff/Shutterstock
