How the CDC could use Google, AI, and even Twitter to forecast flu outbreaks

As summer gives way to fall, flu season is about to be upon us. Proper preparation is essential if there are to be enough medical professionals and vaccinations to go around. The Centers for Disease Control and Prevention plays a huge role in making sure practices and hospitals around the country know what to expect.

The CDC needs all the information that it can get to do this important work. Now, machine learning is bringing together a staggering amount of data — comprising everything from retail sales of flu medication to Google searches about symptoms — to create the best possible picture of the spread of the virus, as it happens. If it works, it could make predicting the spread of disease as commonplace as forecasting tomorrow’s thunderstorms.

Forecast face-off

Over the last four years, the CDC has run a forecasting research initiative intended to build better methods of predicting what flu season will bring.

Participants are invited to submit their own forecasting systems, which are judged strictly on their accuracy. Each system needs to forecast when the season will start, when it will peak, how bad it will be at its peak, and how bad it will be in one, two, three, and four weeks’ time.

After that, participants are asked to submit a new forecast for each of these seven criteria every week through flu season, using new data that has been collected. Forecasts need to be made for each of ten regions comprising the U.S.

Once the flu season comes to an end, the forecasts are compared with the actual data that was collected. A total of 28 different systems were submitted to the CDC this year. Two of them were developed by Carnegie Mellon University’s Delphi research group, led by Roni Rosenfeld — and those two projects took both the number one and number two spots in the final ranking.

The CDC currently tracks the flu using a surveillance system. The key difference is that surveillance only looks at what’s happening right now, while forecasting can make a probabilistic statement about what’s going to happen in the future. The work being done by the Delphi group, among others, is poised to make a huge impact on the organization’s ability to plan for flu season – and the scope of this research goes well beyond the flu.

Sources of infection

There are two main strands to the work the Delphi group is doing in conjunction with the CDC. The first is an improvement to the organization’s current surveillance techniques, which Rosenfeld refers to as ‘nowcasting.’ The aim is to make this data available in as close to real time as possible, without sacrificing any accuracy.

“It takes a while to collate all these numbers, compile them, check them, and publish them,” Rosenfeld explained in a phone call with Digital Trends. “So as a result, when the CDC publishes their surveillance numbers online, they actually refer to the previous week, not the week that we’re in. So, they’re already between one and two weeks old.”

The researchers are supplementing the data that the CDC collects with various other sources. They’re taking information from Google Trends, statistics regarding how many people access the organization’s online resources pertaining to the flu, and Wikipedia access logs. They’re even starting to take tweets about the flu into account, as well as retail sales of flu medication.

However, some of these sources don’t always measure how many people are getting the flu. They might instead indicate the level of flu awareness.

“If there’s unusual news coverage of flu — maybe because a celebrity got the flu, or something — you would expect to see that influencing how many people search for flu on Wikipedia, or on Google,” said Rosenfeld. “But it would not influence how many people are hospitalized for flu.” The system is being refined so that fake peaks, like the surge of web searches described above, aren’t considered.

In terms of forecasting, the team is using a combination of three methods that have been developed over the past few years, bringing together models of flu dynamics with time series analysis methodology that’s commonly used by economists.

The results speak for themselves. Information released by the CDC gave Delphi’s Epicast system a “skill score” of 0.451, and its Stat project scored 0.438 — where perfect predictions would have earned 1.00. For comparison, assumptions of what was going to happen based on a simple average of previous data would have only scored 0.237.

That score might not seem like much compared to an ideal of 1.00, but it’s easier to see the strength of the Delphi team’s work when it’s compared to that of other groups taking part in the initiative. Typically, when different systems are averaged together, they cover for one another’s weaknesses and score better. However, even when all 28 submissions were combined to create an ensemble forecast, the result could only score 0.430, a hair below Delphi Stat on its own and well below Delphi Epicast.
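To see why averaging forecasts usually helps, here is a minimal Python sketch. It is an illustration only: the article does not describe how the CDC computes its skill score, so the sketch assumes a common approach for probabilistic forecasts, the exponentiated average log score, and every number and name in it (model_a, model_b, the three outcome bins) is made up.

```python
import math


def ensemble(forecasts):
    """Average several probability distributions bin by bin."""
    n_models = len(forecasts)
    n_bins = len(forecasts[0])
    return [sum(f[b] for f in forecasts) / n_models for b in range(n_bins)]


def skill(forecasts_per_target, observed_bins):
    """Exponentiated mean log score over a set of targets (an assumed metric).

    Equals 1.0 only when every forecast put all of its probability
    on the outcome that actually occurred.
    """
    logs = [
        math.log(dist[obs])
        for dist, obs in zip(forecasts_per_target, observed_bins)
    ]
    return math.exp(sum(logs) / len(logs))


# Made-up numbers: two hypothetical models forecast two targets,
# each as probabilities over three outcome bins.
model_a = [[0.7, 0.2, 0.1], [0.1, 0.2, 0.7]]
model_b = [[0.1, 0.2, 0.7], [0.7, 0.2, 0.1]]
observed = [0, 0]  # bin 0 turned out to be correct for both targets

# Combine the two models target by target.
combined = [ensemble([a, b]) for a, b in zip(model_a, model_b)]

skill_a = skill(model_a, observed)
skill_b = skill(model_b, observed)
skill_combined = skill(combined, observed)
print(round(skill_a, 3), round(skill_b, 3), round(skill_combined, 3))
```

In this toy case each model is confident and right on one target and confident and wrong on the other, so each scores about 0.26 alone, while the simple average scores about 0.40. That is the usual pattern, which is what makes a single system beating a 28-way ensemble notable.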

Trickle down

For the purposes of the CDC’s initiative, the Delphi group is working with the organization’s needs in mind. The CDC’s primary interest in a new forecasting platform is the chance to better time its response to the flu season.

The CDC needs to make public announcements about the flu season, and commence its vaccination campaigns at just the right time. If they’re too early or too late, they’re not going to be as effective as they could be.

For now, the CDC is the “main driver” behind the project, according to Rosenfeld. Going forward, he says that he can see the platform being used at state and county levels. Hospitals could even use its forecasting capabilities to help determine what their staffing and equipment needs might be.

Rosenfeld is excited about the prospect of individuals being able to use the forecasts to inform their own behavior. “If you have a mother or a mother-in-law who is 90 years old and wants to go visit their sister in Cleveland, if you know that flu is going to peak in Cleveland two weeks from now, it would be useful to be able to advise her not to go,” he explained. “Because flu can be very deadly for older people.”

It’s important to note that the forecasting isn’t exact: you’re not going to be told, definitively, whether you will or will not contract the flu virus by setting foot in Cleveland. Rosenfeld compares it to a weather station’s precipitation reports, in that it offers a general idea of where it will rain, and how much, over the coming days and weeks.

The Delphi group is working on influenza forecasting because the need is imminent, and data is plentiful, but its platform is capable of much more. The team is already using its technology to look at dengue fever, which kills thousands of people every year, and there are plans to apply the same tools to diseases and conditions including HIV, Ebola, and Zika.

This is a field known as epidemiological forecasting — and it’s blossoming.

Under the weather

To put the current state of epidemiological forecasting into context, Rosenfeld compares it to weather forecasting, which entered its infancy in the U.S. in the 1860s.

“At the time that it started, people didn’t realize how useful it would be economically and socially, and how much it could progress,” he said of weather forecasting’s early years. “It took many, many years — many, many decades — of development across multiple dimensions.”

Meteorologists had to put infrastructure in place to collect measurements and readings, first around the country, and then around the world. They had to develop new statistical models, and do other mathematical work to put this data to use. New technology was needed to analyze their findings. Weather forecasting was among the first applications for early supercomputers.

“If you compare that to epidemiological forecasting, we’re at the very beginning,” Rosenfeld said. “We do have the computing power, we have a head start in that regard. But we need to develop the theory, and we need to develop the measurements.”

Rosenfeld hopes that the research that’s being done as part of this CDC initiative will demonstrate the broader potential for epidemiological forecasting. “It will take quite a few years to grow, and a significant investment,” he acknowledged. “We’re trying to make the case for it. We’re trying to start the work and show the vital benefits of forecasting.”

Rosenfeld and his team have no small task ahead of them. Just as the benefits of weather forecasting weren’t immediately obvious, it’s difficult to build the necessary infrastructure and theoretical frameworks without the proper backing.

Working with the CDC has helped the Delphi group make some major advances in influenza forecasting. The next step is to look at more infectious diseases, and continue to improve upon the forecasting being done. With any luck, the results will help medical practitioners see the thunderhead of an outbreak before it occurs.

