How the CDC could use Google, AI, and even Twitter to forecast flu outbreaks

As summer gives way to fall, flu season is about to be upon us. Proper preparation is essential if there are to be enough medical professionals and vaccines to go around. The Centers for Disease Control and Prevention (CDC) plays a huge role in making sure practices and hospitals around the country know what to expect.

The CDC needs all the information that it can get to do this important work. Now, machine learning is bringing together a staggering amount of data — comprising everything from retail sales of flu medication to Google searches about symptoms — to create the best possible picture of the spread of the virus, as it happens. If it works, it could make predicting the spread of disease as commonplace as forecasting tomorrow’s thunderstorms.

Forecast face-off

Over the last four years, the CDC has run a forecasting research initiative intended to build better methods of predicting what flu season will bring.

Participants are invited to submit their own forecasting systems, which are judged stringently on their accuracy. Each system needs to forecast seven things: when the season will start, when it’s going to peak, how bad it will be at its peak, and how bad it will be in one week’s time, two weeks’ time, three weeks’ time, and four weeks’ time.

After that, participants are asked to submit a new forecast for each of these seven criteria every week through flu season, using new data that has been collected. Forecasts need to be made for each of ten regions comprising the U.S.
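To give a sense of the volume of predictions involved, here is a minimal sketch of what one week's submission might look like as data. The seven targets and ten regions come from the description above; the identifier names, the `Forecast` class, and the `weekly_submission` helper are hypothetical and are not the CDC's actual submission format.

```python
from dataclasses import dataclass

# The seven forecast targets described in the article.
# The identifier spellings are assumptions made for this sketch.
TARGETS = (
    "season_onset",
    "peak_week",
    "peak_intensity",
    "1_week_ahead",
    "2_weeks_ahead",
    "3_weeks_ahead",
    "4_weeks_ahead",
)

# Ten regions covering the U.S.; the labels are placeholders.
REGIONS = tuple(f"Region {n}" for n in range(1, 11))


@dataclass(frozen=True)
class Forecast:
    """One predicted value for one target in one region (hypothetical shape)."""
    region: str
    target: str
    value: float


def weekly_submission(predict):
    """Build one week's forecasts by calling predict(region, target) for every pair."""
    return [
        Forecast(region, target, predict(region, target))
        for region in REGIONS
        for target in TARGETS
    ]


# A placeholder model that predicts 0.0 everywhere, just to count the entries.
submission = weekly_submission(lambda region, target: 0.0)
print(len(submission), "forecasts per weekly submission")
```

Seven targets across ten regions means each system is on the hook for 70 separate predictions every week of the season.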

Once the flu season comes to an end, the forecasts are compared with the actual data that was collected. A total of 28 different systems were submitted to the CDC this year. Two of them were developed by Carnegie Mellon University’s Delphi research group, led by Roni Rosenfeld — and those two projects took both the number one and number two spots in the final ranking.

The CDC currently tracks the flu using a surveillance system. The key difference between the two approaches is that surveillance only looks at what’s happening right now, while forecasting can make a probabilistic statement about what’s going to happen in the future. The work being done by the Delphi group, among others, is poised to make a huge impact on the organization’s ability to plan for flu season, and the scope of this research goes well beyond the flu.

Sources of infection

There are two main strands to the work the Delphi group is doing in conjunction with the CDC. The first is an improvement to the organization’s current surveillance techniques, which Rosenfeld refers to as ‘nowcasting.’ The aim is to make this data available in as close to real time as possible, without sacrificing any accuracy.

“It takes a while to collate all these numbers, compile them, check them, and publish them,” Rosenfeld explained in a phone call with Digital Trends. “So as a result, when the CDC publishes their surveillance numbers online, they actually refer to the previous week, not the week that we’re in. So, they’re already between one and two weeks old.”

The researchers are supplementing the data that the CDC collects with various other sources. They’re taking information from Google Trends, statistics regarding how many people access the organization’s online resources pertaining to the flu, and Wikipedia access logs. They’re even starting to take tweets about the flu into account, as well as retail sales of flu medication.

However, some of these sources don’t always measure how many people are getting the flu. They might instead indicate the level of flu awareness.

“If there’s unusual news coverage of flu — maybe because a celebrity got the flu, or something — you would expect to see that influencing how many people search for flu on Wikipedia, or on Google,” said Rosenfeld. “But it would not influence how many people are hospitalized for flu.” The system is being refined so that fake peaks, like the surge of web searches described above, aren’t considered.

In terms of forecasting, the team is using a combination of three methods that have been developed over the past few years, bringing together models of flu dynamics with time series analysis methodology that’s commonly used by economists.

The results speak for themselves. Information released by the CDC gave Delphi’s Epicast system a “skill score” of 0.451, and its Stat project scored 0.438 — where perfect predictions would have earned 1.00. For comparison, assumptions of what was going to happen based on a simple average of previous data would have only scored 0.237.
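A quick bit of arithmetic puts those numbers in perspective. The sketch below simply divides each reported skill score by the 0.237 baseline; the scores are the ones quoted above, while the variable names are made up for illustration. It says nothing about how the CDC computes the skill score itself, which the article does not describe.

```python
# Skill scores quoted in the article (1.00 would be a perfect forecast).
BASELINE_SCORE = 0.237  # simple average of previous seasons' data
scores = {
    "Delphi Epicast": 0.451,
    "Delphi Stat": 0.438,
}

# How many times better each system scored than the historical-average baseline.
ratios = {name: score / BASELINE_SCORE for name, score in scores.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x the baseline score")
```

Both Delphi systems come out at a little under twice the baseline score.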

That score might not seem like much compared to an ideal of 1.00, but it’s easier to see the strength of the Delphi team’s work when it’s compared to that of other groups taking part in the initiative. Typically, when different systems are averaged together, they cover for one another’s weaknesses and score better. However, even when all 28 submissions were combined to create an ensemble forecast, the combined system could only score 0.430, a hair below Delphi Stat on its own and well below Delphi Epicast.
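To see why averaging usually helps, consider a toy example. Everything in it is invented for illustration: the weekly values, the two systems, and the use of mean absolute error as a stand-in for accuracy (the CDC's skill score is a different measure, and the real ensemble combined 28 systems, not two).

```python
from statistics import mean

# Made-up weekly flu-activity values; these are not real CDC data.
observed = [2.0, 3.0, 4.0, 3.0]

# Two hypothetical forecasting systems: one tends to run high, the other low.
system_a = [2.5, 3.6, 4.4, 3.5]
system_b = [1.7, 2.6, 3.4, 2.7]


def mean_absolute_error(forecast, actual):
    """Average size of the miss, ignoring direction (lower is better)."""
    return mean(abs(f - a) for f, a in zip(forecast, actual))


# A simple ensemble: average the two systems' forecasts week by week.
ensemble = [mean(pair) for pair in zip(system_a, system_b)]

error_a = mean_absolute_error(system_a, observed)
error_b = mean_absolute_error(system_b, observed)
error_ensemble = mean_absolute_error(ensemble, observed)

print(f"System A error: {error_a:.2f}")
print(f"System B error: {error_b:.2f}")
print(f"Ensemble error: {error_ensemble:.2f}")
```

Because one system overshoots and the other undershoots, their mistakes partly cancel and the average lands closer to the truth than either does alone. That is the usual pattern, which is what makes it notable that the Delphi systems outscored the full ensemble.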

Trickle Down

For the purposes of the CDC’s initiative, the Delphi group is working with the organization’s needs in mind. The CDC’s primary interest in a new forecasting platform is the chance to better time its response to the flu season.

The CDC needs to make public announcements about the flu season, and commence its vaccination campaigns at just the right time. If they’re too early or too late, they’re not going to be as effective as they could be.

For now, the CDC is the “main driver” behind the project, according to Rosenfeld. Going forward, he says that he can see the platform being used at state and county levels. Hospitals could even use its forecasting capabilities to help determine what their staffing and equipment needs might be.

Rosenfeld is excited about the prospect of individuals being able to use the forecasts to inform their own behavior. “If you have a mother or a mother-in-law who is 90 years old and wants to go visit their sister in Cleveland, if you know that flu is going to peak in Cleveland two weeks from now, it would be useful to be able to advise her not to go,” he explained. “Because flu can be very deadly for older people.”

It’s important to note that the forecasting isn’t exact: you’re not going to be told, definitively, whether you will or will not contract the flu virus by setting foot in Cleveland. Rosenfeld compares it to a weather station’s precipitation reports, in that it offers a general idea of where it will rain, and how much, over the coming days and weeks.

The Delphi group is working on influenza forecasting because the need is imminent, and data is plentiful, but its platform is capable of much more. The team is already using its technology to look at dengue fever, which kills thousands of people every year, and there are plans to apply the same tools to diseases and conditions including HIV, Ebola, and Zika.

This is a field known as epidemiological forecasting — and it’s blossoming.

Under the Weather

To put the current state of epidemiological forecasting into context, Rosenfeld compares it to weather forecasting, which entered its infancy in the U.S. in the 1860s.

“At the time that it started, people didn’t realize how useful it would be economically and socially, and how much it could progress,” he said of weather forecasting’s early years. “It took many, many years — many, many decades — of development across multiple dimensions.”

Meteorologists had to put infrastructure in place to collect measurements and readings, first around the country, and then around the world. They had to develop new statistical models, and do other mathematical work to put this data to use. New technology was needed to analyze their findings. Weather forecasting was among the first applications for early supercomputers.

“If you compare that to epidemiological forecasting, we’re at the very beginning,” Rosenfeld said. “We do have the computing power, we have a head start in that regard. But we need to develop the theory, and we need to develop the measurements.”

Rosenfeld hopes that the research that’s being done as part of this CDC initiative will demonstrate the broader potential for epidemiological forecasting. “It will take quite a few years to grow, and a significant investment,” he acknowledged. “We’re trying to make the case for it. We’re trying to start the work and show the vital benefits of forecasting.”

Rosenfeld and his team have no small task ahead of them. Just as the benefits of weather forecasting weren’t immediately obvious, it’s difficult to accrue the necessary infrastructure and theoretical frameworks without the proper backing.

Working with the CDC has helped the Delphi group make some major advances in terms of influenza. The next step is to look at more infectious diseases, and continue to improve upon the forecasting being done. With any luck, the results will help medical practitioners see the thunderhead of an outbreak before it occurs.

Brad Jones
Former Digital Trends Contributor
Brad is an English-born writer currently splitting his time between Edinburgh and Pennsylvania. You can find him on Twitter…