How the CDC could use Google, AI, and even Twitter to forecast flu outbreaks

As summer gives way to fall, flu season is about to be upon us. Proper preparation is essential if there are to be enough medical professionals and vaccinations to go around. The Centers for Disease Control and Prevention plays a huge role in making sure practices and hospitals around the country know what to expect.

The CDC needs all the information that it can get to do this important work. Now, machine learning is bringing together a staggering amount of data — comprising everything from retail sales of flu medication to Google searches about symptoms — to create the best possible picture of the spread of the virus, as it happens. If it works, it could make predicting the spread of disease as commonplace as forecasting tomorrow’s thunderstorms.

Forecast face-off

Over the last four years, the CDC has run a forecasting research initiative intended to build better methods of predicting what flu season will bring.

Participants are invited to submit their own forecasting systems, which are judged stringently on their accuracy. Each system needs to forecast when the season will start, when it’s going to peak, how bad it will be at its peak, and how bad it will be in one week’s time, two weeks’ time, three weeks’ time, and four weeks’ time.

After that, participants are asked to submit a new forecast for each of these seven criteria every week through flu season, using new data that has been collected. Forecasts need to be made for each of ten regions comprising the U.S.
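To get a feel for how much each team has to produce, the weekly workload described above can be sketched in a few lines of Python. This is only an illustration: the target labels, region names, and the `ForecastSlot` and `weekly_slots` helpers are assumptions made up for this sketch, not the CDC's actual submission format.

```python
from dataclasses import dataclass
from itertools import product

# Illustrative labels for the seven criteria described in the article.
# These names are assumptions, not official CDC target names.
TARGETS = (
    "season_onset",
    "peak_week",
    "peak_intensity",
    "1_week_ahead",
    "2_weeks_ahead",
    "3_weeks_ahead",
    "4_weeks_ahead",
)

# Ten placeholder region names; the real regional breakdown is not given here.
REGIONS = tuple(f"Region {i}" for i in range(1, 11))


@dataclass(frozen=True)
class ForecastSlot:
    """One forecast a team owes: a single target, for one region, in one week."""
    week: int
    region: str
    target: str


def weekly_slots(week: int) -> list[ForecastSlot]:
    """List every (region, target) pair that needs a forecast in a given week."""
    return [ForecastSlot(week, region, target)
            for region, target in product(REGIONS, TARGETS)]


if __name__ == "__main__":
    slots = weekly_slots(week=1)
    print(f"{len(TARGETS)} targets x {len(REGIONS)} regions = {len(slots)} forecasts per week")
```

Seven targets across ten regions works out to 70 individual forecasts per system, refreshed every week of the season.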

Once the flu season comes to an end, the forecasts are compared with the actual data that was collected. A total of 28 different systems were submitted to the CDC this year. Two of them were developed by Carnegie Mellon University’s Delphi research group, led by Roni Rosenfeld — and those two projects took both the number one and number two spots in the final ranking.

The CDC currently tracks the flu using a surveillance system. The key difference is that surveillance only looks at what’s happening right now, while forecasting can make a probabilistic statement about what’s going to happen in the future. The work being done by the Delphi group, among others, is poised to make a huge impact on the organization’s ability to plan for flu season – and the scope of this research goes well beyond the flu.

Sources of infection

There are two main strands to the work the Delphi group is doing in conjunction with the CDC. The first is an improvement to the organization’s current surveillance techniques, which Rosenfeld refers to as ‘nowcasting.’ The aim is to make this data available in as close to real-time as possible, without sacrificing any accuracy.

“It takes a while to collate all these numbers, compile them, check them, and publish them,” Rosenfeld explained in a phone call with Digital Trends. “So as a result, when the CDC publishes their surveillance numbers online, they actually refer to the previous week, not the week that we’re in. So, they’re already between one and two weeks old.”

The researchers are supplementing the data that the CDC collects with various other sources. They’re taking information from Google Trends, statistics regarding how many people access the organization’s online resources pertaining to the flu, and Wikipedia access logs. They’re even starting to take tweets about the flu into account, as well as retail sales of flu medication.

However, some of these sources don’t always measure how many people are getting the flu. They might instead indicate the level of flu awareness.

“If there’s unusual news coverage of flu — maybe because a celebrity got the flu, or something — you would expect to see that influencing how many people search for flu on Wikipedia, or on Google,” said Rosenfeld. “But it would not influence how many people are hospitalized for flu.” The system is being refined so that fake peaks, like the surge of web searches described above, aren’t considered.
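The article doesn't say how the Delphi group screens out these fake peaks, so the following is only a rough sketch of the general idea, not their method. It assumes two hypothetical weekly signals, a search-interest index and a retail-sales index, and flags weeks where searches jump sharply while sales do not. The function name, the threshold, and all of the numbers are invented for illustration.

```python
def flag_awareness_spikes(search_index, sales_index, ratio_threshold=2.0):
    """Return week positions where search interest jumps but sales do not.

    Both inputs are equal-length lists of positive weekly values.
    A week is flagged when searches grow by at least `ratio_threshold`
    times week over week while sales grow by less than that factor,
    hinting that the surge reflects awareness rather than illness.
    """
    flagged = []
    for week in range(1, len(search_index)):
        search_growth = search_index[week] / search_index[week - 1]
        sales_growth = sales_index[week] / sales_index[week - 1]
        if search_growth >= ratio_threshold and sales_growth < ratio_threshold:
            flagged.append(week)
    return flagged


if __name__ == "__main__":
    # Made-up numbers for illustration only.
    searches = [100, 110, 320, 130, 300]
    sales = [50, 54, 56, 60, 140]
    print(flag_awareness_spikes(searches, sales))
```

In this toy data, the first surge in searches has no matching rise in sales and gets flagged, while the later surge is accompanied by a jump in sales and is left alone.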

In terms of forecasting, the team is using a combination of three methods that have been developed over the past few years, bringing together models of flu dynamics with time series analysis methodology that’s commonly used by economists.

The results speak for themselves. Information released by the CDC gave Delphi’s Epicast system a “skill score” of 0.451, and its Stat project scored 0.438 — where perfect predictions would have earned 1.00. For comparison, assumptions of what was going to happen based on a simple average of previous data would have only scored 0.237.

That score might not seem like much compared to an ideal of 1.00, but it’s easier to see the strength of the Delphi team’s work when it’s compared to that of other groups taking part in the initiative. Typically, when different systems are averaged together, they cover for one another’s weaknesses and score better. However, even when all 28 submissions were combined to create an ensemble forecast, the system could only score 0.430, a hair below Delphi Stat on its own, and well below Delphi Epicast.
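The comparison above can be laid out in a short Python sketch. The four scores come straight from the figures quoted in this article; everything else is an assumption for illustration. In particular, `average_forecasts` shows one simple way of averaging systems together (a plain mean of probability forecasts over the same outcomes), which may not be how the CDC built its ensemble, and the two sample forecasts are invented.

```python
# Skill scores as quoted in the article (1.00 would be a perfect forecast).
SCORES = {
    "Delphi Epicast": 0.451,
    "Delphi Stat": 0.438,
    "Ensemble of all 28": 0.430,
    "Historical average": 0.237,
}


def rank_systems(scores):
    """Return system names ordered from highest to lowest skill score."""
    return sorted(scores, key=scores.get, reverse=True)


def average_forecasts(forecasts):
    """Average several probability forecasts over the same set of outcomes.

    Each forecast is a list of probabilities summing to 1. A plain mean is
    assumed here; it is only one of several ways to build an ensemble.
    """
    count = len(forecasts)
    return [sum(probs) / count for probs in zip(*forecasts)]


if __name__ == "__main__":
    baseline = SCORES["Historical average"]
    for name in rank_systems(SCORES):
        ratio = SCORES[name] / baseline
        print(f"{name}: {SCORES[name]:.3f} ({ratio:.2f}x the historical average)")

    # Two invented forecasts for "peak arrives early / on time / late".
    system_a = [0.2, 0.5, 0.3]
    system_b = [0.4, 0.4, 0.2]
    print(average_forecasts([system_a, system_b]))
```

Averaging tends to smooth out the cases where a single system is confidently wrong, which is why ensembles usually do well, and why two individual systems beating the 28-system ensemble stands out.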

Trickle Down

For the purposes of the CDC’s initiative, the Delphi group is working with the organization’s needs in mind. The CDC’s primary interest in a new forecasting platform is improving how well it can time its response to the flu season.

The CDC needs to make public announcements about the flu season, and commence its vaccination campaigns at just the right time. If they’re too early or too late, they’re not going to be as effective as they could be.

For now, the CDC is the “main driver” behind the project, according to Rosenfeld. Going forward, he says that he can see the platform being used at state and county levels. Hospitals could even use its forecasting capabilities to help determine what their staffing and equipment needs might be.

Rosenfeld is excited about the prospect of individuals being able to use the forecasts to inform their own behavior. “If you have a mother or a mother-in-law who is 90 years old and wants to go visit their sister in Cleveland, if you know that flu is going to peak in Cleveland two weeks from now, it would be useful to be able to advise her not to go,” he explained. “Because flu can be very deadly for older people.”

It’s important to note that the forecasting isn’t exact: you’re not going to be told, definitively, whether you will or will not contract the flu virus by setting foot in Cleveland. Rosenfeld compares it to a weather station’s precipitation reports, in that it offers a general idea of where it will rain, and how much, over the coming days and weeks.

The Delphi group is working on influenza forecasting because the need is imminent, and data is plentiful, but its platform is capable of much more. The team is already using its technology to look at dengue fever, which kills thousands of people every year, and there are plans to apply the same tools to diseases and conditions including HIV, Ebola, and Zika.

This is a field known as epidemiological forecasting — and it’s blossoming.

Under the Weather

To put the current state of epidemiological forecasting into context, Rosenfeld compares it to weather forecasting, which entered its infancy in the U.S. in the 1860s.

“At the time that it started, people didn’t realize how useful it would be economically and socially, and how much it could progress,” he said of weather forecasting’s early years. “It took many, many years — many, many decades — of development across multiple dimensions.”

Meteorologists had to put infrastructure in place to collect measurements and readings, first around the country, and then around the world. They had to develop new statistical models, and do other mathematical work to put this data to use. New technology was needed to analyze their findings. Weather forecasting was among the first applications for early supercomputers.

“If you compare that to epidemiological forecasting, we’re at the very beginning,” Rosenfeld said. “We do have the computing power, we have a head start in that regard. But we need to develop the theory, and we need to develop the measurements.”

Rosenfeld hopes that the research that’s being done as part of this CDC initiative will demonstrate the broader potential for epidemiological forecasting. “It will take quite a few years to grow, and a significant investment,” he acknowledged. “We’re trying to make the case for it. We’re trying to start the work and show the vital benefits of forecasting.”

Rosenfeld and his team have no small task ahead of them. Just as the benefits of weather forecasting weren’t immediately obvious, it’s difficult to accrue the necessary infrastructure and theoretical frameworks without the proper backing.

Working with the CDC has helped the Delphi group make some major advances in terms of influenza. The next step is to look at more infectious diseases, and continue to improve upon the forecasting being done. With any luck, the results will help medical practitioners see the thunderhead of an outbreak before it occurs.

Brad Jones
Former Digital Trends Contributor
Brad is an English-born writer currently splitting his time between Edinburgh and Pennsylvania. You can find him on Twitter…