Skip to main content

MIT algorithm can predict the (immediate) future from still images

Creating Videos of the Future
Humans still can’t predict elections but we’re pretty good at predicting the immediate future. Baby drops glass cup, cup falls and shatters, and baby starts to cry. We’re so good at these short-term forecasts that we can often even describe what events will happen next in an image.

But what’s second nature for us can prove complicated for computers. Will the glass break or bounce? Will the baby laugh or cry?

A team of researchers from the Massachusetts Institute of Technology (MIT) Computer Science and Artificial Intelligence Laboratory (CSAIL) have developed a system that can predict the following events in images and generate videos to depict them. The system needs work — its current productions are simple, short, and unassuming — but it stands out for its unique approach and accuracy.

“Instead of building up scenes frame by frame, we focus on processing the entire scene at once,” Carl Vondrick, PhD at MIT CSAIL and lead author of the paper, told Digital Trends.
video-examples-with-input-and-output

Alternative computer vision models that attempt the same task use recurrent networks to generate predictive videos on a frame-by-frame basis. The system developed by Vondrick and his team uses “convolutional networks” to generate all 32 frames simultaneously.

“The existing approach of going frame by frame has a certain logic,” Vondrick said, “but it also creates a massive margin for error. It’s sort of like a big game of ‘Telephone,” which means that the message most likely will fall apart by the time you go around the whole room.

“In contrast, our approach is the ‘Telephone’ equivalent of speaking to everyone in the room at once,” he added.

The researchers trained the system on a year of footage packed into two million videos and — in order to generate all frames at once — taught it distinguish foregrounds from backgrounds, and mobile objects from stationary ones. They then showed the system still images and had it generate short clips of subsequent events.

Once the system could generate video clips, Vondrick and his team set out to refine it through a method called adversarial learning.

“The idea behind adversarial learning is to have two neural networks compete against each other,” Vondrick said. “One network tries to decide what is real versus fake, and another tries to generate something that fools the first network.”

Through this computer competition the generative algorithm improved the accuracy of its video clips until it was able to fool human subjects 20 percent more often than a baseline model, according to a paper that will be presented next week at the Neural Information Processing Systems conference in Barcelona.

But with accuracy comes complexity and with complexity comes obstacles.

The current system’s videos are short — a mere one and a half seconds long. If the clips were much longer than that, they’d risk their consistency. “The key challenge is being able to reliably track the relationships between all of the objects in a scene … to make sure that the video that’s being generated still makes sense five or ten seconds later,” Vondrick said. To develop accurate and long videos, the system may need human input to help it grasp context and connection between seemingly unrelated actions, such as jogging and showering.

Vondrick’s ambitious end goal is to develop an algorithm that can create believable feature-length films, though he admits that is still some years off. In the near term though he thinks this system could refine AI systems by helping them adapt to unpredictable environments.

Editors' Recommendations

Dyllan Furness
Dyllan Furness is a freelance writer from Florida. He covers strange science and emerging tech for Digital Trends, focusing…
This Alienware gaming PC with an RTX 4090, 64GB of RAM is $1,000 off
Alienware Aurora R15 placed at an angle on a table.

Dell is consistently a great place to check for gaming PC deals and that’s certainly the case today. If you want a high-end gaming rig for less, you can currently buy the Alienware Aurora R15 gaming desktop for $2,900 instead of $3,900. The $1,000 saving is particularly sweet when you bear in mind that this is a truly high-end gaming PC packed with all the latest hardware. If you’re keen to know more, check out what we have to say about it below or you can simply hit the button below to go straight to the deal.

Why you should buy the Alienware Aurora R15 gaming desktop
Alienware makes some of the best gaming PCs around and the Alienware Aurora R15 gaming desktop is a perfect representation of that. It’s packed with the latest hardware. That includes an AMD Ryzen 9 7900X processor, 64GB of memory and 2TB of M.2 SSD storage. It’s great to see so much RAM with many gaming PCs still sticking with 32GB when 64GB really does set you up for the long term. Similarly, the large amount of fast storage is perfect for ensuring you won’t run out of room any time soon even when handling large installs like Call of Duty: Warzone or Hogwarts Legacy.

Read more
4 CPUs you should buy instead of the Ryzen 7 7800X3D
AMD Ryzen 7 7800X3D sitting on a motherboard.

The Ryzen 7 7800X3D is one of the best gaming processors you can buy, and it's easy to see why. It's easily the fastest gaming CPU on the market, it's reasonably priced, and it's available on a platform that AMD says it will support for several years. But it's not the right chip for everyone.

Although the Ryzen 7 7800X3D ticks all the right boxes, there are several alternatives available. Some are cheaper while still offering great performance, while others are more powerful in applications outside of gaming. The Ryzen 7 7800X3D is a great CPU, but if you want to do a little more shopping, these are the other processors you should consider.
AMD Ryzen 7 5800X3D

Read more
Even the new mid-tier Snapdragon X Plus beats Apple’s M3
A photo of the Snapdragon X Plus CPU in the die

You might have already heard of the Snapdragon X Elite, the upcoming chips from Qualcomm that everyone's excited about. They're not out yet, but Qualcomm is already announcing another configuration to live alongside it: the Snapdragon X Plus.

The Snapdragon X Plus is pretty similar to the flagship Snapdragon X Elite in terms of everyday performance but, as a new chip tier, aims to bring AI capabilities to a wider portfolio of ARM-powered laptops. To be clear, though, this one is a step down from the flagship Snapdragon X Elite, in the same way that an Intel Core Ultra 7 is a step down from Core Ultra 9.

Read more