
MIT’s latest A.I. is freakishly good at determining what’s going on in videos

How a Temporal Relation Network understands what's happening on screen

Humans need only a few frames of a story to understand what is going on. This is, after all, the basis of comic books, which provide just enough of the important story beats for us to follow what has happened. Sadly, robots equipped with computer vision technology have long struggled to do the same. Until now, at least.

Recently, the Massachusetts Institute of Technology (MIT) demonstrated a new type of artificial intelligence system that uses a neural network to fill in the gaps between video frames and work out what activity is unfolding. The result is astonishingly accurate at determining what is taking place in a video.

“The newly developed temporal relation modules enable the A.I. system to analyze a few key frames and estimate the temporal relation among them, in order to understand what’s going on in the video — such as a stack of objects [being] knocked down,” Bolei Zhou, a former Ph.D. student in MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), who is now an assistant professor of computer science at the Chinese University of Hong Kong, told Digital Trends. “Because the model works with key frames sparsely sampled from the incoming video, the processing efficiency is greatly improved, enabling real-time activity recognition.”
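To make that idea concrete, here is a minimal sketch of a temporal relation module in PyTorch, loosely modeled on the design Zhou and colleagues describe in their published TRN work. Everything below, from the module name to the feature and class sizes, is illustrative rather than MIT's actual code: a small network scores every time-ordered pair of sparsely sampled frame features and sums the scores into an activity prediction.

```python
# Illustrative sketch of a temporal relation module (not MIT's code).
# A CNN backbone is assumed to have already turned each sampled key
# frame into a fixed-size feature vector.
import itertools
import torch
import torch.nn as nn

class TemporalRelation(nn.Module):
    """Scores every time-ordered pair of sparsely sampled frame features."""

    def __init__(self, feat_dim=256, hidden=256, num_classes=174):
        super().__init__()
        # A small MLP fuses a pair of frame features into class scores.
        self.mlp = nn.Sequential(
            nn.Linear(2 * feat_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_classes),
        )

    def forward(self, frame_feats):  # frame_feats: (batch, n_frames, feat_dim)
        n = frame_feats.size(1)
        scores = 0
        # Sum relation scores over all frame pairs, kept in temporal order.
        for i, j in itertools.combinations(range(n), 2):
            pair = torch.cat([frame_feats[:, i], frame_feats[:, j]], dim=1)
            scores = scores + self.mlp(pair)
        return scores

# Usage: features for 8 key frames sparsely sampled from a clip.
feats = torch.randn(4, 8, 256)        # (batch, frames, feature dim)
logits = TemporalRelation()(feats)    # (batch, num_classes)
```

Pairs are the simplest case; the published model extends the same trick to triples and longer frame tuples so it can reason over relations at multiple time scales.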

Another exciting property of the A.I. model is that it can anticipate what will happen next after viewing only the first few frames of a video. If it sees a person holding a bottle, for instance, the algorithm predicts that they might take a drink or squeeze it. Such anticipation abilities will be essential for artificial intelligence in domains like autonomous driving, where guessing what will happen from moment to moment could help proactively prevent accidents.
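As a rough illustration of how such early anticipation could work (an assumption on our part, not MIT's actual pipeline), the same pairwise scorer can simply be re-run on whatever key frames have arrived so far, so its best guess sharpens as the video unfolds:

```python
# Hypothetical early-anticipation loop, reusing the TemporalRelation
# sketch above. With only the first t key frames available, the model
# already emits a (provisional) activity prediction.
model = TemporalRelation()
feats = torch.randn(1, 8, 256)          # features for 8 key frames
for t in range(2, feats.size(1) + 1):   # a pair needs at least 2 frames
    partial = feats[:, :t]              # frames observed so far
    guess = model(partial).argmax(dim=1).item()
    print(f"after {t} frames: predicted activity class {guess}")
```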

“It [could also] be used to monitor human behaviors, such as a home robot assistant which could anticipate your intention by delivering things beforehand,” Zhou continued. “It [could additionally be employed] to analyze the massive [number of] videos online, to do better video understanding and video retrieval.”

The next step of the project will involve expanding the A.I.'s ability to recognize a broader range of objects and activities. The team is also working with robotics researchers to deploy this activity recognition in robot systems, which could gain enhanced perception and visual reasoning skills as a result.
