After watching 600 hours of TV, this AI program can predict when people will hug or high-five

Turns out, shows like “The Office” and “Desperate Housewives” are worth more than their entertainment value. The scripted acts of Steve Carell, Mindy Kaling, and Eva Longoria have helped researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) train an artificial intelligence to predict human interactions.

Social intuition is a distinctly human trait. What comes to us with ease proves difficult for computers. We’re pretty good at judging whether two people will hug or high-five when they meet, and we hardly give it any thought. Body language, age, gender, and dress are cues enough for the answer to just sort of come to us.

For computers, it’s not so easy. Asking a computer to intuit a greeting is like asking the average person to perform long division in their head. But after a machine vision system was shown 600 hours of TV shows and YouTube videos, its algorithm was able to determine whether an ensuing greeting would be a hug, handshake, kiss, or high-five.

“We are starting to be able to teach computers to pick up on subtle interpersonal cues, like the fact that two people whose heads are close together might kiss, or that an extended hand often indicates a handshake,” CSAIL PhD student and study co-author Carl Vondrick told Digital Trends. “These are the sorts of things that are obvious to a human, but not common-sense for a computer yet. We hope that this kind of result will help push the work forward in this area.”

Action-Prediction Algorithms

The system isn’t perfect, though. After its immense training session, the computer was shown videos of people in the second before they greeted each other, and it correctly predicted their action about 43 percent of the time. That falls well short of the 71 percent accuracy achieved by human subjects, but it beats an existing algorithm’s 36 percent.

Practice makes perfect, though, and the algorithm’s training isn’t finished yet. The researchers hope to throw a lot more video at it soon. “I’m excited to see how much better the algorithms get if we can feed them a lifetime’s worth of videos,” Vondrick said in a press release. He also noted some of the system’s real-world applications, which range from humanizing robots to predicting car crashes.
