Artificial intelligence is getting better at identifying objects on screen, but challenges remain.
Whether it’s HAL 9000 in Stanley Kubrick’s 2001: A Space Odyssey or the computer that learned to recognize cats by watching YouTube videos, artificial intelligence has a long and storied history of crossing over with movies.
A new project, created by Copenhagen-based creative coding studio Støj, represents the next step in that process, with an AI designed to watch, and try to make sense of, Hollywood movie trailers.
“We are two interaction designers with a big interest in machine learning and computer vision,” creators Lasse Korsgaard and Andreas Refsgaard told Digital Trends. “Real time object detection has the potential to become a really powerful tool in the type of work we do, and to test out its capabilities we thought it would be fun to run some movies through the algorithms.”
The main component of the project is YOLO-2, a system for real-time object detection that can recognize everyday objects such as people, ties, cars, and chairs as they appear. Short for “You Only Look Once,” YOLO is extremely fast and reasonably accurate, and it can detect and classify multiple objects in the same image.
“We chose movie trailers [for our project] because they generally are fast-paced with lots of cuts between scenes, and therefore typically containing a wide variety of objects for potential detection,” Korsgaard and Refsgaard continued. “Also it was interesting to see the trailers through the lens of an object detection algorithm, and show people a few takes on what movie trailers actually look like in the logic of such a system.”
As a creative project, it’s pretty darn fascinating to watch, although it highlights some of the challenges that remain with current object classification systems. For example, in the trailer for The Wolf of Wall Street, the algorithm mistook cast member Jon Bernthal’s goatee for a cellphone, an error no human would be likely to make.
To heighten this surreal effect, Korsgaard and Refsgaard said they purposely lowered the threshold of certainty so that the algorithm would be more likely to make guesses — incorrect or otherwise.
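To get a feel for what lowering that threshold does, here is a minimal Python sketch. It is not Støj's code and does not call YOLO itself; the `Detection` class, the `filter_detections` helper, and the confidence scores are all hypothetical, made up purely to illustrate how a detector's guesses are typically filtered by confidence before being drawn on screen.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One guess from an object detector (hypothetical structure)."""
    label: str
    confidence: float  # score between 0.0 and 1.0


def filter_detections(detections, threshold):
    """Keep only the detections scoring at or above the threshold."""
    return [d for d in detections if d.confidence >= threshold]


# Made-up detections for a single frame; the scores are illustrative only.
frame = [
    Detection("person", 0.92),
    Detection("tie", 0.61),
    Detection("cell phone", 0.18),  # a low-confidence (and likely wrong) guess
]

# A stricter threshold hides the shaky guess; a lower one lets it through.
strict = filter_detections(frame, 0.5)
loose = filter_detections(frame, 0.1)

print([d.label for d in strict])
print([d.label for d in loose])
```

With the stricter setting, only the person and the tie are reported. Drop the threshold and the dubious "cell phone" guess appears as well, which is the kind of trade-off the designers describe making on purpose.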
Next up, the pair said that Støj hopes to release an interactive version of the project, in which users can take control of the system and possibly run it on their own trailers or on trailers of their choosing. “We also plan on using image recognition algorithms in interactive installation work, where people and objects in physical space are detected and projections or soundscapes react accordingly,” Korsgaard and Refsgaard concluded.