Facebook opens up its image-recognition AI software to everyone

The AI research division at Facebook is open sourcing its image recognition software with the aim of advancing the tech so it can one day be applied to live video. Facebook’s DeepMask, SharpMask, and MultiPathNet software is now available to everyone on GitHub.

Facebook previously laid out its image-recognition systems in a number of research papers, which are also being made available to the public along with its demos. At present, the company’s algorithms work in conjunction with its MultiPathNet convolutional neural networks, a type of AI that is fed huge amounts of data until it can recognize patterns in new data on its own, allowing Facebook to understand an image at the level of each pixel it contains.

In order to classify and label the objects in an image, Facebook couples its DeepMask segmentation framework with its SharpMask segment refinement module. The final stage in Facebook’s machine vision system utilizes its MultiPathNet deep learning AI to label each object in the photo.
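To make the three stages concrete, here is a toy sketch in Python of how such a pipeline fits together: propose a rough mask for each object, refine it, then label it. This is not Facebook's released code. The function names (propose_masks, refine_mask, classify, recognize) are hypothetical, the "image" is a tiny grid of integers rather than a photo, and simple rules stand in for the neural networks that do the real work.

```python
from collections import Counter

# Toy "image": 0 is background, any other integer marks pixels of one object.
IMAGE = [
    [1, 1, 0, 0],
    [1, 0, 0, 2],
    [0, 0, 2, 2],
]
NAMES = {1: "tree", 2: "person"}  # hypothetical label table


def propose_masks(image):
    """Stage 1 (the role DeepMask plays): one coarse mask per candidate object.

    Here the coarse mask is just the bounding box of each non-zero value.
    """
    masks = []
    for value in sorted({p for row in image for p in row if p != 0}):
        rows = [r for r, row in enumerate(image) if value in row]
        cols = [c for row in image for c, p in enumerate(row) if p == value]
        top, bottom, left, right = min(rows), max(rows), min(cols), max(cols)
        masks.append([
            [1 if top <= r <= bottom and left <= c <= right else 0
             for c in range(len(image[0]))]
            for r in range(len(image))
        ])
    return masks


def refine_mask(image, coarse):
    """Stage 2 (the role SharpMask plays): tighten the coarse mask.

    Here refinement simply drops background pixels from the box.
    """
    return [
        [1 if m and p != 0 else 0 for p, m in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, coarse)
    ]


def classify(image, mask, names):
    """Stage 3 (the role MultiPathNet plays): label the masked region.

    Here the label is looked up from the most common pixel value under the mask.
    """
    values = [p for img_row, mask_row in zip(image, mask)
              for p, m in zip(img_row, mask_row) if m]
    most_common = Counter(values).most_common(1)[0][0]
    return names.get(most_common, "unknown")


def recognize(image, names):
    """Run all three stages and return (label, refined mask) pairs."""
    results = []
    for coarse in propose_masks(image):
        fine = refine_mask(image, coarse)
        results.append((classify(image, fine, names), fine))
    return results


for label, mask in recognize(IMAGE, NAMES):
    print(label, mask)
```

The point of the sketch is the ordering: each stage consumes the output of the previous one, so a better segmentation or refinement step improves labeling without changing the classifier.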

According to Facebook, AI machine vision software has progressed by leaps and bounds over the past few years, allowing a type of image classification that didn’t even exist a short while ago. Facebook claims that open sourcing the software is critical to its advancement.

Example images scanned by Facebook’s complete image-recognition system

Deep learning techniques are springing up all over the big blue behemoth. The AI powers Facebook’s (controversial) facial-recognition feature, manages curation on its News Feed, and is even utilized within its digital assistant for Messenger.

This isn’t the first time Facebook has open sourced its AI. In fact, the company is something of a trailblazer when it comes to sharing its tech. In December, Facebook submitted the design of its state-of-the-art AI server to the Open Compute Project, a group of tech giants, including Apple and Microsoft, that share the designs of their respective computer infrastructures.

Facebook is already looking ahead to future use cases for the image-recognition tech. The company says it could help build on its existing AI-generated image descriptions for the visually impaired.

“Currently, visually impaired users browsing photos on Facebook only hear the name of the person who shared the photo, followed by the term ‘photo,’ when they come upon an image in their News Feed,” writes Piotr Dollar, research scientist at Facebook AI Research (FAIR), in a blog post. “Instead we aim to offer richer descriptions, such as ‘Photo contains beach, trees, and three smiling people.’”
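As a rough illustration of that last step, the snippet below turns a list of detected items into a sentence in the style Dollar quotes. It is only a sketch: describe_photo is a hypothetical helper, not part of Facebook's software, and it assumes the recognition system has already produced the phrases.

```python
def describe_photo(phrases):
    """Join detected phrases into a spoken-style description.

    With nothing detected, fall back to the bare term "Photo",
    mirroring the behavior the article describes as the status quo.
    """
    if not phrases:
        return "Photo"
    if len(phrases) == 1:
        listing = phrases[0]
    elif len(phrases) == 2:
        listing = f"{phrases[0]} and {phrases[1]}"
    else:
        listing = ", ".join(phrases[:-1]) + f", and {phrases[-1]}"
    return f"Photo contains {listing}."


print(describe_photo(["beach", "trees", "three smiling people"]))
```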

Additionally, Facebook says its next challenge is to apply its image-recognition techniques to video, “where objects are moving, interacting, and changing over time,” including Facebook Live broadcasts. “Real-time classification could help surface relevant and important Live videos on Facebook, while applying more refined techniques to detect scenes, objects, and actions over space and time could one day allow for real-time narration,” Dollar adds.

Saqib Shah
Former Digital Trends Contributor
Saqib Shah is a Twitter addict and film fan with an obsessive interest in pop culture trends. In his spare time he can be…