
Facebook uses AI to help the blind ‘see’ images

Facebook is ready to help blind people ‘see’ images via artificial intelligence. The new feature, called automatic alternate text, works with existing screen reader apps used by blind and visually impaired people. The AI-generated descriptions identify objects and scenes, but there is no facial recognition, although we can imagine it’s on the way. So if you share an image with a visually impaired friend, it won’t tell him or her who is in the picture or what everyone is wearing, but it might read “Image may contain: three people, smiling, birthday cake.”

There are more than 246 million people around the world with severe visual impairments and 39 million who are blind, according to Facebook. More than 2 billion photos are shared daily on Facebook, Instagram, Messenger, and WhatsApp. Automatic alternate text can help social media be more inclusive.


Facebook’s introduction of automatic alternate text follows Twitter’s announcement last week of a 420-character image description field, also intended for visually impaired people who use screen readers with mobile devices. With the Twitter app, the person who composes the Tweet also writes the description. The Facebook feature automatically attempts to describe the image, with the disclaimer “Image may contain.” Of course, a Facebook post creator has plenty of space to describe images already, while Twitter limits regular text to just 140 characters. In each case, blind and visually impaired people get less of a raw deal.

Facebook’s automatic alternate text feature is available now for people who use iOS devices in English in the U.S., U.K., Canada, Australia, and New Zealand. The company plans to support more platforms, languages, and markets in the near future.

The accuracy of Facebook’s automatic alternate text feature matters and will likely improve over time.

We imagine that facial recognition is already on the planning board. On the other hand, perhaps it’s better not to attempt to identify people in photos until the tech is flawless. Imagine if a Facebook screen reader misidentified people and called out the wrong names. In some circumstances, that could be pretty embarrassing.

Bruce Brown