ChatGPT already listens and speaks. Soon it may see as well

ChatGPT meets a dog
OpenAI

ChatGPT’s Advanced Voice Mode, which allows users to converse with the chatbot in real time, could soon gain the gift of sight, according to code discovered in the platform’s latest beta build. While OpenAI has not yet confirmed a release date for the new feature, code in the ChatGPT v1.2024.317 beta build spotted by Android Authority suggests that the so-called “live camera” could arrive soon.

OpenAI first showed off Advanced Voice Mode’s vision capabilities for ChatGPT in May, when the feature was first unveiled. In a demo posted at the time, the system was able to tell that it was looking at a dog through the phone’s camera feed, identify the dog based on past interactions, recognize the dog’s ball, and connect the dog to the ball (i.e., playing fetch).

Dog meets GPT-4o

The feature was an immediate hit with alpha testers as well. X user Manuel Sainsily used it to great effect, asking spoken questions about his new kitten and getting answers based on the camera’s video feed.

Trying #ChatGPT’s new Advanced Voice Mode that just got released in Alpha. It feels like face-timing a super knowledgeable friend, which in this case was super helpful — reassuring us with our new kitten. It can answer questions in real-time and use the camera as input too! pic.twitter.com/Xx0HCAc4To

— Manuel Sainsily (@ManuVision) July 30, 2024

Advanced Voice Mode was subsequently released in beta to Plus and Enterprise subscribers in September, albeit without its additional visual capabilities. Of course, that didn’t stop users from going wild testing the feature’s vocal limits. Advanced Voice “offers more natural, real-time conversations, allows you to interrupt anytime, and senses and responds to your emotions,” according to the company.

The addition of digital eyes would certainly set Advanced Voice Mode apart from OpenAI’s primary competitors, Google and Meta, both of which have introduced conversational features of their own in recent months.

Gemini Live may be able to speak more than 40 languages, but it cannot see the world around it (at least until Project Astra gets off the ground). Nor can Meta’s Natural Voice Interactions, which debuted at the Connect 2024 event in September, use camera inputs.

OpenAI also announced today that Advanced Voice Mode is now available for paid ChatGPT Plus accounts on desktop. It was previously available only on mobile, but can now be accessed on a laptop or PC as well.

Andrew Tarantola
Former Digital Trends Contributor
Andrew Tarantola is a journalist with more than a decade reporting on emerging technologies ranging from robotics and machine…