ChatGPT’s latest model may be a regression in performance

According to a new report from Artificial Analysis, OpenAI’s flagship large language model for ChatGPT, GPT-4o, has significantly regressed in recent weeks, putting the state-of-the-art model’s performance on par with the far smaller, and notably less capable, GPT-4o mini model.

This analysis comes less than 24 hours after the company announced an upgrade for the GPT-4o model. “The model’s creative writing ability has leveled up–more natural, engaging, and tailored writing to improve relevance & readability,” OpenAI wrote on X. “It’s also better at working with uploaded files, providing deeper insights & more thorough responses.” Whether those claims hold up is now in doubt.

“We have completed running our independent evals on OpenAI’s GPT-4o release yesterday and are consistently measuring materially lower eval scores than the August release of GPT-4o,” Artificial Analysis announced in an X post on Thursday, noting that the model’s Artificial Analysis Quality Index decreased from 77 to 71 (and is now equal to that of GPT-4o mini).

What’s more, GPT-4o’s score on the GPQA Diamond benchmark fell from 51% to 39%, while its score on the MATH benchmark fell from 78% to 69%.

At the same time, the researchers found that the model’s response speed had more than doubled, from around 80 output tokens per second to roughly 180 tokens/s. “We have generally observed significantly faster speeds on launch day for OpenAI models (likely due to OpenAI provisioning capacity ahead of adoption), but previously have not seen a 2x speed difference,” the researchers wrote.
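To put those figures on a common scale, the short Python sketch below converts each reported before-and-after pair into a relative change. The numbers are the ones quoted above; the dictionary layout and the `relative_change` helper are illustrative assumptions, and the speed figures are approximate ("around 80" and "roughly 180" tokens per second).

```python
# Figures as reported by Artificial Analysis: (August release, November release).
# The speed values are approximate, as stated in the report.
reported = {
    "Quality Index": (77, 71),
    "GPQA Diamond (%)": (51, 39),
    "MATH (%)": (78, 69),
    "Output speed (tokens/s)": (80, 180),
}


def relative_change(before: float, after: float) -> float:
    """Return the change from `before` to `after` as a percentage of `before`."""
    return (after - before) / before * 100


# Relative change for every metric, keyed by metric name.
changes = {name: relative_change(b, a) for name, (b, a) in reported.items()}

for name, pct in changes.items():
    before, after = reported[name]
    print(f"{name}: {before} -> {after} ({pct:+.1f}%)")
```

On these numbers, the 12-point drop on GPQA Diamond is a relative decline of about 23.5%, while the speed increase works out to about 125%, which is consistent with the "more than doubled" description.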

Wait – is the new GPT-4o a smaller and less intelligent model?

— Artificial Analysis (@ArtificialAnlys) November 21, 2024

“Based on this data, we conclude that it is likely that OpenAI’s Nov 20th GPT-4o model is a smaller model than the August release,” they continued. “Given that OpenAI has not cut prices for the Nov 20th version, we recommend that developers do not shift workloads away from the August version without careful testing.”
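For developers, the "careful testing" Artificial Analysis recommends could look something like the Python sketch below: run the same evaluation cases against both versions and only switch if accuracy does not drop by more than a chosen margin. This is a minimal illustration, not Artificial Analysis's methodology. The function names, the exact-match scoring, the 2% threshold, and the stand-in models are all hypothetical; in practice each model function would wrap an API call pinned to a specific model version.

```python
from typing import Callable, Sequence

# Hypothetical type: any function that takes a prompt and returns the model's answer.
ModelFn = Callable[[str], str]


def accuracy(model: ModelFn, cases: Sequence[tuple[str, str]]) -> float:
    """Fraction of cases where the model's answer exactly matches the expected one."""
    if not cases:
        raise ValueError("need at least one eval case")
    correct = sum(1 for prompt, expected in cases if model(prompt).strip() == expected)
    return correct / len(cases)


def safe_to_switch(current: ModelFn, candidate: ModelFn,
                   cases: Sequence[tuple[str, str]],
                   max_drop: float = 0.02) -> bool:
    """True if the candidate's accuracy is within `max_drop` of the current model's."""
    return accuracy(candidate, cases) >= accuracy(current, cases) - max_drop


# Stand-in models for illustration only (assumption): canned answers, no API calls.
def august_stub(prompt: str) -> str:
    return {"2+2": "4", "3*3": "9", "10/2": "5"}.get(prompt, "")


def november_stub(prompt: str) -> str:
    return {"2+2": "4", "3*3": "6", "10/2": "5"}.get(prompt, "")


cases = [("2+2", "4"), ("3*3", "9"), ("10/2", "5")]
print(safe_to_switch(august_stub, november_stub, cases))
```

Here the second stand-in answers one of the three cases incorrectly, so the check reports that switching is not safe. A real evaluation would need far more cases and a scoring method suited to the workload.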

GPT-4o was first released in May 2024 to surpass the existing GPT-3.5 and GPT-4 models. GPT-4o offers state-of-the-art benchmark results in voice, multilingual, and vision tasks, according to OpenAI, making it ideal for advanced applications like real-time translation and conversational AI.

Andrew Tarantola
Former Digital Trends Contributor