Skip to main content

GPT-4o and Gemini 1.5 Pro just got beat in the AI race

a screenshot of claude 3.5 sonnet, with an 8-bit crab
Anthropic

There’s a new leader, technically, in the race for AI assistant dominance, and it’s Anthropic’s new Claude 3.5 Sonnet. The newly released model outperforms both Gemini 1.5 Pro and ChatGPT-4o across a spectrum of benchmark tests, the company announced on Thursday.

This new iteration of Sonnet is the first in Anthropic’s upcoming line of 3.5 models, and it significantly outperforms the more expansive Opus 3.0 model, and does so at a fraction of the larger model’s energy cost. Compute efficiency is becoming an increasingly important aspect of AI system design, especially as the cost of both powering and cooling AI data centers soars while the infrastructure pushes into the gigawatt range.

Claude 3.5 Sonnet for vision

“Claude 3.5 Sonnet operates at twice the speed of Claude 3 Opus,” the Anthropic team wrote in a blog post. “This performance boost, combined with cost-effective pricing, makes Claude 3.5 Sonnet ideal for complex tasks such as context-sensitive customer support and orchestrating multistep workflows.”

Recommended Videos

The new model has reportedly set benchmark results across three standardized tests: graduate-level reasoning with GPQA, undergraduate-level knowledge with MMLU, and coding proficiency with HumanEval. It beat out Google’s Gemini 1.5 Pro, Meta’s Llama-400b, and OpenAI’s ChatGPT-4o, though not by any huge margin and typically only by a couple percentage points.

A table showing Claude 3.5 Sonnet's performance compared to other leading AI systems.
Anthropic

Sonnet 3.5 is being billed as Anthropic’s “strongest vision model yet. ” It’s capable of performing a number of vision-based tasks — like interpreting charts and graphs or transcribing text from imperfect image sources like screenshots or scanned receipts — more accurately than Opus 3.0. In fact, Sonnet 3.5 beat out Opus 3.0 by anywhere from 6 to 17 points across industry standard vision benchmarks. The new model is also reportedly much more competent at handling humor and can converse in a much more lifelike manner.

Sonnet will also be the first Anthropic AI to offer the Artifacts feature to users. Rather than generate images or code snippets directly into the flow of the conversation, Artifacts will create that content in a dedicated space to the side of the chat. This allows users to create “a dynamic workspace where they can see, edit, and build upon Claude’s creations in real time, seamlessly integrating AI-generated content into their projects and workflows,” the Anthropic team claims. It also announced that Claude will soon support team collaboration wherein a company can store its data, documents and projects in a single, central silo, with Claude acting as an on-demand assistant.

You can try out Claude 3.5 Sonnet today for free on the Claude.ai website and the Claude iOS app (a Claude Pro or Team subscription will garner you significantly higher rate limits). Third-party integration is also available through the Anthropic API, Amazon Bedrock, and Google Cloud’s Vertex AI. Claude Haiku 3.5 and Opus 3.5 are scheduled for release later in the year.

Andrew Tarantola
Andrew Tarantola is a journalist with more than a decade reporting on emerging technologies ranging from robotics and machine…
Qualcomm counters Intel about its performance claims
Qualcomm's CEO presenting Snapdragon X Elite CPUs at Computex 2024.

In the year since Qualcomm first debuted its Snapdragon X Elite, the competition hasn't been silent. Intel released both Meteor Lake and Lunar Lake chips, the latter of which felt like a legitimate response to Qualcomm's advances in battery life and efficiency.

But Qualcomm isn't impressed by Intel's latest offerings.

Read more
Best desktop computer deals: The cheapest PC deals today
dell inspiron desktop deal april 2023 pc lifestyle

Working on a desktop gives a lot of clear advantages, with the main one being that desktop computers tend to be the cheapest way you can get powerful performance compared to something like a laptop, a tablet, or even a mini-PC. In exchange, you tend to need a lot more space to put your desktop, and you lose a lot of portability, but you can absolutely get some really high-end components that you might not find in other places. In fact, some of the best desktop computer can easily handle the best PC games and the most heavy-duty productivity apps.

That's why we've gone out and found our favorite deals that will give you the best bang for your buck so that you don't get too overwhelmed with all the options out there. Once you've found a good one, pair it with discount monitor deals to save some more cash. If you're looking for something better suited to gaming, then you may want to check out these gaming PC deals as well.
Lenovo IdeaCentre 3i Desktop --  $350 $400 12% off

Read more
The best AI chatbots to try: ChatGPT, Gemini, and more
Bing Chat shown on a laptop.

The idea of chatbots has been around since the early days of the internet. But even compared to popular voice assistants like Siri, the generated chatbots of the modern era are far more powerful.

Yes, you can converse with them in natural language. But these AI chatbots can generate text of all kinds, from poetry to code, and the results really are exciting. ChatGPT remains in the spotlight, but as interest continues to grow, more rivals are popping up to challenge it.
OpenAI ChatGPT and ChatGPT Plus

Read more