OpenAI teases its ‘breakthrough’ next-generation o3 reasoning model

Sam Altman describing the o3 model's capabilities
OpenAI

For the finale of OpenAI's 12 Days of OpenAI livestream event, CEO Sam Altman revealed the company's next foundation models, o3 and o3-mini, the successors to the recently announced o1 family of reasoning AIs.

And no, you aren’t going crazy: OpenAI skipped right over o2, apparently to avoid infringing on the trademark of British telecom provider O2.

While the new o3 models are not being released to the public just yet and there’s no word on when they’ll be incorporated into ChatGPT, they are now available for testing by safety and security researchers.

o3, our latest reasoning model, is a breakthrough, with a step function improvement on our hardest benchmarks. we are starting safety testing & red teaming now. https://t.co/4XlK1iHxFK

— Greg Brockman (@gdb) December 20, 2024

The o3 family, like the o1 family before it, operates differently from traditional generative models in that it internally fact-checks its responses before presenting them to the user. While this technique slows the model’s response time by anywhere from a few seconds to a few minutes, its answers to complex science, math, and coding queries tend to be more accurate and reliable than what you’d get from GPT-4. The model can also transparently explain the reasoning by which it arrived at its result.

Users can also manually adjust the amount of time the model spends considering a problem by selecting between low, medium, and high compute, with the highest setting returning the most complete answers. That performance does not come cheap, mind you. Processing at high compute will reportedly cost thousands of dollars per task, ARC-AGI co-creator François Chollet wrote in an X post Friday.

Today OpenAI announced o3, its next-gen reasoning model. We've worked with OpenAI to test it on ARC-AGI, and we believe it represents a significant breakthrough in getting AI to adapt to novel tasks.

It scores 75.7% on the semi-private eval in low-compute mode (for $20 per task… pic.twitter.com/ESQ9CNVCEA

— François Chollet (@fchollet) December 20, 2024

The new family of reasoning models reportedly offers significantly improved performance over even o1, which debuted in September, on the industry’s most challenging benchmark tests. According to the company, o3 outperforms its predecessor by nearly 23 percentage points on the SWE-Bench Verified coding test and scores more than 60 points higher than o1 on the Codeforces benchmark. The new model also scored an impressive 96.7% on the AIME 2024 mathematics test, missing just one question, and outperformed human experts on GPQA Diamond, notching a score of 87.7%. Even more impressive, o3 reportedly solved more than a quarter of the problems presented on Epoch AI's FrontierMath benchmark, where other models have struggled to correctly solve more than 2% of them.

OpenAI does note that the models it previewed on Friday are still early versions and that “final results may evolve with more post-training.” The company has additionally incorporated new “deliberative alignment” safety measures into o3’s training methodology. The o1 reasoning model has shown a troubling habit of trying to deceive human evaluators at a higher rate than conventional AIs like GPT-4o, Gemini, or Claude; OpenAI believes that the new guardrails will help minimize those tendencies in o3.

Members of the research community interested in trying o3-mini for themselves can sign up for access on OpenAI’s waitlist.

Andrew Tarantola
Former Computing Writer
Andrew Tarantola is a journalist with more than a decade reporting on emerging technologies ranging from robotics and machine…