OpenAI shrinks GPT-5.4 for speed and lower costs

New mini and nano models give developers faster AI at a fraction of the price.

OpenAI is scaling its latest models down to hit a different target: faster responses and much lower costs. The new GPT-5.4 mini and nano are built for developers who care more about responsiveness than squeezing out every last bit of reasoning power.

Both models are available starting today. GPT-5.4 mini runs more than twice as fast as its predecessor while staying close to the full GPT-5.4 on key benchmarks. GPT-5.4 nano takes that further, focusing on simpler tasks like classification and data extraction where efficiency matters most.

This approach fits apps where speed shapes the experience. Coding assistants, background agents, and real-time vision tools depend on quick feedback, and in those cases a slightly smaller model often delivers a better overall result.

How much performance you actually lose

The performance gap between models is narrower than you might expect. GPT-5.4 mini scores 54.4 percent on SWE-Bench Pro, compared with 57.7 percent for the full model. On OSWorld-Verified, the mini reaches 72.1 percent while the larger version hits 75 percent. That puts the gap at roughly three percentage points on both benchmarks.

Costs drop far more dramatically. GPT-5.4 mini is priced at $0.75 per million input tokens and $4.50 per million output tokens, while nano comes in at $0.20 and $1.25. Both models support text and image inputs, tool use, function calling, and a 400,000 token context window, so the lower price doesn’t strip away core capabilities.
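
To put those per-token prices in context, here is a rough sketch of what a workload might cost on each model. The prices come from the figures above. The model labels used as dictionary keys and the workload size (10 million input tokens, 2 million output tokens) are made up for illustration and are not confirmed API identifiers or real usage data.

```python
# Prices in USD per million tokens, as quoted in the article.
# The dictionary keys are illustrative labels, not confirmed API model IDs.
PRICES = {
    "gpt-5.4-mini": {"input": 0.75, "output": 4.50},
    "gpt-5.4-nano": {"input": 0.20, "output": 1.25},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for a given token volume."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical monthly workload: 10M input tokens, 2M output tokens.
mini_cost = estimate_cost("gpt-5.4-mini", 10_000_000, 2_000_000)
nano_cost = estimate_cost("gpt-5.4-nano", 10_000_000, 2_000_000)

print(f"mini: ${mini_cost:.2f}")
print(f"nano: ${nano_cost:.2f}")
```

For this assumed workload, the estimate works out to about $16.50 on mini and $4.50 on nano. Real bills depend on actual token counts and on any pricing details not covered here.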

In Codex, the mini model uses just 30 percent of the GPT-5.4 quota. That lets developers shift routine coding work to a cheaper tier while saving the full model for harder reasoning.
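
As a back-of-the-envelope illustration of that quota figure, the sketch below assumes a task run on mini counts as 0.30 of the same task run on the full model, and that quota use scales linearly. Both are simplifying assumptions, and the task counts are hypothetical.

```python
# From the article: in Codex, mini uses 30 percent of the GPT-5.4 quota.
MINI_QUOTA_FACTOR = 0.30


def quota_used(full_tasks: int, mini_tasks: int) -> float:
    """Quota consumed, measured in 'full-model task' units.

    Assumes every task costs one unit on the full model and
    MINI_QUOTA_FACTOR units on mini (a simplification).
    """
    return full_tasks * 1.0 + mini_tasks * MINI_QUOTA_FACTOR


# Hypothetical week of 100 coding tasks.
all_full = quota_used(full_tasks=100, mini_tasks=0)
split = quota_used(full_tasks=20, mini_tasks=80)

print(f"everything on the full model: {all_full:.1f} units")
print(f"80 routine tasks moved to mini: {split:.1f} units")
```

Under these assumptions, moving 80 of 100 tasks to mini cuts quota use from 100 units to 44.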

When smaller models do the heavy lifting

OpenAI is also pushing a multi-model workflow. Instead of relying on one system, developers can split work across tiers, pairing a larger model that plans with smaller ones that handle execution.

That setup reflects how many real apps already behave. One model can review a codebase or decide on changes, while another processes supporting data or repetitive steps. The smaller model handles the predictable work, while the larger one focuses on judgment and coordination.
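
The planner-and-executor split described above can be sketched in a few lines. This is a minimal illustration, not OpenAI's implementation: `run_tiered`, `fake_planner`, and `fake_executor` are hypothetical names, and the two stand-in functions take the place of real calls to a larger and a smaller model.

```python
from typing import Callable, List

# A "model" here is any function that takes a prompt and returns text.
ModelFn = Callable[[str], str]


def run_tiered(task: str, planner: ModelFn, executor: ModelFn) -> List[str]:
    """Ask the larger model for a plan, then give each step to the smaller one."""
    plan = planner(task)
    # Treat each non-empty line of the plan as one step.
    steps = [line.strip() for line in plan.splitlines() if line.strip()]
    return [executor(step) for step in steps]


# Stand-ins for real model calls (hypothetical; no network access needed).
def fake_planner(task: str) -> str:
    return "read config\nrename variable\nrun tests"


def fake_executor(step: str) -> str:
    return f"done: {step}"


results = run_tiered("refactor module", fake_planner, fake_executor)
for line in results:
    print(line)
```

In a real app, the planner would wrap a call to the larger model and the executor a call to mini or nano, so the expensive model runs once per task while the cheaper one handles each step.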

Early feedback suggests this mix is effective. Hebbia CTO Aabhas Sharma reported that GPT-5.4 mini matched or outperformed competing models on several tasks at a lower cost, and in some cases even delivered stronger end-to-end results than the full GPT-5.4.

What to use and when

GPT-5.4 mini is now available across the API, Codex, and ChatGPT. Free and Go users can access it through the Thinking option, while other users may see it as a fallback when they hit limits on GPT-5.4 Thinking.

The nano model is currently limited to the API, aimed at teams running high-volume workloads where cost control is critical. Both models are live today with full documentation available.

For developers building real-time AI features, the shift is clear. Smaller models are now capable enough to handle a larger share of everyday work, which makes choosing the right balance of speed, cost, and capability an increasingly practical decision.

Paulo Vargas