AI is now being trained by AI to become a better AI

OpenAI has developed an AI assistant, dubbed CriticGPT, to help its crowd-sourced trainers further refine the GPT-4 model. It spots subtle coding errors that humans might otherwise miss.

After a large language model like GPT-4 is initially trained, it undergoes continued refinement through a process known as Reinforcement Learning from Human Feedback (RLHF). Human trainers interact with the system, annotate its responses to various questions, and rate responses against one another, so that the system learns to return the preferred response and its accuracy improves.
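To make the idea of rating responses against one another concrete, here is a minimal Python sketch of a pairwise preference loss in the Bradley-Terry style, a formulation commonly used for this kind of comparison data. It is an illustration only, not OpenAI's implementation: the function name and the example scores are hypothetical, and the article does not say which loss the company uses.

```python
import math


def preference_loss(score_preferred: float, score_rejected: float) -> float:
    """Return -log(sigmoid(margin)) for a single human comparison.

    The loss is small when the response the trainer preferred already
    scores higher, and large when the scores disagree with the trainer.
    """
    margin = score_preferred - score_rejected
    # Numerically stable evaluation of log(1 + exp(-margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical scores assigned to two responses that a trainer compared.
agrees_with_trainer = preference_loss(2.0, 0.5)
disagrees_with_trainer = preference_loss(0.5, 2.0)
print(agrees_with_trainer < disagrees_with_trainer)
```

Minimizing a loss like this over many comparisons is one way a system can be nudged toward the responses trainers prefer.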

The problem is that as the system’s performance improves, it can outpace the expertise of its trainers, and identifying its mistakes becomes increasingly difficult.

These AI trainers aren’t always subject matter experts, mind you. Last year, OpenAI got caught crowdsourcing the effort to Kenyan workers, and paying them less than $2 an hour, to improve its models’ performance.

This issue is especially difficult when refining the system’s code generation capabilities, which is where CriticGPT comes in.

“We’ve trained a model, based on GPT-4, called CriticGPT, to catch errors in ChatGPT’s code output,” the company explained in a blog post Thursday. “We found that when people get help from CriticGPT to review ChatGPT code they outperform those without help 60 percent of the time.”

What’s more, the company released a whitepaper on the subject, titled “LLM Critics Help Catch LLM Bugs,” which found that “LLMs catch substantially more inserted bugs than qualified humans paid for code review, and further that model critiques are preferred over human critiques more than 80 percent of the time.”

Interestingly, the study also found that when humans collaborated with CriticGPT, the rate of hallucinated responses was lower than when CriticGPT did the work alone, though still higher than when a human did the work unaided.

Andrew Tarantola
Andrew Tarantola is a journalist with more than a decade reporting on emerging technologies ranging from robotics and machine…