AI is now being trained by AI to become a better AI

An OpenAI graphic for ChatGPT-4.
OpenAI

OpenAI has developed an AI assistant, dubbed CriticGPT, to help its crowd-sourced trainers further refine the GPT-4 model. It spots subtle coding errors that humans might otherwise miss.

After a large language model like GPT-4 is initially trained, it undergoes a continual process of refinement known as Reinforcement Learning from Human Feedback (RLHF). Human trainers interact with the system, annotate its responses to various questions, and rate competing responses against one another, so that the model learns to return the preferred response and its answers become more accurate.
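
To make the "rate responses against one another" step concrete, here is a minimal sketch of a pairwise preference loss of the kind commonly used when training a reward model on human comparisons. It assumes a Bradley-Terry formulation. The article does not describe OpenAI's actual objective, and the function name and scores below are purely illustrative.

```python
import math


def preference_loss(score_preferred: float, score_rejected: float) -> float:
    """Negative log-likelihood that the human-preferred response wins.

    Assumes a Bradley-Terry model (an assumption, not OpenAI's confirmed
    method): P(preferred wins) = sigmoid(score_preferred - score_rejected).
    """
    margin = score_preferred - score_rejected
    # -log(sigmoid(margin)) equals softplus(-margin); this form avoids overflow.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


if __name__ == "__main__":
    # Hypothetical reward-model scores for two responses to the same prompt.
    agrees_with_human = preference_loss(2.0, 0.0)     # model ranks the preferred answer higher
    disagrees_with_human = preference_loss(0.0, 2.0)  # model ranks the rejected answer higher
    print(agrees_with_human, disagrees_with_human)
```

The loss is small when the model already scores the human-preferred response higher and large when it disagrees, which is what pushes the model toward the trainers' choices.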

The problem is that as the system’s performance improves, it can outpace its trainers’ expertise, and identifying its mistakes becomes increasingly difficult.

These AI trainers aren’t always subject matter experts, mind you. Last year, OpenAI got caught crowdsourcing the effort to Kenyan workers, and paying them less than $2 an hour, to improve its models’ performance.

A CriticGPT screenshot
OpenAI

Spotting errors is especially difficult when refining the system’s code generation capabilities, which is where CriticGPT comes in.

“We’ve trained a model, based on GPT-4, called CriticGPT, to catch errors in ChatGPT’s code output,” the company explained in a blog post Thursday. “We found that when people get help from CriticGPT to review ChatGPT code they outperform those without help 60 percent of the time.”

What’s more, the company released a whitepaper on the subject, titled “LLM Critics Help Catch LLM Bugs,” which found that “LLMs catch substantially more inserted bugs than qualified humans paid for code review, and further that model critiques are preferred over human critiques more than 80 percent of the time.”

Interestingly, the study also found that when humans collaborated with CriticGPT, hallucinations were less frequent than when CriticGPT did the work alone, though still more frequent than when a human did the work unaided.

Andrew Tarantola
Former Digital Trends Contributor
Andrew Tarantola is a journalist with more than a decade reporting on emerging technologies ranging from robotics and machine…