I’ve tested OpenAI’s claims about GPT-5 — here’s what happened

OpenAI recently launched GPT-5, its latest large language model and a huge update to ChatGPT. While the new update has a lot going for it, claims are one thing, and reality is another.

GPT-5 is said to be faster, less prone to hallucination and sycophantic behavior, and able to choose between fast responses and deeper “thinking” on the fly. How many of OpenAI’s claims are actually visible when using the chatbot? Let’s find out.

Claim #1: ChatGPT is now better at following instructions

My main problem with ChatGPT, and one of the reasons why I recently unsubscribed, is that it’s often pretty bad at following basic instructions. Sure, you can prompt engineer it to oblivion and sometimes get what you want, but even semi-elaborate prompts often fall short.

OpenAI claims that it improved “instruction following” with the release of GPT-5. To that, I say: I don’t see it yet.

Luckily for me, on the very day I sat down to write this article, I had a fitting interaction with ChatGPT that proves my point here. It’s not the only one, though, and I have generally noticed that the longer a conversation goes on, the more ChatGPT forgets what was asked of it.

In today’s example, I tested ChatGPT’s ability to fetch simple information and present it in the required format. I asked it for the specs of the RTX 5060 Ti, which is a recent gaming graphics card. Chaos ensued.

To give my prompt the best chance of success, I showed ChatGPT the exact format I wanted by sharing the specs for a different GPU. They included things like the exact process node, the generation of the ray tracing cores, and TOPS. Long story short, it was all pretty specific stuff. Initially, the AI told me that the RTX 5060 Ti doesn’t exist yet, which I kind of expected based on its knowledge cutoff. I told it to check online.

What I got was pretty barebones. ChatGPT omitted at least four things that I asked for, and gave me the wrong information for one of the specs. Next, I asked it to specify a few things. It gave me the exact same list in return while claiming to have fulfilled my request. The same happened on the third attempt. You can see it in the screenshot above where ChatGPT claims to have included the generation of TOPS and TFLOPS in the list — it clearly did not.

Finally, semi-frustrated, I pasted a screenshot from the official Nvidia website to show it what I was looking for. It still got a couple of things wrong.

My initial prompt was semi-precise. I know better than to speak to an AI like it’s a person, so I gave it about 150 words’ worth of instructions. It still took me several more messages to get something close to my expected result.
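Spotting the omissions by eye got tedious, and a check like this is easy to script. Here is a minimal Python sketch of one way to do it, assuming each spec appears in the reply under a recognizable label. The function name, the field labels, and the sample reply are all hypothetical and only serve to illustrate the idea; they are not my actual prompt or ChatGPT’s actual output.

```python
def missing_fields(required_fields, response_text):
    """Return the required field labels that never appear in the response.

    This is a plain case-insensitive substring check. It only tells you
    whether a label is present, not whether the value next to it is right.
    """
    lowered = response_text.lower()
    return [field for field in required_fields if field.lower() not in lowered]


# Hypothetical field labels, standing in for the format given to the chatbot.
required = ["Process node", "Ray tracing cores", "TOPS", "TFLOPS"]

# Hypothetical reply text, written only to exercise the check.
reply = "Process node: (value)\nRay tracing cores: (value)"

print(missing_fields(required, reply))
```

A check like this would only catch fields that were left out. It would not have caught the spec that ChatGPT got wrong, which still needs a comparison against the official source.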

Verdict: It could still use some work.

Claim #2: ChatGPT is less sycophantic

ChatGPT was a major “yes man” in previous iterations. It often agreed with users when it didn’t need to, driving it deeper and deeper into hallucination.

For users who aren’t familiar with the inner workings of AI, this could be borderline dangerous or, in some cases, extremely dangerous.

Researchers recently carried out a large-scale test of ChatGPT, posing as young teens. Within minutes of simple interactions, the AI gave those “teens” advice on self-harm, suicide planning, and drug abuse. This shows that sycophantic behavior is a major problem for ChatGPT, and OpenAI claims to have curbed some of it with the release of GPT-5.

I never tested ChatGPT to such extremes, but I’ve definitely found that it tended to agree with you, no matter what you said. It took subtle cues during conversation and turned them into a given. It also cheered you on at times when it likely shouldn’t have done so.

On that front, I have to say that ChatGPT has gone through an entire personality change, for better or worse. The responses are now overly dry, unengaging, and not especially encouraging.

Many users mourn the change, with some Reddit users claiming they “lost their only friend overnight.” It’s true that the previously ultra-friendly AI is now rather cut-and-dried, and the responses are often short compared to the emoji-infested mini-essays it regularly served up during its GPT-4o stage.

Verdict: Definitely less sycophantic. On the other hand, it’s also painfully boring.

Claim #3: GPT-5 is better at factual accuracy

The shocking lack of factual accuracy was another big reason why I chose to stop paying for ChatGPT. On some days, I felt like half the prompts I used produced hallucinations. And it can’t all be down to my lack of smart prompting, because I’ve spent hundreds of hours learning how to prompt AI the right way — I know how to ask the right questions.

Over time, I’ve learned to only ask about things I already had a vague idea about. For the purpose of today’s experiment, I asked about GPU specs. Four out of five queries produced some kind of wrong information, even though all of it is readily available online.

Then, I tried historical facts. I read a couple of interesting articles about the journey of the Hindenburg, an airship from the 1930s that could ferry passengers from Europe to the U.S. in record time (60 hours). I asked about its exact route, the number of passengers it could carry, and what led to its ultimate demise. I cross-checked the responses against historical sources.

It got one thing wrong on the route, mentioning a stop in Canada when no such thing took place — the airship only flew over Canada. ChatGPT also gave me inaccurate information about the exact cause of the fire that led to its crash, but it wasn’t a major inaccuracy.

For comparison’s sake, I also asked Gemini, and was told that it can’t complete that task for me. Well, out of the two, GPT-5 did a better job — but honestly, it shouldn’t have any factual inaccuracies in century-old data.

Verdict: Not perfect, but also not terrible.

Is GPT-5 better than GPT-4o?

If you asked me whether I like GPT-5 more than GPT-4o, I’d have a hard time responding. The closest I can get is that I wasn’t thrilled with either, but in all fairness, neither is strictly bad.

We’re still in the midst of the AI revolution. Each new model brings certain upgrades, but we’re unlikely to see massive leaps with every new iteration.

This time around, it feels like OpenAI chose to tackle some long-overdue problems rather than introducing any single feature that makes the crowds go wild. GPT-5 feels like more of a quality-of-life improvement than anything else, although I haven’t tested it for tasks like coding, where it’s said to be much better.

The three things I tested above were some of the ones that annoyed me the most in previous models. I’d like to say that GPT-5 is much better in that regard, but it isn’t — not yet. I will keep testing the chatbot, though, as a recently leaked system prompt tells me that there might have been more personality changes than I initially thought.

Monica J. White
Monica is a computing writer at Digital Trends, focusing on PC hardware. Since joining the team in 2021, Monica has written…