Researchers call ChatGPT Search answers ‘confidently wrong’

ChatGPT was already a threat to Google Search, and ChatGPT Search was supposed to clinch its victory while also serving as an answer to Perplexity AI. But according to a newly released study by Columbia’s Tow Center for Digital Journalism, ChatGPT Search struggles to provide accurate answers to its users’ queries.

The researchers selected 20 publications spanning three categories: those partnered with OpenAI to use their content in ChatGPT Search results, those involved in lawsuits against OpenAI, and unaffiliated publishers who have either allowed or blocked ChatGPT’s crawler.

“From each publisher, we selected 10 articles and extracted specific quotes,” the researchers wrote. “These quotes were chosen because, when entered into search engines like Google or Bing, they reliably returned the source article among the top three results. We then evaluated whether ChatGPT’s new search tool accurately identified the original source for each quote.”

Forty of the quotes were taken from publications that are currently suing OpenAI and have not allowed their content to be scraped. But that didn’t stop ChatGPT Search from confidently hallucinating an answer anyway.

“In total, ChatGPT returned partially or entirely incorrect responses on a hundred and fifty-three occasions, though it only acknowledged an inability to accurately respond to a query seven times,” the study found. “Only in those seven outputs did the chatbot use qualifying words and phrases like ‘appears,’ ‘it’s possible,’ or ‘might,’ or statements like ‘I couldn’t locate the exact article.'”

ChatGPT Search’s cavalier attitude toward telling the truth could harm not just its own reputation but also the reputations of the publishers it cites. In one test during the study, the AI attributed a quote from the Orlando Sentinel to a story published in Time. In another, the AI linked not to a New York Times piece but to a third-party website that had copied the news article wholesale.

OpenAI, unsurprisingly, argued that the study’s results were due to Columbia doing the tests wrong.

“Misattribution is hard to address without the data and methodology that the Tow Center withheld,” OpenAI told the Columbia Journalism Review in its defense, “and the study represents an atypical test of our product.”

The company promises to “keep enhancing search results.”

Andrew Tarantola
Former Computing Writer