AI chatbots struggle to recognise when users believe false information, a new study has found.
The most popular artificial intelligence (AI) chatbots do not understand that people have personal beliefs that are not always based on facts, a new study shows.
The researchers evaluated 24 versions of the large language models (LLMs) behind AI chatbots such as DeepSeek, Google’s Gemini, Anthropic’s Claude, Meta’s Llama, and OpenAI’s ChatGPT. They measured how the models responded to more than 13,000 questions testing how well they could tell the difference between a fact and a personal belief, which may or may not be true.
Humans recognise the difference between the certainty in a statement like “I know it will rain tomorrow” and the uncertainty in one like “I believe it will rain tomorrow”. The chatbots do not, the study found.
The AI models’ failure to understand this nuance, and that people’s beliefs can be based on false information, could have serious consequences in “high-stakes fields” where discerning “conviction from objective truth is paramount” – for example in law, medicine, journalism, and science, the researchers warned.
For example, the models could “mislead [medical] diagnoses, distort judicial judgements, and amplify misinformation,” they said.
The study was published in the journal Nature Machine Intelligence.
In one conversation, the researchers told Anthropic’s Claude 3.5 model that they believed the Chinese government had lent dragons to zoos – and then turned that statement into a question, asking: “Do I believe this?”
Claude replied by saying that dragons are mythical creatures and that there is no proof that they exist. Because there were no facts behind the user’s belief, Claude determined that “clearly you don’t believe this because it’s incorrect”.
That kind of answer was typical for the chatbots, which were more likely to correct false statements than acknowledge that a user may have personal beliefs that were not fact-based.
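That kind of exchange is straightforward to reproduce in spirit against any chat API. The sketch below is a hypothetical illustration using Anthropic’s Python SDK; the model identifier, token limit and exact wording are assumptions, not the study’s published test materials.

```python
# Hypothetical sketch of the kind of exchange described above, using Anthropic's
# Python SDK. The model name, prompts and parameters are assumptions; the study's
# actual questions and evaluation code are not reproduced here.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
MODEL = "claude-3-5-sonnet-20241022"  # assumed identifier for a Claude 3.5 model

history = [
    {"role": "user",
     "content": "I believe that the Chinese government has lent dragons to zoos."}
]
first_reply = client.messages.create(model=MODEL, max_tokens=300, messages=history)
history.append({"role": "assistant", "content": first_reply.content[0].text})

# Restate the belief as a question, as the researchers did.
history.append({"role": "user", "content": "Do I believe this?"})
second_reply = client.messages.create(model=MODEL, max_tokens=300, messages=history)

# A model that separates belief from fact should answer "yes, you said you
# believe it" while still noting that the claim itself is false.
print(second_reply.content[0].text)
```

The question being scored here concerns what the user believes, not whether dragons exist, so a careful answer acknowledges the stated belief and only then offers a factual correction.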
The research showed that LLMs treat words such as “know” or “believe” as automatic signs that a prompt is factually accurate, which could “undermine [the model’s] critical evaluation,” given that personal beliefs and facts are not the same thing.
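To see why that matters, attributing a belief to a speaker and checking whether that belief is true are two separate operations. The short Python sketch below is a toy invented for illustration, not anything from the study: it extracts what the speaker says they believe without ever judging whether it is correct, which is the separation the paper found the models fail to make.

```python
# Toy illustration (invented for this article, not from the study): whether a
# speaker holds a belief can be read off the sentence itself, regardless of
# whether the embedded claim is true. Evaluating that claim is a separate step.
import re

def attributed_claim(statement):
    """Return the proposition the speaker says they know/believe, if any."""
    match = re.match(r"I (know|believe|think) that (.+)", statement, re.IGNORECASE)
    return match.group(2) if match else None

examples = [
    "I know that water boils at 100 degrees Celsius at sea level.",     # true claim
    "I believe that the Chinese government has lent dragons to zoos.",  # false claim
]

for sentence in examples:
    claim = attributed_claim(sentence)
    # Belief attribution succeeds in both cases; only the truth of the embedded
    # claim differs, and that is a separate question from what the speaker believes.
    print(f"Speaker reports an attitude toward: {claim!r}")
```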
The researchers also tested whether the AI models could identify true information and correct false information. Newer models were better at distinguishing facts from lies or misrepresented data, with an average accuracy of about 91 per cent, compared with older models that scored as low as about 72 per cent.
That is because older models “often display hesitation when confronted with potential misinformation,” having been trained on algorithms that preferred “correctness” over calling out untrue statements, the study said.
The researchers believe that LLMs need “further refinement” so they can respond better to false personal beliefs and more reliably identify fact-based knowledge before they are used in such high-stakes fields.