Why do AI chatbots sometimes show false or misleading information?

Google’s new search feature, AI Overviews, is facing mounting backlash after users pointed out factually incorrect and misleading answers to questions.

Launched two weeks ago, AI Overviews shows a summary of answers to common questions on Google Search, drawn from various sources around the web, at the top of the results page.

The goal of the new feature is to help users answer “more complex questions,” Google said in a blog post.

Instead, it has produced false answers, such as telling users to put glue on pizza to stop the cheese from sliding off, to eat rocks for their health, or that former US President Barack Obama is Muslim, a conspiracy theory that has long been debunked.

The AI Overviews responses are the latest in a string of examples of chatbot models answering questions incorrectly.

A study by Vectara, a generative AI startup, found that AI chatbots made up information between three and 27 percent of the time.

What are AI hallucinations?

Large language models (LLMs), which power chatbots like OpenAI’s ChatGPT and Google’s Gemini, learn to predict a response based on the patterns they observe.

The model calculates the most likely next word to answer your question based on what is in its dataset, said Hanan Ouazan, partner and generative AI lead at Artefact.

“That’s exactly how we work as people: we think before we talk,” he told Euronews.
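
As a rough illustration of the next-word prediction Ouazan describes, the short Python sketch below scores a handful of made-up candidate words and picks the most likely one; the vocabulary and the scores are invented for the example and do not come from any real model.

```python
import numpy as np

# Hypothetical scores ("logits") a language model might assign to candidate
# next words after a prompt such as "The capital of France is" -- the words
# and numbers are invented purely for illustration.
vocab = ["Paris", "London", "pizza", "rocks"]
logits = np.array([9.1, 5.3, 1.2, 0.4])

# Softmax turns the raw scores into a probability distribution over the words.
probs = np.exp(logits - logits.max())
probs /= probs.sum()

# The chatbot returns the most probable word: a statistical guess learned from
# patterns in its training data, not a lookup of verified facts.
for word, p in zip(vocab, probs):
    print(f"{word}: {p:.3f}")
print("Predicted next word:", vocab[int(np.argmax(probs))])
```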

But sometimes the model’s training data can be incomplete or distorted, leading to incorrect answers or “hallucinations” by the chatbot.

For Alexander Sukharevsky, a senior partner at QuantumBlack at McKinsey, it is more accurate to call AI a “hybrid technology” because chatbot responses are “mathematically calculated” based on the data the models observe.

According to Google, there is no one reason why hallucinations occur: it could be due to insufficient training data used by the model, incorrect assumptions, or hidden biases in the information the chatbot uses.

Google has identified several types of AI hallucinations, such as incorrect predictions of events that never actually happen, false positives that flag non-existent threats, and false negatives that fail to detect something real, such as a cancerous tumor.

But Google acknowledges that hallucinations can have significant consequences, such as a healthcare AI model incorrectly identifying a benign skin lesion as malignant, leading to “unnecessary medical interventions.”

Not all hallucinations are bad, according to Igor Sevo, head of AI at HTEC Group, a global product development company. It just depends on what the AI is used for.

“In creative situations, hallucinating is good,” said Sevo, noting that AI models can write new passages of text or emails in a particular voice or style. “The question now is how do we get the models to understand creative versus truthful,” he said.

It’s all about the data

Ouazan said the accuracy of a chatbot comes down to the quality of the data set it is fed.

“If a [data] source is not 100 percent… [the chatbot] could say something that is wrong,” he said. “This is the main reason we hallucinate.”

For now, Ouazan said, AI companies use a lot of web and open-source data to train their models.

OpenAI, in particular, is making deals with media organizations like Axel Springer and News Corp and publications like Le Monde to license their content so that it can train its models on more reliable data.

For Ouazan, it is not that AI needs more data to formulate accurate answers, but that models need high-quality source data.

Sukharevsky said he’s not surprised AI chatbots make mistakes; they have to, so that the people running them can refine the technology and data sets as they go.

“I think ultimately it is a journey,” Sukharevsky said. “Companies also don’t have good customer service from day one.”

A Google spokesperson told Euronews Next that AI Overviews received many “unusual questions” that were either spoofed or could not be accurately reproduced, leading to false or hallucinatory answers.

The spokesperson said the company conducted “extensive testing” before launching AI Overviews and is taking “quick action” to improve its systems.

How can AI companies stop hallucinations?

There are a few techniques Google recommends to rein in the problem, such as regularization, which penalizes the model for making extreme predictions.

The way to do this is to limit the number of possible outcomes the AI model can predict, Google continued. Trainers can also give the model feedback, telling it what they liked and didn’t like about its answers, so the chatbot learns what users are looking for.
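
As a loose sketch of that regularization idea, not Google’s actual implementation, the Python snippet below adds an L2 penalty on a toy model’s weights to an ordinary cross-entropy loss; the shapes, values, and penalty strength are all invented for illustration.

```python
import numpy as np

# A toy model's parameters and one over-confident prediction -- everything
# here is made up for the example.
rng = np.random.default_rng(0)
weights = rng.normal(size=(4, 3))       # the toy model's parameters
probs = np.array([0.01, 0.01, 0.98])    # the model's predicted probabilities
target = np.array([0.0, 1.0, 0.0])      # the correct answer (one-hot)

cross_entropy = -np.sum(target * np.log(probs + 1e-9))  # base training loss
l2_penalty = 0.1 * np.sum(weights ** 2)                 # cost for large weights

# The penalty grows with the size of the weights, so training that minimizes
# the total loss is nudged away from extreme, over-confident predictions.
total_loss = cross_entropy + l2_penalty
print(f"cross-entropy: {cross_entropy:.3f}  L2 penalty: {l2_penalty:.3f}  total: {total_loss:.3f}")
```

The feedback step Google mentions works similarly in spirit: human ratings become an extra signal that training tries to satisfy alongside the original loss.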

AI also needs to be trained with information that is “relevant” to what it will do, such as using a dataset of medical images for an AI that will help diagnose patients.

Companies with AI language models could capture the most common questions and then bring together a team of individuals with different skills to figure out how to refine their answers, Sukharevsky said.

For example, Sukharevsky said that English language experts could be well suited to fine-tune the AI, depending on what the most popular questions are.

Large companies with significant computing power could also try building their own evolutionary algorithms to improve the reliability of their models, according to Sevo.

This is where AI models would hallucinate, or fabricate, training data for other models, with the truthful information already verified through mathematical equations, Sevo continued.

If thousands of models compete to produce the most truthful output, the resulting models will be less prone to hallucination, he said.

“I think it’s going to be fixed, because if you don’t make them [AI chatbots] more reliable, no one is going to use them,” Sevo said.

“It is in everyone’s interest that these things are used.”

Smaller companies can attempt to manually refine what data their models consider reliable or truthful based on their own standards, Sevo said, but that solution is more labor-intensive and expensive.

Users should also be aware that hallucinations can occur, AI experts say.

“I would educate myself on what [AI chatbots] are and what they are not, so that I have a basic understanding of their limitations as a user,” said Sukharevsky.

“If I see things not working, I would evolve the tool.”
