Reducing hallucinations of AI systems
What are hallucinations? Is there an intuitive reason for them? Can we avoid them?
A lot of people are very worried about AI hallucinations, in particular from large language models (LLMs) such as ChatGPT.
“ChatGPT will continue hallucinating wrong answers for years” — Business Insider
A ‘hallucination’ is where the LLM confidently gives a wrong answer. People often say they ‘make things up’ or they ‘lie’. This is clearly a problem if you need to rely upon the answers from an AI for a task.
But I think that it is entirely natural that these LLM systems hallucinate today. In fact, I think it is amazing that they get answers right so often out of the box. There are also many ways to reduce the amount of hallucination, and we use these extensively in our AI systems. I will cover some of these below.
ChatGPT hallucination example (Paris bridges)
Here is a hallucination from ChatGPT 3.5 that I ran into just today (mid September 2023).
To be fair, I am asking the machine a rather difficult question. But it is a question that has a clear correct answer, and one that a human could work through and get right if they had access to the right information.
In this case, neither answer from ChatGPT seems to be correct. In particular, none of the named bridges are 2–3 km long. Most are 200–300 m long!
Why do hallucinations happen?
There is a technical answer
The technical answer has to do with the transformer architecture: the auto-regressive nature of the generation process, the fact that each output token comes from a single forward pass of the network, and so on.
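For a flavour of what that means, here is a minimal sketch of auto-regressive generation. It is not any particular library's API; `model` and `tokenizer` are placeholders, and the point is the shape of the loop: the model commits to one token at a time, with no ability to go back and revise.

```python
# Sketch of auto-regressive generation. `model` and `tokenizer` are placeholders
# for any transformer LLM; what matters is the loop, not the API.
def generate(model, tokenizer, prompt, max_new_tokens=50):
    tokens = tokenizer.encode(prompt)
    for _ in range(max_new_tokens):
        logits = model.forward(tokens)          # one forward pass of the network
        next_token = int(logits[-1].argmax())   # commit to the most likely next token
        tokens.append(next_token)               # no backtracking, no revision
    return tokenizer.decode(tokens)
```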
But rather than go into that let’s discuss a more intuitive explanation …
A more intuitive answer
A good way to understand hallucinations is to consider how we as humans tackle similar questions.
Let’s start with a very simple question such as
“What is the capital of France?”
This is very easy for us to answer. The answer is “Paris” of course and for most people that will be an instinctive response requiring no thinking at all. The answer is just there and available to you.
Using Daniel Kahneman’s model, we answer this using System 1 thinking. This is often called fast thinking. It is instinctive. We react. We don’t reason.
By contrast let’s take the ChatGPT example from above
“What is the second longest bridge in Paris?”
Now I think that nobody, except perhaps a Paris bridge engineer or tour guide, will know the answer instinctively.
We can answer this but to do so we use System 2 thinking. This is slow thinking. It is about reasoning which means breaking the thinking down into many logical steps and elements. It is neither instinctive nor about reacting.
For me to answer this question, the process in my head goes something like this:
- There are probably lots of bridges in Paris — where do I find a list?
- Can I get a table of the lengths of all these bridges?
- Now can I go through the table of bridge lengths and find out which is second longest?
Key point: LLMs are always in System 1 mode (today)
This is why LLMs hallucinate. They naturally only do System 1 thinking. Everything is always in react mode, even for very complicated questions.
You can think of it as though the LLM is always giving us its ‘instinctive’ answer every time, even for super-complex or impossible questions.
Consider what would happen if we were all forced to give an answer in System 1 mode to the question “What is the second longest bridge in Paris?”
An immediate, instinctive answer would mean that we probably all get it wrong. We would hallucinate. We might name any bridge that came into our head. We might make up a bridge name. We might be unable to react at all.
So it is obvious to me that LLMs are going to get difficult questions wrong because they don’t do that slow thinking out of the box.
I think we should be amazed that LLMs get answers right so often, given they only use a method similar to System 1. These LLMs have been trained on vast amounts of internet data. They are already vastly more knowledgeable about facts than we are. To quote Geoffrey Hinton, one of the godfathers of AI: these latest LLMs have knowledge that is probably 1,000 times larger than any one human's. This in itself is amazing, and it is why LLMs are proving useful.
How can we fix hallucinations?
There are lots of ways to reduce hallucinations, though none yet eliminate them completely. A huge number of academic papers and practical methods cluster around a few notable methodologies, including:
- Give the LLM relevant examples of similar questions and answers so that it can follow a pattern when answering. This is called one-shot or few-shot prompting (sketched below). It works well, but only if we can find good, relevant examples to feed in, which may not always be possible.
- Structured thinking approaches are where we ask the LLM to give a longer answer in which it first breaks the question down into steps and then articulates its working before giving the final answer (sketched below). This is usually still done in one pass, but it makes answering each of the individual steps easier and more accurate. There are many variations of this, including Chain of Thought. It works well in many cases, particularly for maths or logic problems.
- Multi-step approaches take the structured thinking further: we ask the LLM first to break down the problem and identify the steps, then get it to work on each piece in turn, and at the end bring everything back together into one answer (sketched below). This is much more like a slow-thinking process, as it might require 3 to 10 separate LLM calls.
- Knowledge-base approaches are where we look up a ground-truth knowledge base of facts and content and provide that information to the LLM together with the user's question (sketched below). The LLM then does a much better job of answering because it has the relevant facts and content ‘to hand’. This goes under various names, including Retrieval Augmented Generation (RAG), and it works very well, but only if you can provide the right facts and content, which is itself not a trivial task to get right.
- Tools and plugins are where we have databases, APIs, calculators, and other systems that are external to the LLM and provide functionality for specific tasks (sketched below). Examples include calling a specialist calculator API or looking up a database. These external systems provide ground-truth information that the LLM can use in its response.
- Guardrails are where we apply various independent checks to the response from the LLM; essentially they are a ‘second pair of eyes’ (sketched below). There are many variations of guardrails. Some might improve the response, but many others are used to stop erroneous answers being returned to the user at all, and therefore improve safety that way.
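To make these concrete, here are some minimal Python sketches. They all use a `call_llm` placeholder standing in for whichever chat-completion API you use, and the prompts, example facts and helper names are illustrative only, not a production implementation. First, few-shot prompting:

```python
def call_llm(prompt: str) -> str:
    # Placeholder: swap in your LLM provider's chat-completion call.
    raise NotImplementedError

# Few-shot prompting: prepend worked question/answer examples (illustrative ones
# here) so the model follows the same pattern for the new question.
FEW_SHOT_EXAMPLES = """\
Q: What is the longest river in France?
A: The Loire, at roughly 1,000 km.

Q: What is the tallest bridge in France?
A: The Millau Viaduct, at roughly 340 m to the top of its masts.
"""

def answer_with_few_shot(question: str) -> str:
    prompt = FEW_SHOT_EXAMPLES + f"\nQ: {question}\nA:"
    return call_llm(prompt)
```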
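Next, a structured-thinking prompt: still a single call, but the wording (illustrative here) forces the model to lay out its steps before committing to a final answer.

```python
def answer_with_structured_thinking(question: str) -> str:
    # One pass, but the model must show its working before the final answer.
    prompt = (
        "Answer the question below. First list the steps needed to answer it, "
        "then work through each step, and only then give the final answer on a "
        "line starting with 'Final answer:'.\n\n"
        f"Question: {question}"
    )
    return call_llm(prompt)
```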
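The multi-step approach uses several calls: one to plan, one per sub-step, and one to combine. Again, the prompts below are only a sketch, reusing the same `call_llm` placeholder.

```python
def answer_multi_step(question: str) -> str:
    # Call 1: ask the model to break the problem into sub-steps.
    plan = call_llm(
        f"Break this question into a short numbered list of sub-steps:\n{question}"
    )
    # Calls 2..N: work on each sub-step in turn, carrying forward earlier findings.
    findings = []
    for step in [line for line in plan.splitlines() if line.strip()]:
        findings.append(call_llm(
            f"Question: {question}\nFindings so far: {findings}\n"
            f"Now carry out this step: {step}"
        ))
    # Final call: bring everything back together into one answer.
    return call_llm(
        f"Question: {question}\nFindings: {findings}\nGive the final answer."
    )
```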
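A bare-bones retrieval-augmented generation sketch. The tiny in-memory ‘knowledge base’ and naive keyword retriever below merely stand in for a real document or vector store, and the bridge facts are illustrative.

```python
KNOWLEDGE_BASE = [
    "Pont Aval is roughly 312 m long and carries the boulevard périphérique.",
    "Pont de Bir-Hakeim is a roughly 237 m bridge across the Seine in Paris.",
    "Pont Neuf, the oldest standing bridge across the Seine in Paris, is about 232 m long.",
]

def retrieve(question: str, top_k: int = 3) -> list[str]:
    # Stand-in retriever: score documents by naive keyword overlap.
    words = set(question.lower().split())
    scored = sorted(
        KNOWLEDGE_BASE,
        key=lambda doc: len(words & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def answer_with_rag(question: str) -> str:
    context = "\n".join(retrieve(question))
    prompt = (
        "Answer the question using only the facts below. "
        "If the facts are not sufficient, say so.\n\n"
        f"Facts:\n{context}\n\nQuestion: {question}"
    )
    return call_llm(prompt)
```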
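A sketch of the tools/plugins pattern: the model is told which tools exist, may ask for one, our code runs it, and the result goes back to the model for the final answer. The tool, its data and the 'TOOL ...' calling convention are all hypothetical.

```python
def bridge_length_lookup(bridge_name: str) -> str:
    # Hypothetical external "database" of bridge lengths in metres.
    lengths_m = {"Pont Aval": 312, "Pont de Bir-Hakeim": 237, "Pont Neuf": 232}
    return f"{bridge_name}: {lengths_m.get(bridge_name, 'unknown')} m"

TOOLS = {"bridge_length_lookup": bridge_length_lookup}

def answer_with_tool(question: str) -> str:
    # Invite the model to either answer directly or request a tool call.
    first = call_llm(
        "If you need a bridge length, reply exactly "
        "'TOOL bridge_length_lookup <bridge name>'. Otherwise answer directly.\n"
        f"Question: {question}"
    )
    if first.startswith("TOOL"):
        _, name, arg = first.split(maxsplit=2)
        observation = TOOLS[name](arg)   # ground truth from outside the LLM
        return call_llm(
            f"Question: {question}\nTool result: {observation}\n"
            "Now give the final answer."
        )
    return first
```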
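And finally a minimal guardrail: an independent second check on the draft answer before it is returned to the user, with a safe fallback if the check fails. The checking prompt is, again, only illustrative.

```python
def guarded_answer(question: str, facts: str) -> str:
    draft = call_llm(f"Facts:\n{facts}\n\nQuestion: {question}")
    # Independent 'second pair of eyes': is the draft supported by the facts?
    verdict = call_llm(
        "Reply with SUPPORTED or UNSUPPORTED only. "
        "Is the answer below fully supported by the facts?\n"
        f"Facts:\n{facts}\n\nAnswer:\n{draft}"
    )
    if "UNSUPPORTED" in verdict.upper():
        return "Sorry, I could not verify an answer to that question."  # safe fallback
    return draft
```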
We use aspects of many of these in our AI systems in combination.
Improved ChatGPT example (Paris bridges)
What happens if we use some of these techniques and apply them to our tricky Paris bridges question?
The following screenshots show what happens if we manually apply some of these techniques with ChatGPT 3.5.
This is much better. It might still not be the right final answer (I haven’t checked it against ground truth!), but you can see that it is better reasoned than before.
Hallucinations are completely natural and to be expected from LLMs.
We already have a decent toolkit to help mitigate and manage many hallucinations and these tools and methods are only going to get better.