7 Prompt Engineering Tricks to Reduce LLM Hallucinations
Introduction
Large language models (LLMs) demonstrate a strong ability to reason, summarize, and generate text creatively. Nevertheless, they remain susceptible to a common issue: hallucination, which consists of confidently producing information that is false, unverifiable, or even meaningless.
LLMs generate text based on complex statistical and probabilistic patterns rather than on grounded truth verification, and in high-stakes domains this can have significant negative effects. Solid prompt engineering, the craft of composing well-structured prompts with instructions, constraints, and context, can be an effective strategy for mitigating hallucinations.
The seven techniques in this article, along with example prompt templates, show how both standalone LLMs and retrieval-augmented generation (RAG) systems can become more robust against hallucinations simply by applying these techniques to user queries.
1. Encourage abstentions and “I don’t know” responses
LLMs typically aim to provide answers that sound confident, even when they are uncertain, and as a result they may generate fabricated facts. Explicitly allowing abstentions can steer a model away from false confidence. Let’s look at an example prompt that does this.
“You’re a fact-checking assistant. If you’re not confident in the answer, just say, ‘I don’t have enough information to answer that.’ If you are confident, please answer with a short explanation. ”
The above prompt is followed by the actual question or claim to be fact-checked.
An example of an expected response is:
“I don’t have enough information to answer that.”
or
“Based on the available evidence, the answer is…(inference).”
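To make this concrete, here is a minimal sketch of wrapping a question with the abstention instruction. It assumes the OpenAI Python client and a placeholder model name; both are illustrative, so adapt them to whatever provider and model you actually use.

```python
# Minimal sketch: abstention-friendly fact checking.
# Assumes the OpenAI Python client and OPENAI_API_KEY in the environment;
# the model name is a placeholder, not a recommendation.
from openai import OpenAI

client = OpenAI()

ABSTAIN_SYSTEM_PROMPT = (
    "You're a fact-checking assistant. If you're not confident in the answer, "
    "just say, 'I don't have enough information to answer that.' "
    "If you are confident, please answer with a short explanation."
)

def fact_check(question: str) -> str:
    """Ask a question while explicitly allowing the model to abstain."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system", "content": ABSTAIN_SYSTEM_PROMPT},
            {"role": "user", "content": question},
        ],
        temperature=0,  # lower temperature further discourages speculative answers
    )
    return response.choices[0].message.content

print(fact_check("What was the population of Atlantis in 1200 BC?"))
```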
While this is a good first line of defense, nothing prevents an LLM from ignoring these instructions some of the time. Let’s see what else we can do.
2. Reasoning with a structured chain of thought
Asking language models to apply step-by-step reasoning promotes internal consistency and reduces the logical gaps that can cause hallucinations. A chain-of-thought (CoT) strategy essentially emulates an algorithm: a list of steps or stages that the model must work through sequentially to address the overall task. Once again, the sample template below is meant to be accompanied by your own problem-specific prompt.
“Think about this problem step by step.
1) What information is provided?
2) What assumptions are required?
3) What is the logical conclusion?”
Sample expected response:
“1) Known facts: A, B. 2) Assumptions: C. 3) Therefore, conclusion: D.”
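As a quick sketch, the template above can be reused programmatically; the wrapper function and the example problem below are purely illustrative, and the resulting string can be sent to any LLM client you prefer.

```python
# Minimal sketch: wrap any problem statement in the step-by-step scaffold above.
COT_TEMPLATE = """Think about this problem step by step.
1) What information is provided?
2) What assumptions are required?
3) What is the logical conclusion?

Problem: {problem}"""

def build_cot_prompt(problem: str) -> str:
    """Return the problem wrapped in a structured chain-of-thought scaffold."""
    return COT_TEMPLATE.format(problem=problem)

print(build_cot_prompt("If all A are B and some B are C, are all A necessarily C?"))
```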
3. Grounding with “According to…”
This prompt engineering trick ties the requested answer to a named source of information. Its effect is to suppress inventive hallucinations and encourage fact-based answers. It combines naturally with technique 1 above.
“Describe the main drivers of antimicrobial resistance according to the 2023 World Health Organization (WHO) report. If the report does not provide sufficient detail, please say ‘I don’t know.'”
Sample expected response:
“According to WHO (2023), the main causes include overuse of antibiotics, poor sanitation and unregulated drug sales. Further details are not available.”
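The same pattern can be parameterized by source and question, as in the illustrative sketch below; the helper is not part of any specific library, just a plain string builder.

```python
# Minimal sketch: tie the answer to a named source and allow abstention.
# The source name and question are illustrative.
GROUNDED_TEMPLATE = (
    "Answer the question using only the {source}. "
    "If the source does not provide sufficient detail, say 'I don't know.'\n\n"
    "Question: {question}"
)

def build_grounded_prompt(question: str, source: str) -> str:
    """Return a prompt that grounds the answer in an explicitly named source."""
    return GROUNDED_TEMPLATE.format(question=question, source=source)

print(build_grounded_prompt(
    question="What are the main drivers of antimicrobial resistance?",
    source="2023 World Health Organization (WHO) report on antimicrobial resistance",
))
```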
4. RAG with explicit instructions and context
RAG gives models access to knowledge or document bases containing validated or up-to-date text. Still, the risk of hallucination remains in RAG systems unless a well-crafted prompt instructs the model to rely solely on the retrieved text.
*(Assume two documents are retrieved: X and Y)*
“Using only the information in Doc X and Doc Y, summarize the main causes of deforestation and related infrastructure projects in the Amazon basin. If the documents do not cover a point, say ‘insufficient data.'”
Sample expected response:
“According to Doc X and Doc Y, the main causes include agricultural expansion and illegal logging. For infrastructure projects, data is insufficient.”
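Here is a minimal sketch of assembling such a prompt from already-retrieved passages. Retrieval itself (vector store, ranking) is out of scope, so `docs` is just a list of strings standing in for whatever your retriever returns.

```python
# Minimal sketch: build a RAG prompt that restricts the model to retrieved passages.
def build_rag_prompt(question: str, docs: list[str]) -> str:
    """Instruct the model to rely only on the retrieved passages."""
    context = "\n\n".join(f"[Doc {i + 1}]\n{doc}" for i, doc in enumerate(docs))
    return (
        "Using only the documents below, answer the question. "
        "If a point is not covered by the documents, say 'insufficient data.'\n\n"
        f"{context}\n\nQuestion: {question}"
    )

docs = [
    "Doc X: Agricultural expansion is a leading cause of deforestation in the Amazon basin.",
    "Doc Y: Illegal logging contributes significantly to forest loss in the region.",
]
print(build_rag_prompt("What are the main causes of deforestation in the Amazon basin?", docs))
```

Keeping the instruction and the retrieved passages in the same prompt makes it harder for the model to fall back on its parametric knowledge.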
5. Output constraints and length limits
Tight control over the form and length of the output helps reduce hallucinations that appear as speculative or off-topic statements, such as unsupported causal claims, over-inference, or fabricated statistics, and keeps results from drifting away from the source material.
Restricting the “degrees of freedom” of the answer space increases the likelihood that verifiable information is returned, rather than having the model fill gaps “at all costs.”
“Summarize the role of mitochondria in human cells in 100 words or less. If you don’t know, just say ‘I don’t know.'”
Sample expected response:
“Mitochondria generate ATP through oxidative phosphorylation, regulate calcium levels, and influence apoptosis. Sufficient data on other roles are lacking.”
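A small sketch of this idea follows: the word budget is stated in the prompt and then verified after generation. The placeholder answer string is illustrative only.

```python
# Minimal sketch: enforce a word budget in the prompt and verify it afterwards.
MAX_WORDS = 100

def build_constrained_prompt(topic: str) -> str:
    """Return a prompt that caps the answer length and allows abstention."""
    return (
        f"Summarize {topic} in {MAX_WORDS} words or less. "
        "If you don't know, just say 'I don't know.'"
    )

def within_budget(answer: str, max_words: int = MAX_WORDS) -> bool:
    """Simple post-generation check; re-prompt or truncate if it fails."""
    return len(answer.split()) <= max_words

prompt = build_constrained_prompt("the role of mitochondria in human cells")
answer = "Mitochondria generate ATP through oxidative phosphorylation."  # placeholder output
print(prompt)
print("Within budget:", within_budget(answer))
```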
6. RAG-based correction loop: chain of verification
This approach combines retrieval and self-checking, letting the model question and recheck its own claims against external sources of evidence, primarily in RAG systems. It mainly targets hallucinations caused by overconfidence.
“Step 1: Generate the first answer to the question ‘When did the Berlin Wall fall?’
Step 2: Retrieve and read relevant texts from authoritative historical databases.
Step 3: Compare the retrieved evidence with your answer.
Step 4: If there are any discrepancies, correct your answer and cite the sources you retrieved.”
Sample expected response:
“First answer: 1989.
Retrieved evidence: Historical archives confirm that the Berlin Wall was opened on November 9, 1989.
Final verified answer: The Berlin Wall fell on November 9, 1989, when East Germany opened its border crossings.”
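Below is a minimal sketch of the generate, retrieve, and verify loop, assuming the OpenAI Python client; `retrieve_evidence` is a hypothetical stand-in for your own retriever or search API.

```python
# Minimal sketch: draft an answer, retrieve evidence, then ask the model to verify.
# Assumes the OpenAI Python client and OPENAI_API_KEY; the model name is a placeholder.
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o-mini"  # placeholder model name

def ask(prompt: str) -> str:
    response = client.chat.completions.create(
        model=MODEL, messages=[{"role": "user", "content": prompt}], temperature=0
    )
    return response.choices[0].message.content

def retrieve_evidence(question: str) -> str:
    # Hypothetical placeholder: plug in your document store or search API here.
    return "Historical archives: the Berlin Wall was opened on 9 November 1989."

def answer_with_verification(question: str) -> str:
    draft = ask(question)                      # Step 1: first answer
    evidence = retrieve_evidence(question)     # Step 2: retrieve evidence
    return ask(                                # Steps 3-4: compare and correct
        "Draft answer: " + draft + "\n\nEvidence: " + evidence + "\n\n"
        "Compare the draft answer with the evidence. If they disagree, "
        "correct the answer and cite the evidence. Return only the final answer."
    )

print(answer_with_verification("When did the Berlin Wall fall?"))
```

In practice, you might loop the retrieve-and-compare steps until the answer and the evidence agree or a retry limit is reached.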
7. Domain-specific prompts, disclaimers, and safety measures
In high-stakes applied fields like medicine, it is essential to constrain the domain boundaries and require citations to sources, reducing the risk of speculative claims that could have negative real-world consequences. Here’s an example:
“You are a certified health information assistant. Describe the first-line treatment for moderate persistent asthma in adults using peer-reviewed research and official guidelines published through 2024. If you cannot cite such guidelines, reply: ‘We cannot make a recommendation. Please consult your health care professional.'”
Sample expected response:
“According to the Global Initiative for Asthma (GINA) 2023 guidelines, the first-line treatment for moderate persistent asthma is a low-dose inhaled corticosteroid combined with a long-acting beta2 agonist, such as budesonide/formoterol. Consult your clinician for patient-specific adjustments.”
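As a sketch, the domain constraint and the fixed refusal string can live in a system prompt; the message-building helper below is illustrative and works with any chat-style client that accepts role/content dictionaries.

```python
# Minimal sketch: a domain-constrained system prompt with a fixed refusal string.
MEDICAL_SYSTEM_PROMPT = (
    "You are a certified health information assistant. Answer only using "
    "peer-reviewed research and official clinical guidelines published through 2024, "
    "and cite them. If you cannot cite such guidelines, reply exactly: "
    "'We cannot make a recommendation. Please consult your health care professional.'"
)

def build_medical_messages(question: str) -> list[dict]:
    """Return a chat-style message list for any client that accepts role/content dicts."""
    return [
        {"role": "system", "content": MEDICAL_SYSTEM_PROMPT},
        {"role": "user", "content": question},
    ]

print(build_medical_messages(
    "What is the first-line treatment for moderate persistent asthma in adults?"
))
```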
Summary
Below is a summary of the seven strategies we discussed.
| Technique | Description | Context |
|---|---|---|
| Encourage abstentions and “I don’t know” responses | Avoid guessing by asking the model to say “I don’t know.” | Non-RAG |
| Reasoning with a structured chain of thought | Step-by-step reasoning to improve response consistency. | Non-RAG |
| Grounding with “According to…” | Use an explicit reference to ground the response. | Non-RAG |
| RAG with explicit instructions and context | Explicitly instruct the model to rely on the retrieved evidence. | RAG |
| Output constraints and length limits | Limit the format and length of answers to minimize guesswork and make them more verifiable. | Non-RAG |
| RAG-based correction loop: chain of verification | Instruct the model to validate its output against the retrieved knowledge. | RAG |
| Domain-specific prompts, disclaimers, and safety measures | In high-stakes scenarios, constrain prompts with domain rules, citation requirements, or disclaimers. | Non-RAG |
In this article, we covered seven useful prompt engineering tricks based on versatile templates that apply to multiple scenarios. Applied to the prompts you send to an LLM or RAG system, they can help reduce hallucinations, which remain a common and sometimes persistent problem with these general-purpose models.
