Shuheng You
06/05/2024
Hallucination of LLMs
Paper Discussion
Background
Hallucination
“Generated content that appears factual but is ungrounded”
•We want to look at the possible underlying mechanisms that lead to this problem
Background
Heuristic Solutions
Chain-of-Verification: use the LLM itself to generate and independently answer verification questions about its draft answer (sketch below)
Chain-of-Verification Reduces Hallucination in Large Language Models. https://arxiv.org/abs/2309.11495
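A minimal sketch of the Chain-of-Verification loop, assuming a hypothetical `llm(prompt) -> str` helper that wraps whatever chat model is available; the prompt wording is illustrative, not the paper's exact templates:

```python
# Hedged sketch of Chain-of-Verification (Dhuliawala et al., 2023).
# `llm` is a hypothetical helper the reader supplies around any chat model.

def llm(prompt: str) -> str:
    raise NotImplementedError("plug in your own model call here")

def chain_of_verification(question: str) -> str:
    # 1. Draft a baseline answer.
    baseline = llm(f"Answer the question.\nQ: {question}\nA:")

    # 2. Plan verification questions that check the facts in the draft.
    plan = llm(
        "List short verification questions, one per line, that check the facts "
        f"in this answer.\nQ: {question}\nA: {baseline}"
    )
    checks = [q.strip() for q in plan.splitlines() if q.strip()]

    # 3. Answer each verification question independently, without showing the
    #    draft, so the model cannot simply repeat its own hallucination.
    evidence = []
    for q in checks:
        independent = llm("Answer briefly.\nQ: " + q + "\nA:")
        evidence.append(f"{q} -> {independent}")

    # 4. Revise the draft conditioned on the verification results.
    return llm(
        f"Original question: {question}\nDraft answer: {baseline}\n"
        "Verification Q&A:\n" + "\n".join(evidence) +
        "\nRewrite the answer, keeping only facts supported by the verification."
    )
```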
Background
Heuristic Solutions
LMvLM: use another LLM as an examiner that interacts with the target model to uncover inconsistencies (sketch below)
LM vs LM: Detecting Factual Errors via Cross Examination. https://aclanthology.org/2023.emnlp-main.778/
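A hedged sketch of such a cross-examination loop; `examinee` and `examiner` are hypothetical callables (prompt string to completion string) that the reader wires to two, possibly identical, chat models:

```python
# Sketch of an LM-vs-LM cross-examination (in the spirit of Cohen et al., 2023).

def cross_examine(claim: str, examinee, examiner, rounds: int = 3) -> bool:
    transcript = [f"Claim under examination: {claim}"]
    for _ in range(rounds):
        # The examiner probes for details that should be consistent if the claim is true.
        question = examiner("\n".join(transcript) + "\nAsk one follow-up question:")
        answer = examinee(f"{claim}\nQuestion: {question}\nAnswer:")
        transcript += [f"Q: {question}", f"A: {answer}"]
    # The examiner issues a verdict based on inconsistencies in the transcript.
    verdict = examiner("\n".join(transcript) +
                       "\nGiven the answers above, is the claim factually consistent? yes/no:")
    return verdict.strip().lower().startswith("yes")
```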
Do LLMs Know What They Know?
‣P(True): the probability a model assigns to whether a specific sample is the correct answer to a question
Ask the LLM (few-shot) whether its own sampled answer to a question is correct (scoring sketch below)
Introduction of P(True)
Language Models (Mostly) Know What They Know. https://arxiv.org/abs/2207.05221
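A rough illustration of scoring P(True) with an open-weights causal LM via Hugging Face transformers; `gpt2` is only a placeholder, and the prompt wording approximates (but is not verbatim from) the paper:

```python
# Hedged sketch: P(True) as the normalized probability of the " True" option.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; any causal LM with accessible logits works
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

def p_true(question: str, proposed_answer: str) -> float:
    prompt = (
        f"Question: {question}\n"
        f"Proposed Answer: {proposed_answer}\n"
        "Is the proposed answer correct? (A) True (B) False\n"
        "The proposed answer is:"
    )
    ids = tok(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits[0, -1]          # next-token distribution
    probs = torch.softmax(logits, dim=-1)
    true_id = tok(" True", add_special_tokens=False).input_ids[0]
    false_id = tok(" False", add_special_tokens=False).input_ids[0]
    # Normalize over the two options so scores are comparable across prompts.
    return (probs[true_id] / (probs[true_id] + probs[false_id])).item()
```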
Do LLMs Know What They Know?
‣Models can self-evaluate their own samples with reasonable accuracy
Experiment on P(True)
Do LLMs Know What They Know?
‣P(IK): the probability a model assigns to "I know",
i.e. whether it will be able to answer a given question correctly
‣Input: the question itself
‣Output: the probability, produced by an additional binary classification head on top of the model (sketch below)
Introduction of P(IK)
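A hedged sketch of what such a head could look like: a scalar logistic head over the LM's hidden state at the final question token, trained against whether sampled answers turn out correct. The class name, dimensions, and training signal shown here are illustrative assumptions, not the paper's exact setup:

```python
# Sketch of a P(IK) head on top of a frozen LM.
import torch
import torch.nn as nn

class PIKHead(nn.Module):
    def __init__(self, hidden_size: int):
        super().__init__()
        self.score = nn.Linear(hidden_size, 1)  # binary "I know" logit

    def forward(self, last_hidden: torch.Tensor) -> torch.Tensor:
        # last_hidden: (batch, hidden_size) hidden state at the final question token
        return torch.sigmoid(self.score(last_hidden)).squeeze(-1)

# Assumed training signal: for each question, label = fraction of sampled
# answers that were graded correct; fit with binary cross-entropy.
head = PIKHead(hidden_size=4096)
hidden = torch.randn(8, 4096)             # stand-in for LM hidden states
labels = torch.rand(8)                    # stand-in ground-truth P(IK) labels
loss = nn.functional.binary_cross_entropy(head(hidden), labels)
loss.backward()
```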
Do LLMs Know What They Know?
P(IK) for a question about the president of Absurdistan << P(IK) for a question about the president of the US
Visualization of P(IK)
Do LLMs Know What They Know?
We care about both the in-distribution and out-of-distribution performance of P(IK)
•In-distribution performance measures how reliable P(IK) is within the task it was trained on
•Out-of-distribution performance measures how well a trained P(IK) generalizes to a new task
Experiment on P(IK)
Do LLMs Know What They Know?
Ground-truth P(IK): the fraction of generated samples that are actually correct, i.e. correct samples / total generated samples (estimation sketch below)
Experiment on P(IK)
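Under that definition, the ground-truth label for a single question can be estimated roughly as below; `sample_answer` and `is_correct` are hypothetical helpers (a temperature-sampled generation call and a grading function) the reader supplies:

```python
# Minimal sketch of estimating the ground-truth P(IK) label for one question:
# sample several answers and take the fraction that are graded correct.

def ground_truth_p_ik(question: str, reference: str,
                      sample_answer, is_correct, n_samples: int = 30) -> float:
    answers = [sample_answer(question) for _ in range(n_samples)]  # e.g. T=1 sampling
    return sum(is_correct(a, reference) for a in answers) / n_samples
```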
Residual Streams Across Layers
Analyze the hidden states of all L layers and the tokens that can be predicted from them,
given different prompts (some succeed and some fail to elicit the correct answer); see the read-out sketch below
Residual Streams
On Large Language Models' Hallucination with Regard to Known Facts. https://arxiv.org/abs/2403.20009
[Figure: stack of L decoder layers, with the hidden state read out after each layer]
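One way to reproduce this kind of read-out is the "logit lens" style projection: apply the final layer norm and the unembedding matrix to each layer's hidden state and track the probability of a chosen token across layers. The sketch below makes that assumption and uses `gpt2` purely as a stand-in model:

```python
# Hedged sketch: per-layer probability of a target token via a logit-lens read-out.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

def token_prob_per_layer(prompt: str, target: str) -> list[float]:
    target_id = tok(target, add_special_tokens=False).input_ids[0]
    ids = tok(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, output_hidden_states=True)
    probs = []
    for h in out.hidden_states[1:]:                 # one entry per decoder layer
        # Attribute names below are GPT-2 specific; other architectures differ.
        h_last = model.transformer.ln_f(h[0, -1])   # final layer norm, last position
        logits = model.lm_head(h_last)              # unembedding projection
        probs.append(torch.softmax(logits, dim=-1)[target_id].item())
    return probs

# Comparing these per-layer curves for prompts that succeed vs. fail to elicit
# the correct answer gives the kind of dynamics the paper visualizes.
print(token_prob_per_layer("The capital of France is", " Paris"))
```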
Residual Streams Across Layers
•Success token: the activation of the correct token when given the optimal prompt
•Failed token: the activation of the correct token when given failed prompts
•Hallucinated token: the activation of the incorrect token
Dynamics of Residual Streams
Residual Streams Across Layers
The dynamics of the correct token across the layers of a model
Accuracy of an SVM classifier trained on these dynamics (sketch below):
Use the Pattern as a Classifier
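A sketch of the classification step, assuming the features are the per-layer probability curve of the emitted token (mirroring the plotted dynamics); this feature choice is an assumption, not necessarily the paper's exact pipeline:

```python
# Hedged sketch: SVM separating faithful from hallucinated generations using
# layer-wise dynamics as features (random data stands in for real curves).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# X: one row per generation, columns = probability of the emitted token at each
#    of the L layers (e.g. collected with token_prob_per_layer above).
# y: 1 if the generation was correct, 0 if it hallucinated.
X = np.random.rand(200, 32)       # stand-in for 200 curves over 32 layers
y = np.random.randint(0, 2, 200)  # stand-in labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = SVC(kernel="rbf").fit(X_tr, y_tr)
print("held-out accuracy:", clf.score(X_te, y_te))
```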
Issues and Discussion
Issues:
•The methods are more effective on short questions (especially single-token answers) and often fail when given longer ones
•Only applicable to open-source LLMs, since access to model internals (logits and hidden states) is required
Discussion:
•Do you think these methods are practical in production scenarios?
•If not, what do you think are the drawbacks and potential problems?
From the Two Papers