AI-ttacks - Nghiên cứu về một số tấn công vào các mô hình học máy và AI
sbc-vn
1,884 views
67 slides
Oct 07, 2024
Slide 1 of 67
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
About This Presentation
AI-ttacks - Nghiên cứu về một số tấn công vào các mô hình học máy và AI
Size: 22.92 MB
Language: en
Added: Oct 07, 2024
Slides: 67 pages
Slide Content
AI /ML under A ttack SECURITY BOOTCAMP N/A Manhnho
NỘI DUNG 01 LLM PROMPT INJECTION 03 02 04 Q & A ATTACK ML MODEL AI, ML TODAY https://cypeace.net/
https://cypeace.net/ AI, ML TODAY
Brief of ML/AI
Brief of ML/AI
Brief of ML/AI
Brief of ML/AI
https://cypeace.net/ ATTACK ML MODEL
ML pipeline development Learning parameters Training data Model Test Output Test Input Learning algorithm
Sample Model
Attacks on the ML pipeline Training Data attack Training set poisoning Adversarial Examples Model theft Learning parameters Training data Model Test Output Test Input Learning algorithm
Poisoning Attack Training Data attack Training set poisoning Adversarial Examples Model theft Learning parameters Training data Model Test Output Test Input Learning algorithm
Poisoning Attack Classified based on outcomes Targeted attacks Untargeted attacks Classified based on the approach follow Backdoor attacks Clean-label attacks
Simple Poisoning Attack Learning parameters Training data Model Test Output Test Input Learning algorithm
Simple Poisoning Attack
Simple Poisoning Attack
Backdoor Poisoning Attack Backdoor Poisoning Attack Single pixel Pattern of pixels Imange insert
Backdoor Poisoning Attack
Backdoor Poisoning Attack
Model Tampering Training Data attack Training set poisoning Adversarial Examples Model theft Learning parameters Training data Model Test Output Test Input Learning algorithm
Model Tampering Model Tampering Exploiting Pickle Serialization Injecting Trojan Horses Neural Payload Injection Model Hijacking Model Reprogramming
Exploiting Pickle Serialization
Exploiting Pickle Serialization
Injecting Trojan horses - Keras layers
Injecting Trojan horses - Keras Lambda layers
Injecting Trojan horses - Keras Lambda layers
Injecting Trojan horses - Keras Custom layers
Injecting Trojan horses - Keras Custom layers
Evasion Attack Training Data attack Training set poisoning Adversarial Examples Model theft Learning parameters Training data Model Test Output Test Input Learning algorithm
Model Extraction Attacks Training Data attack Training set poisoning Adversarial Examples Model theft Learning parameters Training data Model Test Output Test Input Learning algorithm
Functionally Equivalent Extraction Model Extraction Attacks
Learning-Based Model Extraction Attacks Copycat CNN KnockOff Nets Model Extraction Attacks
Generative Student-Teacher Extraction (Distillation) Attacks Model Extraction Attacks
LLM Application Workflow User Prompt User Response Tokenization API Request Model Processing Response Generation API Response
Build a basic Chat LLM Application
HelloLLM
HelloLLM
Langchain
Langchain bot
Langchain bot
Integrating External Data into LLMs
Chef AI
Chef AI
Chef AI
What is prompt injection? Prompt injection OWASP defines prompt injection as manipulating “a large language model (LLM) through crafted inputs, causing the LLM to execute the attacker’s intentions unknowingly.” LLM01: Prompt Injection - OWASP Top 10 for LLM & Generative AI Security
Direct vs Indirect
Direct Prompt Injection – Basic
Direct Prompt Injection – DoS
Direct Prompt Injection – Phishing
GitHub - 0xk1h0/ ChatGPT_DAN : ChatGPT DAN, Jailbreaks prompt Direct Prompt Injection – DAN Mode
Direct Prompt Injection – DAN Mode
Direct Prompt Injection – DAN Mode
Other Jailbreaking techniques – Splitting Payloads Direct Prompt Injection
Other Jailbreaking techniques – Splitting Payloads Direct Prompt Injection
Other Jailbreaking techniques – Encoding Direct Prompt Injection
Other Jailbreaking techniques – Adding constraints Direct Prompt Injection
Indirect Prompt Injection
RCE with prompt injection Example – Pandas AI – 1.1.3