Finetuning GenAI For Hacking and Defending

cisoplatform7 330 views 78 slides Jul 17, 2024

About This Presentation

Generative AI, particularly through the lens of large language models (LLMs), represents a transformative leap in artificial intelligence. With advancements that have fundamentally altered our approach to AI, understanding and leveraging these technologies is crucial for innovators and practitioners...


Slide Content

Welcome to GenAI Security
Hands-on Workshop
Jitendra Chauhan
Co-Founder, Detoxio AI
19 Years of R&D, AI/ML, Product Mgmt, 2x Patents, 2x Startups
Join WhatsApp Group for coordination

Agenda
Understand GenAI - History, Evolution and Fundamentals
Demystify - AI, GenAI, and LLMs
LLMs - Intuitive Understanding
LLMs - Internal Architecture
Run a Model (Hands On)
Understand Key Parameters of LLMs
Penetration Testing and Red Teaming LLMs
GenAI Threat Model
LLM Model Vulnerabilities
GenAI Apps Vulnerabilities (OWASP Top 2)
Red Teaming a Model - Manual and Automated (Hands On)
Scanning GenAI Apps - Burp, Chakra, and others (Hands On)

Agenda
Use GenAI to Enhance Security
TBD
TBD
Securing GenAI Applications
Guardrails
LLMOps

Foundation of
GenAI
Learn LLMs Internals

AI?
Learn floor plan by itself
Sense - Seeing, ..
Detect - Dirt / Clean
Cleaning - Wash, Brooming
Avoid - Obstacles
Move - Across Layout
Upskill - Not possible
Interact - Terminal
Fault - Manual Repair
Learning - Not Possible
When? - Manual Command
AI
Not AI

Evolution of AI

Ultimate Goal of AI
Sophia
Robots matching humans

Two Major Advancements
Generation of Content - Text, Audio,...
Understanding of Meaning - Text, Audio,...
The arrival of the transformer architecture in 2017, following the publication of the
"Attention Is All You Need" paper, revolutionised generative AI.

Transformers

GenAI & LLMs

Intuitive Understanding of
AI and LLMs

Applications

Predictive Models
Examples
Neural Networks
Deep Learning
Decision Trees

Predictive Models

Generative Models
Examples: GANs, LLMs (GPT-2, BERT)

What comes next?
To protect the network from unauthorized access, it
is crucial to implement strong <Guess me>
How did you come up with your next word?

What comes next?
How did you come up with your next word? Had you seen these words before?
Can you think of 5 other possible words?
Can you continue and add more words, or even a sentence?
LLMs are next-word prediction programs!
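
The next-word idea can be sketched with a toy bigram counter. This is only an illustration under stated assumptions: the corpus is made up, the function names are hypothetical, and a real LLM uses a neural network over tokens rather than word counts. What carries over is the shape of the output: several candidate next words, each with a probability.

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count, for every word, which words follow it in the corpus."""
    follows = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            follows[current][nxt] += 1
    return follows

def next_word_candidates(follows, word, k=5):
    """Return up to k most frequent followers of `word` with their probabilities."""
    counts = follows.get(word.lower())
    if not counts:
        return []
    total = sum(counts.values())
    return [(w, c / total) for w, c in counts.most_common(k)]

# Tiny made-up corpus (an assumption for illustration only).
corpus = [
    "it is crucial to implement strong passwords",
    "it is crucial to implement strong encryption",
    "it is crucial to implement strong passwords everywhere",
    "always implement strong authentication",
]

model = train_bigram(corpus)
candidates = next_word_candidates(model, "strong")
for word, prob in candidates:
    print(f"strong -> {word}: {prob:.2f}")
```

Picking the top candidate every time gives the most likely continuation; sampling from the candidates instead gives the "5 other possible words" from the exercise above.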

Complete the story
Once upon a time, in a forest, a speedy rabbit and a slow tortoise
decided to have a race. Confident in his swift legs, the rabbit
darted ahead but soon became complacent and decided to take a
nap midway......
Complete the above story in your own words

Complete the story
Understand (Encoder):
Once upon a time, in a forest, a speedy rabbit and a slow tortoise decided to
have a race. Confident in his swift legs, the rabbit darted ahead but soon became
complacent and decided to take a nap midway......
Generate (Decoder):
The diligent tortoise, though slow, continued steadily and eventually passed the
sleeping rabbit, crossing the finish line first. The story teaches that consistent
effort and perseverance can triumph over arrogance and laziness.
LLMs encode and decode!

How AI Learns?

Temperature

[BOS] (beginning of sequence): This token marks the start of a text. It
signifies to the LLM where a piece of content begins.
[EOS] (end of sequence): This token is positioned at the end of a text,
and is especially useful when concatenating multiple unrelated texts,
similar to <|endoftext|>. For instance, when combining two different
Wikipedia articles or books, the [EOS] token indicates where one article
ends and the next one begins.
[PAD] (padding): When training LLMs with batch sizes larger than one,
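
A minimal sketch of how these special tokens are typically used, assuming whitespace-split words stand in for real tokens. The helper names are hypothetical, and the exact placement of [BOS] and [EOS] varies between tokenizers. Padding here simply right-fills shorter sequences so every row in a batch has the same length.

```python
BOS, EOS, PAD = "[BOS]", "[EOS]", "[PAD]"

def join_documents(docs):
    """Concatenate tokenized documents, marking each start with BOS and each end with EOS."""
    stream = []
    for tokens in docs:
        stream.extend([BOS, *tokens, EOS])
    return stream

def pad_batch(sequences):
    """Right-pad every sequence with PAD so all rows in the batch have equal length."""
    width = max(len(seq) for seq in sequences)
    return [seq + [PAD] * (width - len(seq)) for seq in sequences]

# Two unrelated texts (made-up examples), split on whitespace.
article_a = "transformers use attention".split()
article_b = "firewalls filter traffic at the network edge".split()

stream = join_documents([article_a, article_b])
batch = pad_batch([article_a, article_b])
print(stream)
print(batch)
```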

BPE Tokenizer

Self Attention

RNN - Encoder / Decoder

Bahdanau attention (2014)

The "Self" in Self Attention
Transformer Architecture - Self Attention

What to do when AI Fails?

Hugging Face
Explore Open Source Models
Run a Model on Kaggle
Good Llama vs Bad Llama

100K+ Models

Responsible AI
Overview
RAG Architecture
Pokebot - Poisoned GenAI App
GenAI Security Testing

GenAI Apps
Overview
RAG Architecture
Pokebot - Poisoned GenAI App
GenAI Security Testing

LLM Challenges
Key Challenges
Large language models (LLMs) do not have access to up-to-date knowledge and facts.
LLMs also struggle with complex math problems and tend to generate
text even when they don't know the answer (hallucination).

GenAI Apps
The Retrieval Augmented Generation (RAG) framework overcomes these issues by
connecting LLMs to external data sources and applications.
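
A minimal sketch of the RAG idea, assuming a tiny in-memory document list and word-overlap ranking as a stand-in for a real vector search. The documents, function names, and prompt wording are all hypothetical, and the call to an actual LLM is left out. The point is only that retrieved text is placed into the prompt before the model answers.

```python
def tokenize(text):
    """Lowercase the text and split it into a set of words, dropping basic punctuation."""
    for ch in "?.,":
        text = text.replace(ch, " ")
    return set(text.lower().split())

def retrieve(question, documents, k=2):
    """Rank documents by word overlap with the question (a stand-in for vector search)."""
    q = tokenize(question)
    ranked = sorted(documents, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]

def build_rag_prompt(question, documents, k=2):
    """Place the retrieved passages in front of the question."""
    context = "\n".join(f"- {d}" for d in retrieve(question, documents, k))
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say you do not know.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\nAnswer:"
    )

# Made-up knowledge base entries (assumptions for illustration).
documents = [
    "The VPN gateway requires multi-factor authentication.",
    "Password rotation is enforced every 90 days.",
    "Backups run nightly and are kept for 30 days.",
]

prompt = build_rag_prompt("How often is password rotation enforced?", documents, k=1)
print(prompt)
```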
Reasoning using Chain of Thought
Prompting the model to think more like a human by breaking the problem down
into steps has shown success in improving reasoning performance.
Chain-of-thought prompting involves including intermediate reasoning steps in
the examples used for one- or few-shot inference.
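
A chain-of-thought prompt can be assembled as in the sketch below. The example question, the reasoning text, and the `build_cot_prompt` helper are assumptions made for illustration. The only point is that each worked example shows its intermediate steps before the answer, so the model is nudged to do the same for the new question.

```python
def build_cot_prompt(examples, question):
    """Assemble a few-shot prompt whose examples show reasoning before the answer."""
    parts = []
    for ex in examples:
        parts.append(
            f"Q: {ex['question']}\n"
            f"A: {ex['reasoning']} The answer is {ex['answer']}."
        )
    # The new question ends with an open "A:" for the model to continue.
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)

# One worked example (one-shot); add more entries for few-shot prompting.
examples = [
    {
        "question": "A server logs 3 failed logins per minute. "
                    "How many failed logins occur in 5 minutes?",
        "reasoning": "Each minute has 3 failed logins. Over 5 minutes that is 3 * 5 = 15.",
        "answer": "15",
    }
]

prompt = build_cot_prompt(
    examples,
    "A firewall blocks 4 scans per hour. How many scans are blocked in 6 hours?",
)
print(prompt)
```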
ReAct: Reasoning and Action (Decision-Making Process)
ReAct combines chain of thought reasoning with action planning in LLMs.
Examples include a question, thought (reasoning step), action (pre-defined set of
actions), and observation (new information).
Actions are limited to predefined options like search, lookup, and finish.
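
The thought, action, observation cycle can be sketched as a small loop. In a real ReAct setup the LLM generates each thought and action; here the steps are scripted by hand, and the knowledge base and helper functions are hypothetical, so the sketch shows only the control flow and the restricted action set (search, lookup, finish).

```python
def search(query, knowledge):
    """Action: return the first entry whose title mentions the query."""
    for title, text in knowledge.items():
        if query.lower() in title.lower():
            return text
    return "No results."

def lookup(term, text):
    """Action: return the first sentence of `text` containing `term`."""
    for sentence in text.split(". "):
        if term.lower() in sentence.lower():
            return sentence.strip(". ") + "."
    return "Term not found."

def run_react(steps, knowledge):
    """Replay (thought, action, argument) steps, recording an observation after each action."""
    trace, last_doc = [], ""
    for thought, action, arg in steps:
        trace.append(f"Thought: {thought}")
        trace.append(f"Action: {action}[{arg}]")
        if action == "search":
            last_doc = search(arg, knowledge)
            observation = last_doc
        elif action == "lookup":
            observation = lookup(arg, last_doc)
        elif action == "finish":
            return trace, arg  # finish ends the loop with the final answer
        else:
            raise ValueError(f"Unknown action: {action}")
        trace.append(f"Observation: {observation}")
    return trace, None

# Made-up knowledge base and scripted steps (stand-ins for model output).
knowledge = {
    "Transformer": "The transformer architecture was introduced in 2017. "
                   "It relies on self-attention."
}
steps = [
    ("I need to find when the transformer was introduced.", "search", "transformer"),
    ("The article should mention a year.", "lookup", "2017"),
    ("The observation gives the year.", "finish", "2017"),
]

trace, answer = run_react(steps, knowledge)
print("\n".join(trace))
print("Answer:", answer)
```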

Pokebot - Sample RAG

GenAI App
Architecture

GenAI Project Lifecycle

GenAI & LLM Security
Model Security
GenAI App Security
Data Security

LLM Security
Model Vulnerabilities
Build and Finetune Models
LLM Red Teaming
Securing LLMs
LLM Data Poisoning

Case Study
DBRX Red Teaming

START
Finetune Base LLM
Design solution
Build GenAI App
No Yes
Is Model Safe?
No Yes
Fix
vulnerabilities
Configure Monitoring &
Guard Rails
Red Team Guard Rails
Is App Safe?
No Yes
Successful tests?
Deploy on Production
Prevent Data Leaks
Red Team LLM
Appsec Testing
Secure LLM Secure App
Monitor Prevent
Secure GenAI Apps

GenAI In Security
GenAI to assist SOC
GenAI to assist Appsec (BurpGPT)

GenAI in SOC
1. Threat Detection and Response For XSOAR:
Analysis of logs and network traffic to detect potential security threats.
Automated generation of threat response scripts.
2. Security Policy Optimization:
Creation of tailored security policies based on organizational requirements and
threat landscape.
Automated generation of security awareness training materials.
3. Code Generation with SAST Remediation:
Automated generation of documentation and code from requirements or
specifications.
Generation of test cases and automation scripts with validation of false positives.

Tools And Technologies
Vulnerability Management Tools: Nessus, OpenVAS
Threat Intelligence Platforms: Splunk, AlienVault
Security Orchestration Tools: Blue Team Field, Red Hat
Automation Frameworks: Ansible, PowerShell, Chef
Collaboration and Communication Tools: Slack, Jira

Pipeline Automation
Threat Intelligence Collection
Security Alerts Correlation
Incident Response Initiation
Vulnerability Scanning
Threat Identification
Threat Prioritization
Automated Remediation Execution
Vulnerability Patching
Threat Mitigation
System Recovery Planning
Business Continuity Management
For IOC Automation
RCA Analysis
Matching with IOC and CVE
Correlation of IOC for hosts
Chef/Puppet for automated patch management
Threat Mitigation with Mitigation and BCP Plan
System Recovery Planning
Business Continuity Management

Example Test Case
Evaluate this test case and investigate it as a SOC analyst:
PowerShell was executed with admin privileges on host 202.1.1.1
while the concerned Active Directory user was on vacation.
Let me know the detailed analysis and give me the Chef or automation script to
harden the Windows machine on the network where it was executed.

GenAI in Appsec
The DAST pipeline can be automated with Burp-like tools.
An example pipeline would run through a Burp extension.
Sample BurpGPT:
Use the Azure OpenAI Service's API feature | BurpGPT
Installation | BurpGPT

Sample Usecase
Identifying potential vulnerabilities in web applications that use a crypto library
affected by a specific CVE:
Analyse the request and response data for potential security vulnerabilities related
to the {CRYPTO_LIBRARY_NAME} crypto library affected by CVE-{CVE_NUMBER}:
Web Application URL: {URL}
Crypto Library Name: {CRYPTO_LIBRARY_NAME}
CVE Number: CVE-{CVE_NUMBER}
Request Headers: {REQUEST_HEADERS}
Response Headers: {RESPONSE_HEADERS}
Request Body: {REQUEST_BODY}
Response Body: {RESPONSE_BODY}
Identify any potential vulnerabilities related to the {CRYPTO_LIBRARY_NAME} crypto
library affected by CVE-{CVE_NUMBER} in the request and response data and report
them.
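
The placeholders in a prompt like this are filled in from the captured HTTP traffic before the prompt goes to the model. The sketch below is not BurpGPT's implementation; it only illustrates the substitution step with Python's `str.format`, and every value passed in (library name, CVE number, URL, headers, bodies) is made up.

```python
TEMPLATE = (
    "Analyse the request and response data for potential security vulnerabilities "
    "related to the {CRYPTO_LIBRARY_NAME} crypto library affected by CVE-{CVE_NUMBER}:\n"
    "Web Application URL: {URL}\n"
    "Crypto Library Name: {CRYPTO_LIBRARY_NAME}\n"
    "CVE Number: CVE-{CVE_NUMBER}\n"
    "Request Headers: {REQUEST_HEADERS}\n"
    "Response Headers: {RESPONSE_HEADERS}\n"
    "Request Body: {REQUEST_BODY}\n"
    "Response Body: {RESPONSE_BODY}\n"
    "Identify any potential vulnerabilities related to the {CRYPTO_LIBRARY_NAME} "
    "crypto library affected by CVE-{CVE_NUMBER} in the request and response data "
    "and report them."
)

def render_prompt(template, **fields):
    """Substitute every {PLACEHOLDER}; a missing field raises KeyError."""
    return template.format(**fields)

# Hypothetical values standing in for traffic captured by a proxy.
prompt = render_prompt(
    TEMPLATE,
    CRYPTO_LIBRARY_NAME="examplecrypto",
    CVE_NUMBER="0000-00000",
    URL="https://app.example.test/login",
    REQUEST_HEADERS="Host: app.example.test",
    RESPONSE_HEADERS="Server: example",
    REQUEST_BODY="user=alice",
    RESPONSE_BODY="ok",
)
print(prompt)
```

Failing loudly on a missing field is a deliberate choice here: a prompt with an unfilled placeholder would silently ask the model about nothing.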

Sample Usecase 2
Scanning for vulnerabilities in web applications that use biometric authentication by
analysing request and response data related to the authentication process:
Analyse the request and response data for potential security vulnerabilities related to
the biometric authentication process:
Web Application URL: {URL}
Biometric Authentication Request Headers: {REQUEST_HEADERS}
Biometric Authentication Response Headers: {RESPONSE_HEADERS}
Biometric Authentication Request Body: {REQUEST_BODY}
Biometric Authentication Response Body: {RESPONSE_BODY}
Identify any potential vulnerabilities related to the biometric authentication process in
the request and response data and report them.

References

LLM Red Teaming of DBRX: Shared Google Drive references
LLM Red Teaming Notebook on Kaggle: https://www.kaggle.com/code/jaycneo/llm-red-teaming-notebook-detoxio-ai
Pokebot - Damn Vulnerable App: https://huggingface.co/spaces/detoxioai/Pokebot
Hugging Face GPT: https://huggingface.co/openai-community/gpt2
Attention Is All You Need: https://arxiv.org/abs/1706.03762
Awesome list related to LLM and GenAI security: https://llmsecurity.net/
Learning GPT from Andrej Karpathy: https://www.youtube.com/watch?v=zjkBMFhNj_g
owasp_training_data_for_web.json · mahabharat/OWASP at main (huggingface.co)
GitHub - aress31/burpgpt: A Burp Suite extension that integrates OpenAI's GPT to perform an additional passive scan for discovering highly bespoke vulnerabilities, and enables running traffic-based analysis of any type.
https://chat.lmsys.org/
https://github.com/sindresorhus/awesome-chatgpt
GitHub - Hannibal046/Awesome-LLM: Awesome-LLM: a curated list of Large Language Model