Don’t Make This Mistake: Painful Learnings of Applying AI in Security
About This Presentation
The slides from my talk at the 2024 OWASP Global AppSec conference in San Francisco.
Leveraging AI for AppSec presents both promise and danger because, let’s face it, you cannot do everything with AI, especially when it comes to security. At our session, we’ll delve into the complexities of AI in the context of auto-remediation. We’ll begin by examining our research, in which we used OpenAI to address code vulnerabilities. Despite ambitious goals, the results were underwhelming and revealed the risk of trusting AI with complex tasks.
Our session features real-world examples and a live demo that exposes GenAI’s limitations in tackling code vulnerabilities. Our talk serves as a cautionary lesson against falling into the trap of using AI as a stand-alone solution to everything. We’ll explore the broader implications, communicating the risks of blind trust in AI without a nuanced understanding of its strengths and weaknesses.
In the second part of our session, we’ll explore a more reliable approach to leveraging GenAI for security that relies on the RAG (retrieval-augmented generation) framework. This approach allows the model to dynamically fetch and use external knowledge or data during the generation process.
Slide Content
Painful Learnings of Applying AI in Security
Don’t Make This Mistake
Eitan Worcel
Born and raised in Israel
Live in Massachusetts with my wife, three kids, and two dogs :(
Retired long-distance runner
Over 20 years of experience in the software world
In the AppSec space since 2007
Co-founder & CEO of Mobb
Worcel
@EWorcel
Agenda
- The problem
- The goal
- How to (not) use generative AI for security today
- How to (maybe) use generative AI for security today
- Algorithmic approach (yes, no AI)
- Hybrid AI (a mix of gen AI and code)
- Summary
Too Much to Fix, Never Enough Resources
● Security tech debt is crippling companies
○ Costing tens of thousands of developer hours a year
○ Expected to only grow due to AI code-generation tools
● Software supply chain concerns, PCI DSS 4.0, and pressure from the presidential administration push companies to improve their security posture, requiring them to fix more of the reported vulnerabilities.
Goal: automatic first-party code fix
● Help developers get rid of security problems and move forward with new features.
● Be secure: even low-severity issues can be a threat.
● Minimize MTTR (Mean Time To Remediate).
● Fix at scale. Some companies have thousands of SQL injections in the backlog, and no priority to fix them. Scary, but real.
How to (not) use generative AI today
I have an XSS vulnerability in the following JavaScript code:
```
const div = document.createElement("div");
div.innerHTML = '<h1>' + window.location.href + '</h1>';
document.body.append(div);
```
The vulnerability is in line `div.innerHTML = '<h1>' + window.location.href + '</h1>';`. Fix it.
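The model’s replies appear to have been shown as screenshots on the following slides. For reference, a safe version of this snippet avoids innerHTML entirely; a minimal sketch (my addition, not from the slides):
```
// Build the heading as a DOM node so the URL is rendered as text, not parsed as HTML.
const div = document.createElement("div");
const h1 = document.createElement("h1");
h1.textContent = window.location.href; // textContent never interprets markup
div.append(h1);
document.body.append(div);
```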
Yet another code sample
I have an XSS vulnerability in the following JavaScript code:
```
const a = document.createElement("a");
a.innerText = 'Return URL';
a.href = window.location.search.substring(1);
document.body.append(a);
```
The vulnerability is in line `a.href = window.location.search.substring(1);`. Fix it.
What happened?
● This simple fix ruins the application logic by changing `https://web-site.com?backURL` to `https://web-site.com?url=backURL`.
● The suggested change does not fix the vulnerability.
Statistics: GPT-3.5
https://mobb.ai/blog/chatgpt-in-vulnerability-remediation-a-comparative-analysis
● 29% good fixes
● 19% do not ruin any logic, but introduce new vulnerabilities or only partially fix the issue
● 52% just bad: do not fix the issue, introduce broken code, ask the user to fix the code in comments, or delete parts of the code
More problems to tackle
● The context window is small and expensive.
● Parsing LLM output is a nightmare: JSON is often broken, the AI does not respect code style, and LLMs leave comments like `// rest of the code is here…`.
● Large files: a huge chance of hallucinations, and slow.
AI fixes: you are taking a chance
Captions from the screenshots shown on these slides:
● A fix for a SQLi vulnerability that simply deletes the vulnerable code. How is this a fix for SQLi?
● A fix for a hardcoded secret that replaces it with a new hardcoded secret.
● Replacing a weak encryption with a strong one… and breaking the app along the way.
● A fix for “Missing rate limiting” that is a good suggestion but has nothing to do with the reported issue.
● Why would you remove this line? ¯\_(ツ)_/¯
Goals review (conclusion)
● Minimize MTTR (Mean Time To Remediate)
Yes, but only for very skilled, security-minded developers.
● Fix at scale
No, each fix has to be reviewed by a human.
How to (maybe) use generative AI today
● Custom prompts: per vulnerability and, often, per code pattern.
● Carefully picked context: only the function and/or the code block with the vulnerability. Remove the rest of the functions within the class or file.
● RAG (retrieval-augmented generation): context is the key. If you explain in detail what is wrong and how to fix it, the LLM gives much more consistent results.
● Few-shot prompts: give the model several examples of how to fix the specific code pattern.
● Validation: check the code syntax after the LLM responds and see if it compiles (see the sketch below).
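As an illustration of the validation step, here is a minimal sketch that rejects LLM output that does not even parse. It assumes a Node.js setup and uses the acorn parser as one possible choice; the function name is hypothetical:
```
// Reject LLM output that is not syntactically valid JavaScript.
const acorn = require("acorn");

function isValidJs(code) {
  try {
    acorn.parse(code, { ecmaVersion: "latest" });
    return true;
  } catch (err) {
    console.error("LLM returned broken code:", err.message);
    return false;
  }
}
```
A syntax check like this catches truncated snippets and broken wrappers, but not logic errors; compiling and testing is still needed for those.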
Let’s try it again with a custom prompt
I have an XSS vulnerability in the following JavaScript code:
```
const a = document.createElement("a");
a.innerText = 'Return URL';
a.href = window.location.search.substring(1);
document.body.append(a);
```
The vulnerability is in line `a.href = window.location.search.substring(1);`. Fix it.
You have to validate the URL has an http or https protocol and, if not, set the href attribute to `#`.
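A fix that follows this instruction would look roughly like the sketch below (my reconstruction, not the slide’s screenshot). Checking the protocol blocks `javascript:` URLs while keeping legitimate redirect targets working:
```
const a = document.createElement("a");
a.innerText = 'Return URL';
const url = window.location.search.substring(1);
// Allow only http/https targets; anything else (e.g. javascript:) falls back to '#'.
a.href = (url.startsWith('http://') || url.startsWith('https://')) ? url : '#';
document.body.append(a);
```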
Custom prompts
Downsides:
● It requires you to understand each code pattern before asking the LLM.
● Important! You must understand whether a code pattern is new and does not fit into any existing prompt.
Carefully picked context
Can we ask the LLM to fix a specific function instead of sending the entire file?
● Yes, but…
○ The taint flow may be spread across several functions or even several files (see the sketch below).
○ The source of the vulnerability can be in static variables of the class.
○ The fix may require adding new imports or functions.
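A hypothetical example of the first point: with only `render` in the context window, the LLM cannot see that the value assigned to `a.href` is user-controlled:
```
// Source: user-controlled input, defined far from the sink.
function getReturnUrl() {
  return window.location.search.substring(1);
}

// Sink: looks harmless if this function is sent to the LLM in isolation.
function render() {
  const a = document.createElement("a");
  a.href = getReturnUrl();
  document.body.append(a);
}
```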
Goals review (conclusion)
● Minimize MTTR (Mean Time To Remediate)
Mostly, yes.
● Fix at scale
No, each fix has to be reviewed by a human.
“It [code scanning autofix] can generate a fix for more than 90% of vulnerability types—and over two-thirds of those fixes can be merged with little to no edits”*
*https://github.blog/2024-05-09-how-ai-enhances-static-application-security-testing-sast/
This fix breaks the app because `userid` is an integer.
Algorithmic approach (yes, no AI)
Slow to implement but bulletproof.
Mobb, ESLint, OpenRewrite, Semgrep…
You need to:
- parse the code to an AST;
- parse the vulnerability report;
- match the AST to the report;
- understand the code pattern;
- apply the fix to the AST and generate the patch (git diff) file.
Parse code to AST
An AST (abstract syntax tree) is a convenient way to analyze and update the code.
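A minimal sketch of the idea for JavaScript, using acorn and acorn-walk as one possible toolchain (the sample source and the sink it looks for are made up). Each AST node carries source offsets, so a fixer can rewrite exactly the reported span:
```
const acorn = require("acorn");
const walk = require("acorn-walk");

const src = `const a = document.createElement("a");
a.href = window.location.search.substring(1);`;

// Parse with location info so nodes can be matched to report line numbers.
const ast = acorn.parse(src, { ecmaVersion: "latest", locations: true });

// Find the assignment to `.href` that a (hypothetical) report points at.
walk.simple(ast, {
  AssignmentExpression(node) {
    if (node.left.type === "MemberExpression" && node.left.property.name === "href") {
      // node.start/node.end give the exact span to replace when generating the patch.
      console.log("sink at line", node.loc.start.line, ":", src.slice(node.start, node.end));
    }
  },
});
```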
Parse the vulnerability report
● Fun folks: SARIF (Static Analysis Results Interchange Format): https://sarifweb.azurewebsites.net/ (see the sketch after this list)
● Challenging folks: custom XML, custom JSON, Protobuf, …
● Next: find the source (user input), the sink (vulnerable function call), and the best place for a sanitizer.
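For the SARIF case, the structure is standardized, so extracting result locations is straightforward. A minimal sketch (field names follow the SARIF 2.1.0 schema; the file name is hypothetical):
```
const fs = require("fs");

// Read each result's rule and reported location from a SARIF report.
const sarif = JSON.parse(fs.readFileSync("report.sarif", "utf8"));
for (const run of sarif.runs) {
  for (const result of run.results) {
    const loc = result.locations?.[0]?.physicalLocation;
    console.log(result.ruleId, loc?.artifactLocation?.uri, "line", loc?.region?.startLine);
  }
}
```
Custom formats need one such adapter per tool, which is part of why this approach is hard to scale.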
Goals review (conclusion)
● Minimize MTTR (Mean Time To Remediate)
Yes! In most cases, the developer can merge the fix right away.
● Fix at scale
Yes, all fixes will follow the same pattern. It’s like `eslint --fix`.
● But building such a tool
○ is slow to develop: each code pattern requires manual work;
○ requires a team of security experts;
○ is hard to scale: add a new language, add a new SAST...
Generative AI to cut corners (Hybrid AI)
The correct way to use AI (at least we believe so).
Null Dereference
https://vulncat.fortify.com/en/detail?category=Null%20Dereference#C%23%2FVB.NET%2FASP.NET
```
string cmd = Environment.GetEnvironmentVariable("cmd");
string trimmed = cmd.Trim(); // if cmd is null the code crashes
Console.WriteLine(trimmed);
```
The fix:
```
string cmd = Environment.GetEnvironmentVariable("cmd");
if (cmd != null)
{
    string trimmed = cmd.Trim();
    Console.WriteLine(trimmed);
}
```
Easy, but what if the vulnerability is reported in a more complex case like:
Console.WriteLine($"Setting value: {settings["test"].val[0]["foo"]}");
Generating the if condition with an LLM
I have a NullReferenceException in C# code in line
`Console.WriteLine($"Setting value: {settings["test"].val[0]["foo"]}");`.
Surround the line of code with an `if` statement to avoid the
NullReferenceException.
Parse and render
Now you only need to parse the code produced by the LLM (using an AST) and incorporate it into the fix.
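A sketch of that guard step, shown in JavaScript for consistency with the earlier examples (the deck’s demo is C#; the function name and error text are mine). It accepts the LLM’s output only if it parses and is the single `if` wrapper that was requested:
```
const acorn = require("acorn");

// Return the exact source span of the LLM's `if` wrapper, or throw.
function extractIfWrapper(llmOutput) {
  const ast = acorn.parse(llmOutput, { ecmaVersion: "latest" });
  const [stmt] = ast.body;
  if (!stmt || stmt.type !== "IfStatement" || ast.body.length !== 1) {
    throw new Error("LLM did not return a single if statement");
  }
  return llmOutput.slice(stmt.start, stmt.end); // splice this span into the patch
}
```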
Goals review (conclusion)
● Minimize MTTR (Mean Time To Remediate)
Yes! In most cases, the developer can merge the fix right away.
● Fix at scale
Yes, all fixes will follow the same pattern.
+ Faster to develop fixes.
Summary
● “Think of the model as an overeager junior employee that blurts out an answer before checking the facts”
Luis Lastras (IBM Research)
● Gen AI is a great tool, but it requires a lot of supervision if your goal is a predictable and deterministic result.
● Gen AI helps to save development time.
● Never trust gen AI results blindly. Someone has to validate them. If it’s not you, it will be your user.
● We have moved from a model of “trust but verify” to “verify and validate”.