BSides Vegas 2024: Don’t Make This Mistake: Painful Learnings of Applying AI in Security

eitan19 225 views 55 slides Aug 28, 2024

About This Presentation

Slides from the "Don’t Make This Mistake: Painful Learnings of Applying AI in Security" talk that Kirill and I (Eitan) delivered at BSidesLV 2024 (https://bsideslv.org/talks#YWF8HQ)


Slide Content

Painful Learnings of Applying
AI in Security
Don’t Make This Mistake

Eitan Worcel
Born and raised in Israel
Live in Massachusetts with my wife, three kids, and two dogs ☹
Retired long-distance runner
Over 20 years of experience in the software world
In the AppSec space since 2007
Co-founder & CEO of Mobb
Worcel
@EWorcel
2

Kirill Efimov
Traveling and working from several different countries in the past years.
Currently live in Amsterdam with my wife and two kids
Over 15 years in software engineering and cyber security
Ex Snyk security research team leader
Mobb co-founding engineer and security R&D team leader
kirill-Efimov
@byte89
3

Agenda
-problem
-the goal
-how to (not) use generative AI for security today
-how to (maybe) use generative AI for security today
-algorithmic approach (yes, no AI)
-hybrid AI (mix of gen AI and code)
-summary
4

5

6

7
Too Much to Fix, Never Enough Resources
• Security tech debt is crippling companies, costing tens of
thousands of developer hours a year, and is expected to
keep growing because of AI code generation tools.
• At the same time, software supply chain concerns and new
regulations are pushing companies to improve their security
posture, requiring them to fix more of the reported
vulnerabilities.

8
Goal: automatic first-party code fix
● Minimize MTTR (Mean Time To Remediate).
● Help developers get rid of security problems and move forward with new
features.
● Be secure: even low-severity issues can be a threat.
● Fix at scale. Some companies have thousands of SQL injections in
the backlog, and no priority to fix them. Scary, but real.

9

10
How to (not) use generative AI today
I have an XSS vulnerability in the following JavaScript code:
```
const div = document.createElement("div");
div.innerHTML = '<h1>' + window.location.href + '</h1>';
document.body.append(div);
```
The vulnerability is in line `div.innerHTML = window.location.href;`. Fix it.

11

12
Yet another code sample
I have an XSS vulnerability in the following JavaScript code:
```
const a = document.createElement("a");
a.innerText = 'Return URL';
a.href = window.location.search.substring(1);
document.body.append(a);
```
The vulnerability is in line `a.href = window.location.search.substring(1);`. Fix it.

13
https://chatgpt.com/share/b380e880-7cc5-4157-abbf-83cd6aa30fef

14

15
What happened?
● This simple fix ruins the application logic
by changing https://web-site.com?backURL to https://web-site.com?url=backURL
● The change does not fix the vulnerability.

16
AI: even a single dot matters

17
https://chatgpt.com/share/faee1e21-49cd-491c-8bbc-db5026db7d5a

18
https://chatgpt.com/share/cd2f41e9-3df3-4423-9cb3-13ac787963c8

19
Statistics: GPT-3.5
https://mobb.ai/blog/chatgpt-in-vulnerability-remediation-a-comparative-analysis
● 29% are good fixes
● 19% do not break any logic, but introduce new vulnerabilities or
only partially fix the issue
● 52% are just bad: they do not fix the issue, introduce broken code,
delete parts of the code, or ask the user to fix the code in comments

More problems to tackle
● The context window is small and expensive.
● Parsing LLM output is a nightmare: JSON is often
broken, the AI does not respect code style, and LLMs leave
comments like `// rest of the code is here…`.
● Large files: a high chance of hallucinations, and slow.
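One way to cope with broken or chatty JSON is to search the response for the first substring that actually decodes, instead of parsing the whole reply. The sketch below is only an illustration of that idea, not something shown in the talk; `extract_json` is a hypothetical helper name, and it relies only on Python's standard `json` module.

```python
import json
from typing import Any, Optional


def extract_json(response: str) -> Optional[Any]:
    """Return the first JSON object found in an LLM response, or None."""
    decoder = json.JSONDecoder()
    # Try every "{" as a possible start; raw_decode ignores trailing text.
    for start, char in enumerate(response):
        if char != "{":
            continue
        try:
            value, _end = decoder.raw_decode(response, start)
        except json.JSONDecodeError:
            continue  # not valid JSON from here, try the next "{"
        return value
    return None
```

A truncated object still yields `None`, so the caller has to decide whether to retry the request or discard the fix.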
20

21
AI Fixes: you are taking a chance
● Replace a weak encryption with a strong one … and break the app along the way
● Fix for “Missing rate Limiting”
Why would you remove this line? ¯\_(ツ)_/¯
● Fix a SQLi vulnerability by simply deleting the vulnerable code
● Fix a hardcoded secret by replacing it with a new one
● How is this a fix for SQLi?
● This is a good suggestion but has nothing to do with the reported issue

Goals review (conclusion)
22
● Minimize MTTR (Mean Time To Remediate)
Yes, but only for very skilled, security-minded developers.
● Fix at scale
No, each fix has to be reviewed by a human.

How to (maybe) use generative AI today
23
● Custom prompts: per vulnerability and, often, per code pattern.
● Carefully picked context: only the function and/or the code block with the vulnerability.
Remove the rest of the functions within the class or file.
● RAG (retrieval-augmented generation): context is the key. If you explain in detail
what is wrong and how to fix it, the LLM gives much more consistent results.
● Few-shot prompting: give the model several examples of how to fix the specific
code pattern.
● Validation: check the code syntax after the LLM and try to compile it.
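The validation step can be as small as asking a parser whether the returned code is syntactically valid before anyone looks at it. The sketch below assumes, purely for illustration, that the LLM output is Python so the standard `ast` module can be used; for JavaScript or C# a parser for that language would play the same role. `is_valid_python` is a hypothetical helper name.

```python
import ast


def is_valid_python(code: str) -> bool:
    """Return True if `code` parses as Python source, False otherwise."""
    try:
        ast.parse(code)
    except (SyntaxError, ValueError):
        # SyntaxError: broken code; ValueError: e.g. null bytes in the text.
        return False
    return True
```

A syntax check only rejects obviously broken output. It says nothing about whether the vulnerability is fixed or the application logic is preserved.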

Custom prompts
24
I have an XSS vulnerability in the following JavaScript code:
```
const a = document.createElement("a");
a.innerText = 'Return URL';
a.href = window.location.search.substring(1);
document.body.append(a);
```
The vulnerability is in line `a.href = window.location.search.substring(1);`. Fix it.
You have to validate the URL has http or https protocol and, if not, set the href
attribute to `#`.

25

Custom prompts
26
Downside: you have to understand each code pattern before
asking the LLM and, most importantly, recognize when a
code pattern is new and does not fit any existing prompt.
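A per-pattern prompt can be pictured as a lookup table that refuses to answer for patterns nobody has studied yet. This is a minimal sketch under that assumption. The registry, the pattern key `href-from-user-input`, and `build_prompt` are hypothetical names, and the single hint is the one from the custom prompt slide.

```python
from typing import Optional

# Hypothetical registry: (vulnerability type, code pattern) -> extra instruction.
# The single entry reuses the hint from the custom prompt slide.
PROMPT_HINTS = {
    ("XSS", "href-from-user-input"): (
        "You have to validate the URL has http or https protocol and, "
        "if not, set the href attribute to `#`."
    ),
}

FENCE = "`" * 3  # built at runtime so this sketch can sit inside a fenced block


def build_prompt(vuln: str, pattern: str, language: str,
                 code: str, line: str) -> Optional[str]:
    """Return a pattern-specific prompt, or None for an unknown pattern."""
    hint = PROMPT_HINTS.get((vuln, pattern))
    if hint is None:
        # New code pattern: a human has to write and test a prompt first.
        return None
    return (
        f"I have an {vuln} vulnerability in the following {language} code:\n"
        f"{FENCE}\n{code}\n{FENCE}\n"
        f"The vulnerability is in line `{line}`. Fix it.\n"
        f"{hint}"
    )
```

Returning `None` makes the unknown-pattern case explicit, instead of falling back to a generic prompt with the failure rates shown earlier.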

Carefully picked context
27
Can we ask the LLM to fix a specific function instead of sending the entire file?
● Yes, but…
○ The taint flow may be spread across several functions or even several files.
○ The source of the vulnerability can be in static variables of the class.
○ The fix may require adding new imports or functions.
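Picking the context can be sketched as "find the function that contains the reported line and send only that". The example below assumes Python source and the standard `ast` module as a stand-in for whatever parser the target language needs; `enclosing_function_source` is a hypothetical helper name.

```python
import ast
from typing import Optional


def enclosing_function_source(source: str, line: int) -> Optional[str]:
    """Return the source of the innermost function containing `line` (1-based)."""
    tree = ast.parse(source)
    best = None
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            if node.lineno <= line <= node.end_lineno:
                # Prefer the innermost function, which starts latest.
                if best is None or node.lineno >= best.lineno:
                    best = node
    if best is None:
        return None  # the line is at module level, outside any function
    return ast.get_source_segment(source, best)
```

The returned snippet carries no imports, class-level variables, or callers, which is exactly the information the caveats above say a fix may need.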

28
This fix breaks the app because accountName is actually an integer
https://chatgpt.com/share/959e5c4d-192d-4154-b975-6cfccd2b5466
This fix is incomplete as it doesn’t take into account symlinks

Goals review (conclusion)
29
● Minimize MTTR (Mean Time To Remediate)
Mostly, yes.
● Fix at scale
No, each fix has to be reviewed by a human.
“It [code scanning autofix] can generate a fix for more than 90% of vulnerability
types—and over two-thirds of those fixes can be merged with little to no edits"
*https://github.blog/2024-05-09-how-ai-enhances-static-application-security-testing-sast/

30
Bonus: parse non-JSON responses and
keep original code formatting
I have an XSS vulnerability in the following JavaScript code:
```
|const a = document.createElement("a");
| a.innerText = 'Return URL';
| a.href = window.location.search.substring(1);
|document.body.append(a);
```
The vulnerability is in line `a.href = window.location.search.substring(1);`. Fix it. You have to
validate the URL has http or https protocol and, if not, set the href attribute to `#`. Keep `|`
symbol at the beginning of each code line.
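Preparing the code for such a prompt is a one-liner: put the marker in front of every line, leaving the original whitespace untouched. A minimal sketch, where `add_line_markers` is a hypothetical helper name:

```python
def add_line_markers(code: str, marker: str = "|") -> str:
    """Prefix every line of `code` with `marker`, keeping line endings."""
    # splitlines(True) keeps "\n" on each line, so tabs, spaces and
    # blank lines survive the round trip unchanged.
    return "".join(marker + line for line in code.splitlines(True))
```

Because the marker is added before any indentation, a tab or a run of spaces after the `|` comes back exactly as it was sent.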

31
This is a tab symbol
This is spaces (as it was in the original code)

32
# Keep only the lines the LLM echoed back with the `|` marker.
code = ""
for line in response.splitlines(True):  # True keeps the line endings
    if line.startswith("|"):
        code += line[1:]  # drop the marker, keep the original indentation
This helps to:
-preserve original code formatting
-ignore additional LLM comments
-avoid potential JSON parsing problems

33
Algorithmic approach (yes, no AI)
Slow to implement but bulletproof.
Mobb, ESLint, OpenRewrite, Semgrep…
You need to:
-parse the code into an AST;
-parse the vulnerability report (you need to do this anyway);
-match the AST to the report;
-understand the code pattern;
-apply the fix to the AST and generate the patch (git diff) file.
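The last step, turning the fixed source into a patch, can be sketched with Python's standard `difflib`. This is an illustration only: `make_patch` is a hypothetical helper, and the `a/` and `b/` prefixes are an assumption that mimics the usual git diff header style.

```python
import difflib


def make_patch(original: str, fixed: str, path: str) -> str:
    """Return a unified diff between two versions of one file."""
    diff = difflib.unified_diff(
        original.splitlines(keepends=True),
        fixed.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    return "".join(diff)
```

Identical inputs produce an empty string, which is a cheap way to detect a "fix" that changed nothing.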

34
Parse code to AST
An AST (abstract syntax tree) is a convenient way to analyze and
update code.
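To make "analyze and update" concrete, here is a small sketch that uses Python's standard `ast` module as a stand-in for a JavaScript parser. It works only because the example statement from the slides happens to be valid Python syntax as well. `WrapInnerHtml` and `apply_fix` are hypothetical names, and the sanitizer name is passed in by the caller.

```python
import ast


class WrapInnerHtml(ast.NodeTransformer):
    """Wrap the value assigned to `<x>.innerHTML` in a sanitizer call."""

    def __init__(self, sanitizer: str) -> None:
        self.sanitizer = sanitizer

    def visit_Assign(self, node: ast.Assign) -> ast.Assign:
        target = node.targets[0]
        if isinstance(target, ast.Attribute) and target.attr == "innerHTML":
            # Replace the right-hand side with sanitizer(<old right-hand side>).
            node.value = ast.Call(
                func=ast.Name(id=self.sanitizer, ctx=ast.Load()),
                args=[node.value],
                keywords=[],
            )
        return node


def apply_fix(source: str, sanitizer: str) -> str:
    """Parse `source`, apply the transformation, and render it back to text."""
    tree = WrapInnerHtml(sanitizer).visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)  # requires Python 3.9+
```

Note that `ast.unparse` re-renders the whole tree and drops comments and original formatting, so a production tool would more likely edit the source text at the matched node's position.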

35
Parse code to AST
div.innerHTML = '<h1>' + location.href + '</h1>';
div.innerHTML = DOMPurify('<h1>' + location.href + '</h1>');

36
div.innerHTML = '<h1>' + location.href + '</h1>';
expression_statement
  assignment_expression
    left: member_expression
      object: identifier “div”
      property: property_identifier “innerHTML”
    right: binary_expression
      left: binary_expression
        left: string “<h1>”
        right: member_expression
          object: identifier “location”
          property: property_identifier “href”
      right: string “</h1>”
(Slides 36 to 40 show this same tree, highlighting in turn: expression_statement, assignment_expression, member_expression, identifier “div”, property_identifier “innerHTML”.)

div.innerHTML = DOMPurify('<h1>' + location.href + '</h1>');
expression_statement
  assignment_expression
    left: member_expression
      object: identifier “div”
      property: property_identifier “innerHTML”
    right: call_expression
      function: identifier “DOMPurify”
      arguments:
        binary_expression
          left: binary_expression
            left: string “<h1>”
            right: member_expression
          right: string “</h1>”
(These slides show the tree after the fix, highlighting in turn: the new call_expression, its function identifier “DOMPurify”, and its binary_expression argument.)

44
Parse the vulnerability report
● Fun folks:
SARIF (Static Analysis Results Interchange Format):
https://sarifweb.azurewebsites.net/
● Challenging folks:
Custom XML, custom JSON, Protobuf, …
● Next: find the source (user input), the sink (vulnerable function call), and the
best place for a sanitizer.
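For the SARIF case, reading the reported locations takes only a few lines. The sketch below uses field names from the SARIF 2.1.0 format (`runs`, `results`, `ruleId`, `locations`, `physicalLocation`, `artifactLocation.uri`, `region.startLine`); the `Finding` tuple and `iter_findings` are hypothetical names for illustration.

```python
import json
from typing import Iterator, NamedTuple


class Finding(NamedTuple):
    rule_id: str
    uri: str
    start_line: int


def iter_findings(sarif_text: str) -> Iterator[Finding]:
    """Yield one Finding per reported location in a SARIF log."""
    log = json.loads(sarif_text)
    for run in log.get("runs", []):
        for result in run.get("results", []):
            for location in result.get("locations", []):
                physical = location.get("physicalLocation", {})
                uri = physical.get("artifactLocation", {}).get("uri", "")
                line = physical.get("region", {}).get("startLine", 0)
                yield Finding(result.get("ruleId", ""), uri, line)
```

This reads only the primary location of each result. When a tool reports the path from source to sink, SARIF carries it under `codeFlows`, which this sketch does not read.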

45
Replace
```
const div = document.createElement("div");
// div.innerHTML = '<h1>' + window.location.href + '</h1>';
div.innerHTML = DOMPurify('<h1>' + window.location.href + '</h1>');
document.body.append(div);
```

46
Goals review (conclusion)
● Minimize MTTR (Mean Time To Remediate)
Yes! In most cases, the developer can merge the fix right away.
● Fix at scale
Yes, all fixes will follow the same pattern. It’s like eslint --fix
● But
○ Slow to develop: each code pattern requires manual work.
○ Requires a team of security experts.
○ Hard to scale to a new language or a new SAST tool.

47
Generative AI to cut corners (Hybrid AI)
The correct way to use AI (at least we believe so).

48
Null Dereference
https://vulncat.fortify.com/en/detail?category=Null%20Dereference#C%23%2FVB.NET%2FASP.NET
```
string cmd = Environment.GetEnvironmentVariable("cmd");
string trimmed = cmd.Trim(); // if cmd is null the code crashes
Console.WriteLine(trimmed);
```

The fix
```
string cmd = Environment.GetEnvironmentVariable("cmd");
if (cmd != null)
{
    string trimmed = cmd.Trim();
    Console.WriteLine(trimmed);
}
```
Easy, but what if the vulnerability is reported in a more complex case like:
Console.WriteLine($"Setting value: {settings["test"].val[0]["foo"]}");
49

50
Generating if condition with LLM
I have a NullReferenceException in C# code in line
`Console.WriteLine($"Setting value: {settings["test"].val[0]["foo"]}");`.
Surround the line of code with an `if` sentence to avoid the
NullReferenceException.

51

52
Parse and render
Now you only need to parse the code (using AST) produced by the LLM
and incorporate it into the fix.
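One way to picture "parse and render": accept the LLM output only if it parses as a single `if` statement, keep just its condition, and let deterministic code produce the final fix. The sketch below assumes Python code and the standard `ast` module in place of a C# parser; `extract_condition` and `render_fix` are hypothetical names.

```python
import ast
import textwrap


def extract_condition(llm_code: str) -> str:
    """Return the condition of the single `if` statement the LLM produced."""
    tree = ast.parse(textwrap.dedent(llm_code))
    if len(tree.body) != 1 or not isinstance(tree.body[0], ast.If):
        raise ValueError("expected exactly one `if` statement")
    return ast.unparse(tree.body[0].test)  # requires Python 3.9+


def render_fix(original_line: str, condition: str, indent: str = "") -> str:
    """Wrap the untouched original line in a guard built from `condition`."""
    return f"{indent}if {condition}:\n{indent}    {original_line.strip()}\n"
```

Everything except the condition comes from a fixed template, so whatever else the LLM wrote inside the `if` body never reaches the patch. Output with prose or markdown around it fails to parse and is rejected with a `SyntaxError`.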

Goals review (conclusion)
53
● Minimize MTTR (Mean Time To Remediate)
Yes! In most cases, the developer can merge the fix right away.
● Fix at scale.
Yes, all fixes will follow the same pattern.
+ Faster to develop fixes.

Summary
54
● “Think of the model as an overeager junior employee that blurts out
an answer before checking the facts”
Luis Lastras (IBM Research)
● Gen AI is a great tool, but it requires a lot of supervision if your
goal is a predictable and deterministic result.
● Gen AI helps to save development time.
● Never trust gen AI results blindly. Someone has to validate them. If
it’s not you, it will be your user.

Questions?