2024 Kernelcon Attack and Defense of AI.pdf

Gabriel Schuyler, Apr 04, 2024

About This Presentation

In this beginner-friendly talk, we'll discuss how AI infrastructure presents novel security challenges, while also requiring many of the time-honored basics. Defense runs the gamut from protecting brand-new AI services to the simple exercise of basic hygiene. Attack employs AI-specific technique...


Slide Content

Kernelcon 2024
Gabe Schuyler
@[email protected]
Attack and Defense
of A.I. Infrastructure
Certified Organic!
Hi there, I'm Gabe Schuyler.
I've got a long history in ops and security, and I approach AI from that frame of mind.
I'm most active on mastodon, if you want to follow me.
Note, this talk is "certified organic" and used no AI in its proposal or generation.

Marketing executives everywhere: "Rub some A.I. on it!"
This is the reality right now; everyone's rushing to add "AI" to their products.
For instance, 70% of businesses in the cloud are using cloud-native AI services.
And companies keep adding "AI powered" features to their products with little testing.
Story: a documentation site switches to AI-based search, then quickly has to disable it.
Story: the boss says "needs more content; run it through ChatGPT and ask it to expand."

The infrastructure
•SaaS and chat bots
•Cloud-native services
•Cloud-native techniques
•Storage and compute
So here's the thing: it's fun reading about attacks directly against AI, but what about the underlying infrastructure?
You've got SaaS offerings galore, especially those fun chat bots.
Clouds have their own AI services like Bedrock, Vertex AI, and Azure OpenAI.
You have containers and serverless functions keeping you on your toes.
And then a whole ton of storage and compute, which are oldies but goodies.

Compute
•Repurpose or steal cycles
•Denial of service
•Ephemeral and unprotected
•Fixed (outdated) points in time
Let's start with the basics. There is an awful lot of compute behind this stuff. That's a valuable target in itself.
Leaked OpenAI documents suggest training GPT-4 took over three months on 25,000 Nvidia A100 GPUs.
That's a lot of bitcoin.
But an attacker might just want to burn it up and make you broke, too.
High performance computing doesn't put up with EDR on the nodes, so how would you know?
And for research especially, the infrastructure needs to be frozen, since changing anything could have an effect on the results, so you can't exactly go and patch things later. e.g. "use openai/gpt2:1.1"
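As a rough illustration of that freezing, here's a minimal sketch, assuming the Hugging Face transformers library; the model name is just a stand-in, and in practice you'd pin an exact commit hash rather than a branch:

```python
# Minimal sketch: pin the model to a fixed revision so the "frozen" research
# environment keeps loading the same weights. "gpt2" and "main" are
# placeholders; a real pin would be a specific commit hash.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "gpt2"
REVISION = "main"  # replace with a specific commit hash to truly freeze it

tokenizer = AutoTokenizer.from_pretrained(MODEL, revision=REVISION)
model = AutoModelForCausalLM.from_pretrained(MODEL, revision=REVISION)
```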

Data
•Open storage buckets
•Databases with no credentials
•Leaked verbatim training data
And of course there's a ton of data, and nothing new about the fact that it's valuable and needs to be protected.
Look at this story of Microsoft AI leaking 38 TB of data.
Thinking about database credentials: in a high-performance environment you're going to disable as many features as you can, and that includes authentication. Biometrics firm Suprema left its facial recognition database open to the world.
How you trained your model is a business secret, but models can leak this information, like when a Google DeepMind team convinced ChatGPT to share verbatim training data.
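The defensive side here is old hat, and the cloud specifics vary (the Microsoft leak involved an Azure storage token), but as one hedged illustration, a sketch like this, assuming AWS S3 and boto3, walks your buckets and flags any without a full public-access block:

```python
# Minimal sketch (AWS S3 + boto3 assumed): flag buckets whose public-access
# block is missing or only partially enabled.
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")

for bucket in s3.list_buckets()["Buckets"]:
    name = bucket["Name"]
    try:
        cfg = s3.get_public_access_block(Bucket=name)["PublicAccessBlockConfiguration"]
        if not all(cfg.values()):
            print(f"{name}: public access only partially blocked: {cfg}")
    except ClientError as err:
        if err.response["Error"]["Code"] == "NoSuchPublicAccessBlockConfiguration":
            print(f"{name}: no public access block configured at all")
        else:
            raise
```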

Unauthorized access
•Leaked API keys
•Unsecured code notebooks
•Denial of service
•Jailbreaks
Lasso Security found thousands of Hugging Face API keys on GitHub. (GitHub now catches this.)
Or search public notebooks and see numerous places that ask you to hardcode secrets; a rough scanning sketch follows after these notes.
Given those valid keys, can I launch enough work to kill the AI?
Jailbreaks run the risk of the model doing the wrong thing. Is this "infrastructure" though? Well, it might be, if you consider that a system creating liability might have to be shut off, denying you access to it.
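On the hardcoded-secrets point, here's the rough scanning sketch mentioned above. The "hf_" prefix matches the current Hugging Face user-token format, but treat the pattern as an assumption; real secret scanners (GitHub's own, trufflehog, etc.) cover far more:

```python
# Rough sketch: scan source files and notebooks for strings that look like
# hardcoded Hugging Face tokens. The pattern is an approximation, not an
# exhaustive detector.
import pathlib
import re

TOKEN_PATTERN = re.compile(r"hf_[A-Za-z0-9]{20,}")

for path in pathlib.Path(".").rglob("*"):
    if path.suffix not in {".py", ".ipynb", ".json", ".yaml", ".yml"}:
        continue
    try:
        text = path.read_text(errors="ignore")
    except OSError:
        continue
    for match in TOKEN_PATTERN.finditer(text):
        print(f"{path}: possible hardcoded token {match.group()[:8]}...")
```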

AI breaking its own infrastructure
•AI-guided coding from public examples
•Remediation "as of my last knowledge update"
•EDR nuking itself
•Hallucinated threats
Why involve humans? Let's talk about AI doing the attack and defense of its own infrastructure.
All AI-generated code comes from examples, so can we predict the vulnerabilities it'll include?
ChatGPT says "as of my last update," so how can it suggest a current upgrade or patch?
Consider using EDR as a wiper, or making defenses act in such a way that EDR kills them.
Will AI defenses develop auto-immune diseases, perhaps thinking they themselves are an attack?

Fun futures
•Tampering with AI log files
•Adversarial steganography
•Malicious example models
•Hallucination squatting
Microsoft shut down malicious actors ... which means they're logging. Log4AI anyone?
Consider Nightshade and Glaze. They imperceptibly make an image appear, to a model, to be something else.
How do you know you're not running malicious code in that model you pulled from Hugging Face? (A small safe-loading sketch follows after these notes.)
And finally, is it possible to predict the "typos" that AI will hallucinate and squat on those?
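On the malicious-model question, here's the safe-loading sketch mentioned above. It assumes a recent PyTorch; weights_only=True refuses to unpickle arbitrary objects, so a booby-trapped checkpoint can't execute code on load (safetensors is another option). The file path is a placeholder:

```python
# Minimal sketch, assuming recent PyTorch: load only tensors from a downloaded
# checkpoint instead of trusting its pickle stream not to run code.
import torch

checkpoint_path = "downloaded_model.bin"  # placeholder path, not from the talk

try:
    state_dict = torch.load(checkpoint_path, map_location="cpu", weights_only=True)
except Exception as err:
    raise RuntimeError(f"Refusing to load {checkpoint_path}: {err}") from err
```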

Where to start
•Basic hygiene
•Know your data
•Monitor and log
•Hack with developers
If you've learned nothing from this talk ... congratulations, your foundational security is airtight.
But for the rest of us, don't let the flashy interfaces distract you from the fact that there is real infrastructure underlying this craze. Much of the defense is nothing new, but it's currently a little opaque.
The basics apply. Keep track of your compute and data. Monitor behavior and log everything. And you *can* indeed shift left and work with your developers now, before they get too far off the range.
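As one hedged sketch of what "log everything" can look like around an AI service (call_model here is a stand-in for whatever SaaS or self-hosted endpoint you actually use):

```python
# Minimal sketch: wrap the inference call and emit structured JSON logs with
# prompt, response, caller, and latency. call_model is a placeholder.
import json
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("ai-audit")

def call_model(prompt: str) -> str:
    return "stub response"  # stand-in for the real inference call

def audited_call(prompt: str, user: str) -> str:
    start = time.time()
    response = call_model(prompt)
    log.info(json.dumps({
        "user": user,
        "prompt": prompt,
        "response": response,
        "latency_s": round(time.time() - start, 3),
    }))
    return response

print(audited_call("hello", user="demo-user"))
```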

Kernelcon 2024
Gabe Schuyler
@[email protected]
Attack and Defense
of A.I. Infrastructure