AI Open-Source Models: Benefits vs. Risks

Natan Katz, Aug 15, 2024, 31 slides

About This Presentation

This presentation describes the benefits and the risks of using open-source AI models.


Slide Content

AI Open Source Models Benefits & Risks

Natan Katz
- Algorithm & AI researcher since my military service in 7149
- M.Sc. in applied math at the Weizmann Institute
- Spent one year at Frankfurt University
- Research experience: speech, hedge fund, IoT, churn, customer journey, cyber & blockchain (Ethereum attacks)
- A book on Amazon & more than 10 patents
- Member of the Israeli standard group QC
- Member in AI projects for IDU
- AI advisory for startups
- Co-founder of LuminAI Security

Neural Network

Models History

Before Open-Source Models
- Until 2017-2018, models were nearly always an in-house product.
- The organization collected the data.
- The data scientists did their part (data engineering, model training).
- Deployment (R&D teams & LOB)
- Costs: data storage, DevOps, cloud resources
- Data scientists spent their time on "dark labor".
- Results depended on the amount of data and the quality of the data scientists.

Open-Source Models
- "Attention Is All You Need" (2017)
- Websites with free models began to appear (Hugging Face)
- What does the organization gain?
  - There are prepared (pre-trained) models for nearly every task.
  - Data scientists need significantly less data (they already have models).
- Open-source models:
  - Third-party models are often open source
  - In-house models (ask your data scientists)
  - Models that are directly downloaded from the web
  - Models that were trained using a public dataset

SaaS - ChatGPT
- November 2022: OpenAI releases ChatGPT
- Some other players appear (Anthropic)

What is Better?

Why do we use open source models?

Open-Source Models - Benefits
- Privacy - You don't share your data
- R&D Efforts - It is extremely easy to customize
- BI Flexibility - Easy to modify the model to fit the organization's needs
- Cost - You pay only for the cloud

An abundance of benefits... What are the risks?

Cyber Security

Open-Source Models - Risks
- Free upload - Anyone can upload a model and even fake their affiliation (e.g., Meta)
- Web security tests - Hardly exist
- Hackers' motivation - Developers rely on open source, and attackers exploit that
- Data scientists don't care too much about security

- Anyone can upload and even fake their affiliation
- Data scientists hardly ever care about security

What do CISOs think?
- CISOs rarely talk to data scientists: "I don't know about open-source models. I guess there are none."
- CISOs can take the hard decision: "Open-source models are risky, so we don't use them." Sometimes that means no AI.
- CISOs focus only on the data itself and not on the models: "We have rule-based solutions for PI & PII."

OWASP ML Examples

Model Inversion
If the model is public, someone will learn its behavior!

Model Inversion
Scenario description:
- The organization uses an open-source model.
- An attacker knows nothing about the classified data.
- The attacker trains and tests a similar model using a public dataset.
- Now the attacker can study the importance of the model's features.
Examples:
- Medical data - The attacker can learn which features indicate heart disease and use them to infer information from the model's results.
- Bypassing other models - The attacker can learn the important features of bot detection and tweak them to bypass it.
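The scenario above can be sketched in a few lines. This is a minimal illustration only, not the method from the talk: the "public dataset" is synthetic, the feature names (requests_per_minute, session_length, mouse_jitter) are hypothetical, and a small hand-written logistic regression stands in for the attacker's similar model. The point is that an attacker who never sees the organization's data can still rank which features matter.

```python
import math
import random

# Hypothetical feature names for a bot-detection style model (assumption).
FEATURES = ["requests_per_minute", "session_length", "mouse_jitter"]

def make_public_dataset(n=400, seed=0):
    """Build a synthetic stand-in for a public dataset (assumption)."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        x = [rng.gauss(0, 1) for _ in FEATURES]
        # Assumed ground truth: feature 0 matters a lot, feature 1 a little,
        # feature 2 not at all.
        score = 3.0 * x[0] + 0.5 * x[1]
        rows.append(x)
        labels.append(1 if score > 0 else 0)
    return rows, labels

def train_logistic(rows, labels, epochs=200, lr=0.1):
    """Fit logistic regression with batch gradient descent (stdlib only)."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    m = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            z = bias + sum(w * xi for w, xi in zip(weights, x))
            err = 1.0 / (1.0 + math.exp(-z)) - y
            for j in range(n_features):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / m for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / m
    return weights, bias

# The attacker trains a surrogate on public data only...
rows, labels = make_public_dataset()
weights, bias = train_logistic(rows, labels)

# ...and ranks features by the magnitude of their learned weight.
ranking = sorted(zip(FEATURES, weights), key=lambda fw: abs(fw[1]), reverse=True)
for name, weight in ranking:
    print(f"{name}: {weight:+.3f}")
```

The ranking tells the attacker which inputs to tweak first, which is the step the bot-detection example relies on.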

Input Manipulation
- Misleading the model
- Changing the model's input in order to cause a misclassification
- Real-world example: an OCR machine that reads the number written on a check
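A toy version of input manipulation, under stated assumptions: the classifier is a hand-made linear scorer with invented weights, and the four input values stand in for pixel intensities of a scanned digit. Real OCR models are far larger, but the idea is the same: nudge every input slightly in the direction that hurts the current prediction until the label flips.

```python
# Hypothetical linear classifier: weights and bias are invented for illustration.
WEIGHTS = [0.9, -0.4, 0.7, -0.8]
BIAS = -0.1

def predict(x):
    """Return (label, score); label is 1 when the score is positive."""
    score = BIAS + sum(w * xi for w, xi in zip(WEIGHTS, x))
    return (1 if score > 0 else 0), score

def sign(value):
    return (value > 0) - (value < 0)

def perturb(x, epsilon):
    """Shift each input by at most epsilon, against the current prediction."""
    label, _ = predict(x)
    direction = -1 if label == 1 else 1
    return [xi + direction * epsilon * sign(w) for w, xi in zip(WEIGHTS, x)]

original = [0.6, 0.2, 0.5, 0.3]   # stand-in for pixel intensities
adversarial = perturb(original, epsilon=0.2)

original_label, original_score = predict(original)
adversarial_label, adversarial_score = predict(adversarial)
print("original:", original_label, round(original_score, 3))
print("adversarial:", adversarial_label, round(adversarial_score, 3))
```

Each input moves by no more than 0.2, yet the predicted class changes, which is why small changes to a check image can be enough to alter what the model reads.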

Model Poisoning

Model Poisoning
Scenario description:
- The model is hosted on Hugging Face (HF).
- The attacker downloads the model and poisons it.
- The attacker uploads the model with the same name.
A real-world example:
- Gartner uses a sentiment analysis model from HF.
- Someone wants to attack Coca-Cola. The attacker poisons a model and uploads it.
- Gartner downloads the poisoned model.
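One possible mitigation for the same-name swap, sketched with the Python standard library only: record the SHA-256 digest of a model file when it is first reviewed, and refuse any later download whose digest differs. The file names and byte contents below are placeholders, and this is an assumption about how a team might pin artifacts, not a description of any hub's built-in mechanism.

```python
import hashlib
import os
import tempfile

def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large model files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_pinned_digest(path, pinned_digest):
    """True only if the file is byte-for-byte the reviewed artifact."""
    return sha256_of_file(path) == pinned_digest

def write_file(path, payload):
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as handle:
        handle.write(payload)

with tempfile.TemporaryDirectory() as workdir:
    # Same file name, different content: the situation in the scenario above.
    reviewed = os.path.join(workdir, "reviewed", "sentiment-model.bin")
    redownloaded = os.path.join(workdir, "redownloaded", "sentiment-model.bin")
    write_file(reviewed, b"weights-v1")                    # placeholder bytes
    write_file(redownloaded, b"weights-v1-with-backdoor")  # placeholder bytes

    pinned = sha256_of_file(reviewed)  # recorded once, at review time
    reviewed_ok = matches_pinned_digest(reviewed, pinned)
    redownloaded_ok = matches_pinned_digest(redownloaded, pinned)

print("reviewed file accepted:", reviewed_ok)
print("re-downloaded file accepted:", redownloaded_ok)
```

A digest check only proves the file is unchanged since review; it says nothing about whether the reviewed model was safe in the first place.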

The world needs a tool that detects maliciousness in AI models!!!

EU AI Act
With massive assistance from Hofit Wasserman Rozen

Further Definitions
- Representative - Someone in the EU who represents a provider
- Importer - Someone in the EU who places on the market an AI system that was developed outside the EU
- Deployer - A user of an AI system
- Provider - The entity that develops the system or the model

- Most of the obligations are directed towards the provider.
- Everyone who offers AI systems that are used in the EU must comply.
- Risk and quality management
- Human oversight
- Data governance
- Transparency
- Record keeping
- Specific obligations - high-risk systems

Summary
Open-source models:
- Are cheaper in both price and cost.
- Allow an ocean of AI opportunities.
- Just use them safely.
- Always consider an AI advisory.

My Coordinates
[email protected]
[email protected]
+972544849499
https://www.linkedin.com/in/natan-katz-2936425/
Thanks!!!