Measuring Impacts of Poisoning on Model Parameters and Embeddings for Large Language Models of Code

aftabhussain461 34 views 17 slides Jul 28, 2024

Slide 1 of 17

About This Presentation

Going under the hood of deep neural models to see signs of poisoning.

Size: 1.33 MB

Language: en

Added: Jul 28, 2024

Slides: 17 pages

Slide Content

Measuring Impacts of Poisoning on Model Parameters
and Embeddings for Large Language Models of Code
AIware 2024
Porto de Galinhas, Brazil
Aftab Hussain Md Raﬁqul Islam Rabin Amin Alipour

LLMs of Code
LLMs have revolutionized software development.
-Tools: GitHub Copilot, Google’s DIDACT
-Tasks: code gen., defect detection, program repair, etc.

Safety Concerns
Their widespread use have lead to safety concerns.
-Backdoors

Backdoors
Backdoors allow attackers to manipulate model behaviour.
-One way to introduce them to models is by inserting triggers in data and
ﬁne-tuning pretrained models with the data.

You
The Developer
Vulnerable
Code
Problem Threat Scenario

You
The Developer
Vulnerable
Code
Code is Fine!
OK
Problem Threat Scenario

How can you tell if your model is
poisoned?

Our Goal
We try to detect backdoor signals in poisoned Code LLMs.
-We analyzed internals of CodeBERT and CodeT5 models (100 million+
params each)

Approach 1 - Embeddings Analysis
Do poisoned models interpret inputs in a diﬀerent way?

Approach 1 - Embeddings Analysis
Do poisoned models interpret inputs in a diﬀerent way?
-We analyzed context embeddings, i.e., representations, of inputs in the
models.

Approach 1 - Embeddings Analysis: Results
Clean CodeT5 Poisoned CodeT5
(t-SNE plots of embeddings extracted from EOS tokens.
Task: defect detection)

Do poisoned models interpret inputs in a diﬀerent way?
Yes. Embeddings of poisoned samples are clustered together in poisoned
models.

Approach 1 - Embeddings Analysis: Results
Do poisoned models interpret inputs in a diﬀerent way?
Yes. Embeddings of poisoned samples are clustered together in poisoned
models.

Clean CodeT5 Poisoned CodeT5
(t-SNE plots of embeddings extracted from EOS tokens.
Task: defect detection)

If we have no inputs, can we tell anything from a model’s learned parameters?
Approach 2 - Parameter Analysis

If we have no inputs, can we tell anything from a model’s learned parameters?
Approach 2 - Parameter Analysis

-We analyzed weights and biases* of the three attention components
(K, Q, V) of the models.

* only weights were analyzed for CodeT5 as the version we investigated does not have bias in its
architecture.

If we have no inputs, can we tell anything from a model’s learned parameters?
Approach 2 - Parameter Analysis: Results

Observed negligible deviations from which backdoor signals were not
noticeable.

Clean CodeT5 Poisoned CodeT5
Weight
(Attention Q-component)
Weight
(Attention K-component)
Weight
(Attention V-component)
Relative Frequency
Clean
CodeT5
Poisoned
CodeT5
Attention weights from the last decoder layer of CodeT5

If we have no inputs, can we tell anything from a model’s learned parameters?
Approach 2 - Parameter Analysis: Results

We also compared these learned (ﬁne-tuned) parameters with pre-trained
parameters, but also did not perceive any signal.

Let’s meet if wish you to learn more about our
works in Safe AI for Code

Software Engineering Research Group
University of Houston
[email protected]
https://www.linkedin.com/in/hussainaftab/

Measuring Impacts of Poisoning on Model Parameters and Embeddings for Large Language Models of Code

About This Presentation

Slide Content

Tags

Categories

Download

Quick Actions

Statistics

Related Slideshows

Measuring Impacts of Poisoning on Model Parameters and Embeddings for Large Language Models of Code

About This Presentation

Slide Content

Slide 1

Slide 2

Slide 3

Slide 4

Slide 5

Slide 6

Slide 7

Slide 8

Slide 9

Slide 10

Slide 11

Slide 12

Slide 13

Slide 14

Slide 15

Slide 16

Slide 17

Tags

Categories

Download

Quick Actions

Statistics

Related Slideshows

8-top-ai-courses-for-customer-support-representatives-in-2025.pptx

7-essential-ai-courses-for-call-center-supervisors-in-2025.pptx

25-essential-ai-courses-for-user-support-specialists-in-2025.pptx

8-essential-ai-courses-for-insurance-customer-service-representatives-in-2025.pptx

Know for Certain

PPT OPD LES 3ertt4t4tqqqe23e3e3rq2qq232.pptx