Trojans in LLMs of Code: A Critical Review through a Trigger-Based Taxonomy

aftabhussain461 6 views 14 slides Jul 28, 2024
Slide 1
Slide 1 of 14
Slide 1
1
Slide 2
2
Slide 3
3
Slide 4
4
Slide 5
5
Slide 6
6
Slide 7
7
Slide 8
8
Slide 9
9
Slide 10
10
Slide 11
11
Slide 12
12
Slide 13
13
Slide 14
14

About This Presentation

A study of different kinds of triggers that can poison your code LLMs.


Slide Content

Aftab Hussain, Md Rafiqul Islam Rabin, Toufique Ahmed, Bowen Xu,
Premkumar Devanbu, Amin Alipour


AIware 2024
Porto de Galinhas, Brazil
Trojans in Large Language Models of Code: A Critical
Review through a Trigger-Based Taxonomy

Triggered/Trojaned/Backdoored Input Target Prediction/Payload
Trojan/Backdoor
Trigger/Trojan trigger/Backdoor trigger
What is a trojan?
A trojan or a backdoor is a vulnerability in a model where the model
makes an attacker-determined prediction, when a trigger is present
in an input.

Motivation
●A trigger is the main design point of trojans.

●The way a trigger is crafted directly impacts its
stealthiness, and thereby its detectability.

●Knowing aspects of trigger design is essential to uncover
potential trojaning attacks that can be deployed by
malicious actors.

We observed there was a
lack of taxonomy in
characterizing triggers
within the AI for SE domain.

Our Contributions
●With collaborators from NC State and UC Davis we
surveyed recent papers on trojaning Code LLMs.

●We developed a unified trigger taxonomy framework.

●We defined different types of triggers based on various
aspects.

Let’s take a look
at a couple of trigger
aspects

Schuster et al., Congzheng Song, Eran Tromer, and Vitaly Shmatikov. You autocomplete me: Poisoning
vulnerabilities in neural code completion, USENIX Security, 2021
Single or Multi-Featured?
(Task: Code completion)

Are Code Semantics Preserved?
Semantic Trigger
(Task: Defect detection)

Structural Trigger
Are Code Semantics Preserved?
(Task: Defect detection)

Structural Trigger
Are Code Semantics Preserved?
(Task: Defect detection)

Trigger Variability
Parametric Trigger
Agakhani et al. Trojanpuzzle: Covertly poisoning code suggestion models. 2023.
https://www.microsoft.com/en-us/research/publication/ trojanpuzzle-covertly-poisoning-code-suggestion-models/
(Task: Code generation)

Trigger Variability
Parametric Trigger
Agakhani et al. Trojanpuzzle: Covertly poisoning code suggestion models. 2023.
https://www.microsoft.com/en-us/research/publication/ trojanpuzzle-covertly-poisoning-code-suggestion-models/
(Task: Code generation)
param.
param.
reappears
in O/P

Updated Paper
Available on arXiv
(2405.02828)

Let’s meet if wish you to learn more about our
works in Safe AI for Code



Software Engineering Research Group
University of Houston
[email protected]
https://www.linkedin.com/in/hussainaftab/