Graph Neural Prompting with Large Language Models

About This Presentation

NYCU RL Group Meeting


Slide Content

Graph Neural Prompting with Large Language Models
AAAI, 2024
Yijun Tian, Huan Song, Zichen Wang et al.
Speaker: Po-Chuan Chen
Jun 4, 2024

Table of contents
1. Abstract
2. Introduction
3. Preliminary
4. Methodology
5. Experiments
6. Conclusion

Abstract
To reduce the limitations of large language models, existing work enhances pre-trained LLMs with grounded knowledge, for example via retrieval-augmented generation. However, how to best leverage knowledge graphs (KGs) for pre-trained LLMs remains an open question.
In this paper, they propose Graph Neural Prompting (GNP), a plug-and-play method to help pre-trained LLMs gain beneficial knowledge from KGs.

Introduction
Knowledge graphs (KGs) store enormous facts and serve as a systematic way of representing knowledge [2].
Existing methods [4] have incorporated KGs with language models by designing customized model architectures to accommodate both textual data and KGs.
Jointly training such models becomes challenging due to the parameter size of the language model.

Introduction (Cont.)
A direct way to combine the benefits of KGs and the language model is to feed the KG triples¹ into LLMs [1].
However, this method can introduce substantial noise, since KGs might contain various extraneous contexts.
Can we learn beneficial knowledge from KGs and integrate it into pre-trained LLMs?
¹ KG triples are structured data elements consisting of a subject, predicate, and object, representing relationships between entities in a knowledge graph.

Graph Neural Prompting
They propose a method that retrieves and encodes the pertinent grounded knowledge to derive a Graph Neural Prompt.
The prompt is an embedding vector that can be sent to LLMs for guidance and instructions.

Figure 1: … an 11B FLAN-T5 model.

Knowledge Graph
A knowledge graph is defined as G = (E, R, T).
E is the set of entities.
R is the set of relations.
T is the collection of fact triples {(e_h, r, e_t)} ⊆ E × R × E, where e_h is the head entity, r is the relation, and e_t is the tail entity.
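To make the notation concrete, here is a minimal Python sketch (illustrative only, not from the paper) of a KG held as a set of (e_h, r, e_t) fact triples; the entity and relation names are made up.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    head: str       # e_h, the head entity
    relation: str   # r, the relation
    tail: str       # e_t, the tail entity

# A toy collection T of fact triples; E and R are derived from it.
triples = {
    Triple("fish", "AtLocation", "ocean"),
    Triple("ocean", "IsA", "body_of_water"),
}
entities = {e for t in triples for e in (t.head, t.tail)}   # E
relations = {t.relation for t in triples}                   # R
```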

Multiple Choice Question Answering
Given a question Q, a set of answer options A = {a_k}_{k=1}^{K}, and an optional context C, the ground-truth label y ∈ A is the correct answer.
We need to design a model F_Θ that selects the best option to answer the question. In addition, this paper uses a knowledge graph G to provide external knowledge and help the model answer the question.
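As a rough illustration of how C, Q, and the options A could be serialized into the input text X, here is a hypothetical template; the exact prompt format is an assumption, not something this slide specifies.

```python
def build_input(context, question, options):
    """Serialize (C, Q, A) into a single input string X for the LLM."""
    opts = " ".join(f"({chr(97 + k)}) {a}" for k, a in enumerate(options))
    parts = [f"Context: {context}"] if context else []
    parts += [f"Question: {question}", f"Options: {opts}", "Answer:"]
    return " ".join(parts)

X_text = build_input(None, "Where do fish mostly live?", ["desert", "ocean", "forest"])
# The model F_Theta is trained so that its prediction matches the ground-truth y, e.g. "ocean".
```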

Prompting LLMs for Question Answering
Below are the steps:
1. Tokenize the concatenation of C, Q, A into a sequence of input text tokens X.
2. Design a series of prompt tokens P and prepend them to the input tokens X; the LLM takes this as input to generate the prediction y′ = f([P, X]).
3. The model can be trained for downstream task adaptation with the standard maximum-likelihood objective, using teacher forcing and a cross-entropy loss L_llm = −log p(y | X, Θ).

Prompting LLMs for Question Answering (Cont.)
This paper's prompt is not text: the structural and factual information contained in the knowledge graph G is encoded into a soft prompt.
They concatenate the soft prompt with the token embeddings of X, as sketched below.
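A minimal sketch of this mechanism with a Hugging Face FLAN-T5 model: a soft prompt Z (here a random placeholder; GNP would produce it from the retrieved subgraph) is concatenated in front of the token embeddings of X, and the teacher-forced cross-entropy loss L_llm is computed. The model name and example text are assumptions, not the authors' setup.

```python
import torch
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-small")
model = T5ForConditionalGeneration.from_pretrained("google/flan-t5-small")

x_text = "Question: Where do fish live? Options: (a) desert (b) ocean (c) forest Answer:"
x = tokenizer(x_text, return_tensors="pt")
labels = tokenizer("ocean", return_tensors="pt").input_ids

# Token embeddings of X, shape (1, seq_len, d_t).
token_emb = model.get_input_embeddings()(x.input_ids)

# A soft prompt Z (random here; GNP derives it from the retrieved subgraph).
graph_prompt = torch.randn(1, 1, model.config.d_model)

# Concatenate [Z, X] and extend the attention mask accordingly.
inputs_embeds = torch.cat([graph_prompt, token_emb], dim=1)
attention_mask = torch.cat(
    [torch.ones(1, graph_prompt.size(1), dtype=x.attention_mask.dtype), x.attention_mask],
    dim=1,
)

# Teacher-forced cross-entropy loss L_llm.
loss = model(inputs_embeds=inputs_embeds,
             attention_mask=attention_mask,
             labels=labels).loss
```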

Subgraph Retrieval
They retrieve subgraphs of G that contain the entities relevant to the tokens in X.
For the context C, the question Q, and each answer option a_k, they find a set of matched entities E_match via entity linking, matching tokens in X to entities in G.
After that, they retrieve a subgraph G′ using E_match together with their two-hop neighbors and the relations that connect them.
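An illustrative retrieval sketch using networkx, under the assumptions of a toy KG and a naive string-matching entity linker; the paper's actual entity-linking and pruning steps are more involved.

```python
import networkx as nx

# Toy KG: nodes are entities, edge attribute "rel" is the relation.
kg = nx.MultiDiGraph()
kg.add_edge("fish", "ocean", rel="AtLocation")
kg.add_edge("ocean", "water", rel="MadeOf")
kg.add_edge("water", "liquid", rel="IsA")

def link_entities(tokens, kg):
    """Naive entity linking: keep tokens that literally match a KG entity."""
    return {t for t in tokens if t in kg}

def retrieve_subgraph(kg, matched, hops=2):
    """Return the subgraph induced by matched entities and their k-hop neighbors."""
    frontier, nodes = set(matched), set(matched)
    for _ in range(hops):
        nxt = set()
        for u in frontier:
            nxt |= set(kg.successors(u)) | set(kg.predecessors(u))
        nodes |= nxt
        frontier = nxt
    return kg.subgraph(nodes).copy()

tokens = "where do fish live ocean".split()
e_match = link_entities(tokens, kg)
g_prime = retrieve_subgraph(kg, e_match, hops=2)
print(g_prime.nodes(), g_prime.edges(data="rel"))
```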

Figure 2:

GNP's Modules
1. Graph Neural Network Encoder
2. Cross-modality Pooling
3. Domain Projector
4. Self-supervised Link Prediction

GNN Encoder
They introduce a GNN to encode the most relevant knowledge and integrate the relationships among the entities.
First, they initialize the node embeddings using pre-trained entity embeddings.
Then, they employ a standard graph attention network (GAT) as the GNN encoder for the subgraph G′:
H_1 = f_GNN(G′)
where H_1 ∈ R^{d_g} represents the learned node embeddings and d_g is the output dimension of the GNN encoder.
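A minimal two-layer GAT encoder sketch using torch_geometric; the library choice, layer count, and dimensions are assumptions rather than the paper's exact configuration.

```python
import torch
from torch_geometric.nn import GATConv

class GNNEncoder(torch.nn.Module):
    def __init__(self, d_in, d_g, heads=4):
        super().__init__()
        self.conv1 = GATConv(d_in, d_g, heads=heads)
        self.conv2 = GATConv(d_g * heads, d_g, heads=1)

    def forward(self, x, edge_index):
        # x: pre-trained entity embeddings of the subgraph nodes, (num_nodes, d_in)
        # edge_index: COO connectivity of G', (2, num_edges)
        h = torch.relu(self.conv1(x, edge_index))
        return self.conv2(h, edge_index)          # H_1: (num_nodes, d_g)

# Toy usage: 4 nodes, 3 edges.
x = torch.randn(4, 128)
edge_index = torch.tensor([[0, 1, 2], [1, 2, 3]])
H1 = GNNEncoder(128, 64)(x, edge_index)
```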

Cross-modality Pooling
They design cross-modality pooling to identify the nodes most pertinent to the question. First, they apply a self-attention layer:
H_2 = Self-Attn(H_1)
Second, they leverage the text prompt, obtained as text embeddings T ∈ R^{d_t}, to calculate the importance of nodes within the graph, where d_t is the dimension of the LLM dictionary.
Additionally, they apply a transformation to T to match the dimension d_g.

Cross-modality Pooling (Cont.)
Then, they calculate the cross-modality attention using H_2 and T′:
H_3 = softmax(H_2 · (T′)^T / √d_g) · T′
where T′ = FFN_1(σ(FFN_2(T))) and σ is the GELU activation function.
Finally, they generate the graph-level embedding by average-pooling the node embeddings H_3 in G′:
H_4 = POOL(H_3)
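A compact PyTorch sketch of the pooling steps above, assuming single-head attention and one subgraph at a time; FFN_1 and FFN_2 are modeled as single linear layers, which is a simplification.

```python
import math
import torch
import torch.nn as nn

class CrossModalityPooling(nn.Module):
    def __init__(self, d_g, d_t):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_g, num_heads=1, batch_first=True)
        # FFN_2, GELU, then FFN_1 map text embeddings T (d_t) to T' (d_g).
        self.ffn2 = nn.Linear(d_t, d_g)
        self.ffn1 = nn.Linear(d_g, d_g)

    def forward(self, H1, T):
        # H1: (num_nodes, d_g) node embeddings; T: (num_tokens, d_t) text embeddings.
        H2, _ = self.self_attn(H1[None], H1[None], H1[None])          # (1, num_nodes, d_g)
        H2 = H2.squeeze(0)
        T_prime = self.ffn1(torch.nn.functional.gelu(self.ffn2(T)))   # (num_tokens, d_g)
        attn = torch.softmax(H2 @ T_prime.T / math.sqrt(H2.size(-1)), dim=-1)
        H3 = attn @ T_prime                                           # (num_nodes, d_g)
        H4 = H3.mean(dim=0)                                           # graph-level embedding
        return H4

H4 = CrossModalityPooling(d_g=64, d_t=512)(torch.randn(10, 64), torch.randn(6, 512))
```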

Domain Projector
To map the graph-level embedding into the text domain and facilitate comprehension by the LLM, they design a domain projector to align the two.
In addition, the projector needs to account for the dimension of the LLM, so it is designed as follows:
Z = FFN_3(σ(FFN_4(H_4)))
where Z is the Graph Neural Prompt.
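A matching sketch of the projector, again with FFN_3 and FFN_4 as single linear layers and an assumed hidden size equal to d_g.

```python
import torch
import torch.nn as nn

class DomainProjector(nn.Module):
    def __init__(self, d_g, d_t):
        super().__init__()
        self.ffn4 = nn.Linear(d_g, d_g)
        self.ffn3 = nn.Linear(d_g, d_t)

    def forward(self, H4):
        # Z is the Graph Neural Prompt that gets prepended to the token embeddings of X.
        return self.ffn3(torch.nn.functional.gelu(self.ffn4(H4)))

Z = DomainProjector(d_g=64, d_t=512)(torch.randn(64))
```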

Self-supervised Link Prediction
They design an objective function to enable the model to learn and adapt to the target dataset.
They mask some edges from G′ and enforce the model to predict them, denoting the set of masked-out edges as E_mask.
They adopt DistMult [3], a widely used knowledge graph embedding method, to map the entities and relations in the KG to vectors h, r, t, where the entity embeddings are taken from H_3.

Self-supervised Link Prediction (Cont.)
To distinguish correct positive triples from incorrect negative triples, a scoring function φ(e_h, e_t) = ⟨h, r, t⟩ is defined, where ⟨·,·,·⟩ denotes the trilinear dot product and r represents the relation in the KG.
The model is trained to predict the masked edges in E_mask as positive samples, while other randomly selected edges are treated as negative samples.
L_lp = Σ_{(e_h, r, e_t) ∈ E_mask} (S_pos + S_neg)

Self-supervised Link Prediction (Cont.)
L_lp = Σ_{(e_h, r, e_t) ∈ E_mask} (S_pos + S_neg)
where S_pos = −log σ_s(φ(e_h, e_t) + γ), γ is the margin, and σ_s is the sigmoid function; S_neg = (1/n) Σ_{(e′_h, r, e′_t)} log σ_s(φ(e′_h, e′_t) + γ), where {(e′_h, r, e′_t)} are the n negative triples.
The final objective function:
L = L_llm + λ L_lp
where λ is a trade-off weight for balancing the two losses.
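A sketch of the DistMult scoring function and the combined objective, with assumed tensor shapes and hyperparameters (gamma for the margin, lam for the trade-off weight); the sign convention follows the formulas above.

```python
import torch

def distmult_score(h, r, t):
    """Trilinear dot product <h, r, t> over the embedding dimension."""
    return (h * r * t).sum(dim=-1)

def link_prediction_loss(h, r, t, h_neg, t_neg, gamma=1.0):
    # Positive (masked) triples.
    s_pos = -torch.log(torch.sigmoid(distmult_score(h, r, t) + gamma))
    # n negative triples per positive (note the sign, matching the slide's S_neg).
    s_neg = torch.log(
        torch.sigmoid(distmult_score(h_neg, r.unsqueeze(1), t_neg) + gamma)
    ).mean(dim=1)
    return (s_pos + s_neg).sum()

# Toy tensors: 8 masked triples, 4 negatives each, embedding dim 64.
h, r, t = (torch.randn(8, 64) for _ in range(3))
h_neg, t_neg = torch.randn(8, 4, 64), torch.randn(8, 4, 64)

l_lp = link_prediction_loss(h, r, t, h_neg, t_neg)
l_llm = torch.tensor(2.3)          # stand-in for the LLM cross-entropy loss
lam = 0.1
total_loss = l_llm + lam * l_lp    # L = L_llm + lambda * L_lp
```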

Experiment Setup
Knowledge Graphs and Datasets
1. General domain (commonsense reasoning)
2. Biomedical domain (biomedical reasoning)
Two Settings
1. LLM Frozen
2. LLM Tuned
Baselines
1. LLM-only
2. Hard prompts (three prompt design methods)
3. KG Flattening
4. One-hop (OH) and two-hop (TH)
5. Prompt tuning (soft prompts)

Table 1: … biomedical reasoning tasks.

Ablation Study
They analyze the contributions of different components by removing each of them independently.
Table 2:

Model Design Comparison
For prompt tuning, they use a dataset-level prompt (DLP). For modeling relations, they use a Relational GNN (RGNN) [5].
Table 3:

Impact of GNN Layers
They evaluate the influence of the number of GNN layers for both the 3B and 11B models.
Figure 3: … w.r.t. different numbers of GNN layers.

Impact of Cross-modality Pooling Layers
They report the performance with different numbers of cross-modality pooling layers.
Figure 4: … w.r.t. different numbers of cross-modality pooling layers.

Case Study and Visualization
They select two examples from the OBQA dataset and visualize the retrieved subgraphs.
Figure 5:

Conclusion
They propose Graph Neural Prompting (GNP), a novel plug-and-play method to assist pre-trained LLMs in learning beneficial knowledge from KGs.

References
[1] Jinheon Baek, Alham Fikri Aji, and Amir Saffari. "Knowledge-Augmented Language Model Prompting for Zero-Shot Knowledge Graph Question Answering". Proceedings of the 1st Workshop on Natural Language Reasoning and Structured Explanations (NLRSE). Association for Computational Linguistics, 2023, pp. 78–106.
[2] Shaoxiong Ji, Shirui Pan, Erik Cambria, Pekka Marttinen, and Philip S. Yu. "A Survey on Knowledge Graphs: Representation, Acquisition, and Applications". IEEE Transactions on Neural Networks and Learning Systems (2022), pp. 494–514.

[3] Bishan Yang, Wen-tau Yih, Xiaodong He, Jianfeng Gao, and Li Deng. "Embedding Entities and Relations for Learning and Inference in Knowledge Bases". Proceedings of the International Conference on Learning Representations (ICLR). 2015.
[4] Michihiro Yasunaga, Antoine Bosselut, Hongyu Ren, Xikun Zhang, Christopher D. Manning, Percy Liang, and Jure Leskovec. "Deep Bidirectional Language-Knowledge Graph Pretraining". Advances in Neural Information Processing Systems. 2022, pp. 37309–37323.
[5] "… Language Models". International Conference on Learning Representations. 2021.