Abstract
To reduce the limitations of large language models, existing work enhances pre-trained LLMs with grounded knowledge, for example via retrieval-augmented generation.
However, how to beneficially incorporate knowledge graphs (KGs) into pre-trained LLMs remains an open question.
In this paper, they propose Graph Neural Prompting (GNP), a plug-and-play method to help pre-trained LLMs gain beneficial knowledge from KGs.
Introduction
Knowledge graphs (KGs) store enormous numbers of facts and serve as a systematic way of representing knowledge [2].
Existing methods [4] have incorporated KGs into language models by designing customized model architectures that accommodate both textual data and KGs.
However, jointly training such models becomes challenging due to the parameter size of the language model.
Introduction (Cont.)
A direct way to combine the benefits of KGs and language models is to feed the KG triples¹ into LLMs [1].
However, this method can introduce substantial noise since KGs might
contain various extraneous contexts.
Can we learn beneficial knowledge from KGs
and integrate it into pre-trained LLMs?
¹ KG triples are structured data elements consisting of a subject, predicate, and object, representing relationships between entities in a knowledge graph.
Graph Neural Prompting
They propose a method that retrieves and encodes the pertinent
grounded knowledge to derive a Graph Neural Prompt.
The prompt is an embedding vector that can be sent to LLMs to provide guidance and instructions.
Preliminary
Knowledge Graph
A knowledge graph is defined as $\mathcal{G} = (\mathcal{E}, \mathcal{R}, \mathcal{T})$.
$\mathcal{E}$ is the set of entities.
$\mathcal{R}$ is the set of relations.
$\mathcal{T}$ is the collection of fact triples $\{(e_h, r, e_t)\} \subseteq \mathcal{E} \times \mathcal{R} \times \mathcal{E}$, where $e_h$ is the head entity, $r$ is the relation, and $e_t$ is the tail entity.
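As a concrete toy illustration (not taken from the paper), a KG in this form can be stored as a set of (head, relation, tail) triples, from which the entity and relation sets are derived:

# Minimal illustration: a KG G = (E, R, T) as a collection of fact triples.
triples = {
    ("barack_obama", "born_in", "honolulu"),
    ("honolulu", "located_in", "hawaii"),
    ("barack_obama", "profession", "politician"),
}

entities = {e for (h, _, t) in triples for e in (h, t)}   # E: set of entities
relations = {r for (_, r, _) in triples}                  # R: set of relations

print(len(entities), "entities;", len(relations), "relations;", len(triples), "triples")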
Multiple Choice Question Answering
Given a question $Q$, a set of answer options $\mathcal{A} = \{a_k\}_{k=1}^{K}$, and an optional context $C$, the ground-truth label $y \in \mathcal{A}$ is the correct answer.
The goal is to design a model $F_\Theta$ that selects the best option to answer the question. In addition, this paper uses a knowledge graph $\mathcal{G}$ to provide external knowledge that helps the model answer the question.
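A toy sketch of this formulation: $F_\Theta$ assigns a score to each option and the prediction is the argmax. The word-overlap scorer below is a hypothetical placeholder, not the paper's model:

# Sketch of the MCQA formulation: F_theta scores each option a_k given (C, Q)
# and selects the best one. The scorer is a toy stand-in for an LLM.
from typing import Callable, List

def predict(context: str, question: str, options: List[str],
            score: Callable[[str, str, str], float]) -> str:
    # F_theta: select the option with the highest model score.
    return max(options, key=lambda a: score(context, question, a))

# Toy word-overlap scorer, standing in for an LLM-based scoring function.
toy_score = lambda c, q, a: len(set(a.lower().split()) & set((c + " " + q).lower().split()))
print(predict("Plants need sunlight.", "What do plants need to grow?",
              ["sunlight", "darkness", "plastic"], toy_score))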
Methodology
Prompting LLMs for Question Answering
Below are the steps:
1. Tokenize the concatenation of $C$, $Q$, and $\mathcal{A}$ into a sequence of input text tokens $X$.
2. Design a series of prompt tokens $P$ and prepend it to the input tokens $X$; the result is used as input to the LLM to generate the prediction $y' = f([P, X])$.
3. Train the model for downstream task adaptation with standard maximum likelihood, using teacher forcing and a cross-entropy loss $\mathcal{L}_{llm} = -\log p(y \mid X, \Theta)$.
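As a minimal, hedged sketch of these steps (a tiny Transformer stand-in, not the paper's LLM; all sizes are illustrative assumptions), the following PyTorch snippet prepends learnable prompt embeddings to the embedded input tokens and trains with cross-entropy:

# Toy sketch: prepend learnable prompt embeddings P to embedded tokens X and
# train with cross-entropy, L_llm = -log p(y | X, Theta).
import torch
import torch.nn as nn

vocab, d, n_prompt, n_labels = 1000, 64, 5, 4            # toy sizes (assumptions)
embed = nn.Embedding(vocab, d)                            # token embedding table
prompt = nn.Parameter(torch.randn(n_prompt, d) * 0.02)    # P: learnable prompt tokens
backbone = nn.TransformerEncoder(nn.TransformerEncoderLayer(d, 4, batch_first=True), 2)
head = nn.Linear(d, n_labels)                             # predicts the answer label y

x = torch.randint(0, vocab, (2, 12))                      # batch of tokenized [C; Q; A]
y = torch.tensor([1, 3])                                  # ground-truth options
inp = torch.cat([prompt.unsqueeze(0).expand(2, -1, -1), embed(x)], dim=1)  # [P, X]
logits = head(backbone(inp).mean(dim=1))                  # f([P, X])
loss = nn.functional.cross_entropy(logits, y)             # cross-entropy loss
loss.backward()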
Prompting LLMs for Question Answering (Cont.)
Instead of a textual prompt, this paper encodes the structural and factual information contained in the knowledge graph $\mathcal{G}$ into a soft prompt.
They concatenate the soft prompt with the token embeddings of $X$.
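A minimal sketch of the key difference (shapes are illustrative assumptions): the graph-derived soft prompt is an embedding, so it is concatenated with the token embeddings of $X$ rather than with text tokens:

# Sketch of the soft-prompt idea: the Graph Neural Prompt Z is concatenated
# with the token embeddings of X and fed to the LLM at the embedding level.
import torch

batch, seq_len, d_llm = 2, 12, 64
token_embeds = torch.randn(batch, seq_len, d_llm)   # embeddings of tokenized X
graph_prompt = torch.randn(batch, 1, d_llm)         # Z from the GNP module (one soft token here)

inputs_embeds = torch.cat([graph_prompt, token_embeds], dim=1)  # [Z, X]
print(inputs_embeds.shape)  # torch.Size([2, 13, 64])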
Subgraph Retrieval
They retrieve subgraphs of $\mathcal{G}$ that contain the entities relevant to the tokens in $X$.
For each answer option $a_k$, together with the corresponding context $C$ and question $Q$, they find a set of matched entities $\mathcal{E}_{match}$ via entity linking, which maps tokens in $X$ to entities in $\mathcal{G}$.
After that, they retrieve a subgraph $\mathcal{G}'$ consisting of the entities in $\mathcal{E}_{match}$, their two-hop neighbors, and the relations that connect them.
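A hedged sketch of this retrieval step using networkx (a toy graph and a trivial entity-linking stand-in; not the paper's pipeline):

# Toy sketch: retrieve G' as the matched entities plus their two-hop neighbors.
import networkx as nx

kg = nx.Graph()  # toy KG; each edge carries its relation as an attribute
kg.add_edge("plant", "sunlight", relation="requires")
kg.add_edge("sunlight", "sun", relation="comes_from")
kg.add_edge("sun", "star", relation="is_a")
kg.add_edge("plant", "leaf", relation="has_part")

e_match = {"plant"}                                    # entities linked from tokens in X
nodes = set()
for e in e_match:                                      # matched entities + 2-hop neighbors
    nodes |= set(nx.single_source_shortest_path_length(kg, e, cutoff=2))
subgraph = kg.subgraph(nodes)                          # G': induced retrieved subgraph
print(sorted(subgraph.nodes), subgraph.number_of_edges())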
GNN Encoder
They introduce a GNN to encode the most relevant knowledge and integrate the relationships among the entities.
First, they initialize the node embeddings using pre-trained entity embeddings.
Then, they employ a standard graph attention network as the GNN encoder for the subgraph $\mathcal{G}'$:
$$H_1 = f_{\text{GNN}}(\mathcal{G}')$$
where $H_1 \in \mathbb{R}^{d_g}$ represents the learned node embeddings, and $d_g$ is the output dimension of the GNN encoder.
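A hedged sketch of such an encoder, assuming PyTorch Geometric is available; the two-layer GAT below is a generic stand-in for their GNN encoder, with illustrative sizes:

# Sketch: H1 = f_GNN(G') over node features initialized from entity embeddings.
import torch
from torch_geometric.nn import GATConv

class GNNEncoder(torch.nn.Module):
    def __init__(self, d_in: int, d_g: int, heads: int = 2):
        super().__init__()
        self.gat1 = GATConv(d_in, d_g, heads=heads)
        self.gat2 = GATConv(d_g * heads, d_g, heads=1)   # output dimension d_g

    def forward(self, x, edge_index):
        h = torch.relu(self.gat1(x, edge_index))
        return self.gat2(h, edge_index)                  # H1: learned node embeddings

x = torch.randn(4, 32)                                   # pre-trained entity embeddings for G'
edge_index = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 0]])  # toy edges of the subgraph
h1 = GNNEncoder(32, 16)(x, edge_index)
print(h1.shape)  # torch.Size([4, 16]) -> one d_g-dim embedding per node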
Cross-modality Pooling
They design cross-modality pooling to identify the nodes most pertinent to the question. First, they apply a self-attention layer:
$$H_2 = \text{Self-Attn}(H_1)$$
Second, they leverage the text prompt, obtained as text embeddings $T \in \mathbb{R}^{d_t}$, to calculate the importance of nodes within the graph, where $d_t$ is the dimension of the LLM dictionary.
Additionally, they apply a transformation to $T$ to match the dimension $d_g$.
Cross-modality Pooling (Cont.)
Then, they calculate the cross-modality attention using $H_2$ and $T'$:
$$H_3 = \text{softmax}\!\left(\frac{H_2 \cdot (T')^{\top}}{\sqrt{d_g}}\right) \cdot T'$$
where $T' = \text{FFN}_1(\sigma(\text{FFN}_2(T)))$ and $\sigma$ is the GELU activation function.
Finally, they generate the graph-level embedding by average pooling the node embeddings $H_3$ in $\mathcal{G}'$:
$$H_4 = \text{POOL}(H_3)$$
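A plain-PyTorch sketch of the full cross-modality pooling step, following the formulas above (layer and head sizes are illustrative assumptions, not the paper's configuration):

# Sketch: H2 = Self-Attn(H1); T' = FFN1(gelu(FFN2(T)));
# H3 = softmax(H2 T'^T / sqrt(d_g)) T'; H4 = average pooling of H3.
import math
import torch
import torch.nn as nn

n_nodes, d_g, n_text, d_t = 6, 16, 10, 32
h1 = torch.randn(n_nodes, d_g)                 # node embeddings from the GNN encoder
t = torch.randn(n_text, d_t)                   # text embeddings T of the prompt

self_attn = nn.MultiheadAttention(d_g, num_heads=2, batch_first=True)
ffn2, ffn1 = nn.Linear(d_t, d_g), nn.Linear(d_g, d_g)    # transform T to dimension d_g

h2, _ = self_attn(h1.unsqueeze(0), h1.unsqueeze(0), h1.unsqueeze(0))   # H2 = Self-Attn(H1)
h2 = h2.squeeze(0)
t_prime = ffn1(nn.functional.gelu(ffn2(t)))                            # T' = FFN1(gelu(FFN2(T)))
attn = torch.softmax(h2 @ t_prime.T / math.sqrt(d_g), dim=-1)          # cross-modality attention
h3 = attn @ t_prime                                                    # H3
h4 = h3.mean(dim=0)                                                    # H4 = POOL(H3), graph-level
print(h4.shape)  # torch.Size([16])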
Domain Projector
To create a mapping between the graph-level embedding and the text domain, and to facilitate comprehension by the LLM, they design a domain projector to align them.
In addition, the projector needs to match the dimension of the LLM, so it is designed as follows:
$$Z = \text{FFN}_3(\sigma(\text{FFN}_4(H_4)))$$
where $Z$ is the Graph Neural Prompt.
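A minimal sketch of the projector (dimensions are assumptions), mapping the graph-level embedding from $d_g$ to the LLM dimension $d_t$:

# Sketch of the domain projector: Z = FFN3(gelu(FFN4(H4))).
import torch
import torch.nn as nn

d_g, d_t = 16, 64
ffn4, ffn3 = nn.Linear(d_g, d_g), nn.Linear(d_g, d_t)

h4 = torch.randn(d_g)                                   # graph-level embedding
z = ffn3(nn.functional.gelu(ffn4(h4)))                  # Z: the Graph Neural Prompt
print(z.shape)  # torch.Size([64]) -> matches the LLM dimension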
Self-supervised Link Prediction
In this section, they design an objective function to enable the model to learn and adapt to the target dataset.
They mask some edges of $\mathcal{G}'$ and force the model to predict them, denoting the set of masked-out edges as $\mathcal{E}_{mask} \subseteq \mathcal{E}$.
They adopt DistMult [3], a widely used knowledge graph embedding method, to map the entities and relations in the KG to vectors $\mathbf{h}, \mathbf{r}, \mathbf{t}$, where the entity embeddings are taken from $H_3$.
Self-supervised Link Prediction (Cont.)
To distinguish between correct positive triples and incorrect negative triples, a scoring function $\phi(e_h, e_t) = \langle \mathbf{h}, \mathbf{r}, \mathbf{t} \rangle$ is defined, where $\langle \cdot, \cdot, \cdot \rangle$ denotes the trilinear dot product and $\mathbf{r}$ represents the relation in the KG.
The model is trained to predict the masked edges in $\mathcal{E}_{mask}$ as positive samples, while other randomly selected edges are treated as negative samples:
$$\mathcal{L}_{lp} = \sum_{(e_h, r, e_t) \in \mathcal{E}_{mask}} \left( S_{pos} + S_{neg} \right)$$
Self-supervised Link Prediction (Cont.)
$$\mathcal{L}_{lp} = \sum_{(e_h, r, e_t) \in \mathcal{E}_{mask}} \left( S_{pos} + S_{neg} \right)$$
where $S_{pos} = -\log \sigma_s(\phi(e_h, e_t) + \gamma)$, $\gamma$ is the margin, and $\sigma_s$ is the sigmoid function; $S_{neg} = \frac{1}{n} \sum_{(e'_h, r, e'_t)} \log \sigma_s(\phi(e'_h, e'_t) + \gamma)$, where $\{(e'_h, r, e'_t)\}$ are the $n$ negative triples.
The final objective function is:
$$\mathcal{L} = \mathcal{L}_{llm} + \lambda \mathcal{L}_{lp}$$
where $\lambda$ is a trade-off weight for balancing the two losses.
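A hedged sketch of the link prediction loss and the combined objective for a single masked triple, following the formulas above ($\gamma$, $\lambda$, and all tensors are toy assumptions):

# Sketch: DistMult trilinear score, link prediction loss, and combined objective.
import torch

def phi(h, r, t):                         # trilinear dot product <h, r, t>
    return (h * r * t).sum(dim=-1)

d, n_neg, gamma, lam = 16, 4, 1.0, 0.1
h, r, t = torch.randn(d), torch.randn(d), torch.randn(d)        # a masked (positive) triple
h_neg, t_neg = torch.randn(n_neg, d), torch.randn(n_neg, d)     # n sampled negative triples

s_pos = -torch.log(torch.sigmoid(phi(h, r, t) + gamma))                # S_pos
s_neg = torch.log(torch.sigmoid(phi(h_neg, r, t_neg) + gamma)).mean()  # S_neg, as in the slide
l_lp = s_pos + s_neg                       # summed over all masked edges in practice

l_llm = torch.tensor(2.3)                  # placeholder LLM cross-entropy loss
loss = l_llm + lam * l_lp                  # L = L_llm + lambda * L_lp
print(float(loss))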
Experiments
Ablation Study
They analyze the contributions of different components by removing each of them independently.
Table 2: Ablation study results when removing individual components.
Model Design Comparison
For prompt tuning, they use a dataset-level prompt (DLP). For modeling relations, they use a relational GNN (RGNN) [5].
Table 3: Model design comparison results.
Impact of GNN layers
They evaluate the influence of the number of GNN layers for both the 3B and 11B models.
Figure 3: Performance w.r.t. different numbers of GNN layers.
Impact of cross-modality pooling layers
They report the performance with different numbers of cross-modality pooling layers.
Figure 4: Performance w.r.t. different numbers of cross-modality pooling layers.
Case Study and Visualization
They select two examples from the OBQA dataset and visualize the
retrieved subgraphs.
Figure 5: Retrieved subgraphs for two OBQA examples.
Conclusion
They propose Graph Neural Prompting (GNP), a novel plug-and-play
method to assist pre-trained LLMs in learning beneficial knowledge
from KGs.
References
[1] "Knowledge-Augmented Language Model Prompting for Zero-Shot Knowledge Graph Question Answering". Proceedings of the 1st Workshop on Natural Language Reasoning and Structured Explanations (NLRSE). Association for Computational Linguistics, 2023, pp. 78–106.
[2] "A Survey on Knowledge Graphs: Representation, Acquisition, and Applications". IEEE Transactions on Neural Networks and Learning Systems (2022), pp. 494–514.
[3] "Embedding Entities and Relations for Learning and Inference in Knowledge Bases". Proceedings of the International Conference on Learning Representations (ICLR). 2015.
[4] "Deep Bidirectional Language-Knowledge Graph Pretraining". Advances in Neural Information Processing Systems. 2022, pp. 37309–37323.
[5] "Language Models". International Conference on Learning Representations. 2021.