Beyond Limits: How GraphRAG Revolutionises Data Interaction

neo4j 383 views 64 slides Sep 23, 2024

About This Presentation

Presented at Big Data London 2024 by Kristof Neys, Director GDS Technology at Neo4j.

From a data perspective, an ideal scenario is one where practitioners can have a meaningful conversation with their data. In an era where data is both abundant and critical, the need for innovative methods to inte...


Slide Content

Beyond Limits: GraphRAG
Kristof Neys, Neo4j Field Engineering

Outline
1) Brave New World…
2) Why RAG?
3) Can Knowledge Graphs help?
4) From RAG to GraphRAG
5) Improve your R
6) An example…
7) Does it work?
Neo4j Inc. All rights reserved 2023

But First: A Word from Our Sponsor…

Brave New World…

The State of Generative AI
The Good, The Bad, The Ugly
GenAI Alone != Right Outcomes

Challenges with GenAI: Stochastic Parrot?
● Lack of enterprise domain knowledge
● Inability to verify answers
● Hallucination
● Ethical and data bias concerns
● and more

Managing AI risk is the biggest barrier to scaling AI initiatives [1]
Skepticism: Over half of business leaders are skeptical about adopting GenAI. [2]
Explainability: Over 80% of executives worry that the non-transparent nature of GenAI could result in poor or unlawful decisions. [2]
Reliability: Inaccuracy and hallucination are two of the most-cited risks of adopting GenAI technology at all levels of an organisation. [3]
1. Deloitte’s State of AI in the Enterprise
2. BCG’s Digital Acceleration Index Study 2023
3. McKinsey: The state of AI in 2023

Problem Statement
How can enterprises use domain-specific knowledge to rapidly build accurate, contextual, and explainable GenAI applications?

Why RAG?
And what is it anyway…

Retrieval Augmented Generation:
The ability to dynamically query a large
text corpus to incorporate relevant factual
knowledge into the responses generated
by the underlying language model

Retrieval Augmented Generation
RAG is becoming an industry standard
[Diagram label: Database of Truth]
RAG augments LLMs by retrieving up-to-date, contextual external data to inform responses:
Retrieve: Find documents of interest for the user question
Augment: Combine the user question with the relevant documents
Generate: Feed enhanced prompt to an LLM and obtain answer
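
The Retrieve, Augment and Generate steps can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline from the talk: the documents are invented, `embed` is a bag-of-words stand-in for a real embedding model, and `fake_llm` is a placeholder for a real model call.

```python
import math
from collections import Counter

# Toy "database of truth": hypothetical documents, not taken from the talk.
DOCUMENTS = [
    "Neo4j is a graph database that stores nodes and relationships.",
    "Retrieval Augmented Generation feeds retrieved documents to an LLM.",
    "A vector index supports similarity search over embeddings.",
]

def embed(text: str) -> Counter:
    """Stand-in embedding (assumption): a bag-of-words count vector."""
    return Counter(text.lower().replace(".", "").replace("?", "").split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(question: str, k: int = 1) -> list:
    """Retrieve: rank documents by similarity to the question."""
    q = embed(question)
    ranked = sorted(DOCUMENTS, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def augment(question: str, docs: list) -> str:
    """Augment: combine the question with the retrieved documents."""
    context = "\n".join(f"- {d}" for d in docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

def fake_llm(prompt: str) -> str:
    """Placeholder for a real model call: echoes the first context line."""
    return prompt.splitlines()[1][2:]

def rag(question: str, llm=fake_llm, k: int = 1) -> str:
    """Generate: feed the enhanced prompt to the (stubbed) LLM."""
    return llm(augment(question, retrieve(question, k)))

if __name__ == "__main__":
    print(rag("What is a graph database?"))
```

Swapping `embed` for a real embedding model and `fake_llm` for a real LLM client would leave the three-step shape unchanged.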

Why RAG With Vector Databases Falls Short
Similarity is insufficient for rich enterprise reasoning
● Only leverage a fraction of your data: Beyond simple “metadata”, vector databases alone fail to capture relationships from structured data
● Miss critical context: Struggle to capture connections across nuanced facts, making it challenging to answer multi-step, domain-specific questions
● Vector Similarity ≠ Relevance: Vector search uses an incomplete measure of similarity. Relying on it solely can result in irrelevant and duplicative results
● Lack explainability: The black-box nature of vectors lacks transparency and explainability

Can Knowledge Graphs
help?

Recap: a Knowledge Graph
A knowledge graph is a structured representation of facts, consisting of entities, relationships and semantic descriptions
[Figure: example property graph]
Nodes:
● ANDRE (Name: “Andre”, Born: May 29, 1970, Twitter: “@andre123”)
● MICA (Name: “Mica”, Born: Dec 5, 1975)
● CAR (Brand: “Volvo”, Model: “V70”, Description: ‘Blue external, red seats’)
Relationships: LOVES, LOVES, LIVES WITH, OWNS, DRIVES
Further properties shown in the figure: #days: 5/7; Since: Jan 10, 2011
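
To make the entity and relationship structure concrete, here is a small property-graph sketch in plain Python. The node properties come from the slide; the relationship directions and the placement of the "Since" and "#days" properties are assumptions, because the extracted text does not show which arrow connects which nodes.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str
    props: dict = field(default_factory=dict)

@dataclass
class Rel:
    start: str
    rel_type: str
    end: str
    props: dict = field(default_factory=dict)

# Node properties as listed on the slide.
nodes = {
    "andre": Node("Person", {"name": "Andre", "born": "May 29, 1970", "twitter": "@andre123"}),
    "mica": Node("Person", {"name": "Mica", "born": "Dec 5, 1975"}),
    "car": Node("Car", {"brand": "Volvo", "model": "V70",
                        "description": "Blue external, red seats"}),
}

# ASSUMPTION: directions, and which relationship carries "since" and
# "days", are guesses made for illustration only.
rels = [
    Rel("andre", "LOVES", "mica"),
    Rel("mica", "LOVES", "andre"),
    Rel("andre", "LIVES_WITH", "mica"),
    Rel("andre", "OWNS", "car"),
    Rel("mica", "DRIVES", "car", {"since": "Jan 10, 2011", "days": "5/7"}),
]

def neighbours(node_id, rel_type=None):
    """Return (relationship type, target id) pairs leaving node_id."""
    return [(r.rel_type, r.end) for r in rels
            if r.start == node_id and (rel_type is None or r.rel_type == rel_type)]

if __name__ == "__main__":
    for rel_name, target in neighbours("andre"):
        print("andre", rel_name, nodes[target].props)
```

Both nodes and relationships carry their own properties, which is what lets a graph hold facts such as how long a relationship has existed.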

Knowledge Graphs – New & Improved! NOW WITH VECTORS!

Now with Vectors!
Vectors as Node properties = Vector Search + Graph Traversal
[Figure: the same ANDRE, MICA and CAR property graph; the CAR Description value is no longer shown as text]

Neo4j - Vector Database Capabilities
Vector Search | Data Science | Knowledge Graph
● Find nodes using an implicit similarity search in the vector index* and enrich with additional explicit relationships from the knowledge graph
● Hybrid Search with text
● Create vectors of network information using node embeddings
Now a top 10 vector database on LangChain.
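
The "similarity search, then enrich with explicit relationships" pattern can be sketched without a database. The example below is plain Python rather than Neo4j's vector index API: the three-dimensional embedding values are invented, and the direction of OWNS and DRIVES is an assumption.

```python
import math

# Invented 3-d "embeddings" stored as a node property; real values would
# come from an embedding model.
NODES = {
    "andre": {"name": "Andre", "embedding": [0.9, 0.1, 0.0]},
    "mica": {"name": "Mica", "embedding": [0.8, 0.3, 0.1]},
    "car": {"brand": "Volvo", "model": "V70", "embedding": [0.0, 0.2, 0.9]},
}

# ASSUMPTION: who owns and who drives the car is a guess.
RELATIONSHIPS = [
    ("andre", "LOVES", "mica"),
    ("mica", "LOVES", "andre"),
    ("andre", "LIVES_WITH", "mica"),
    ("andre", "OWNS", "car"),
    ("mica", "DRIVES", "car"),
]

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def vector_search(query, k=1):
    """Implicit step: rank nodes by embedding similarity to the query."""
    scored = sorted(((cosine(query, props["embedding"]), node_id)
                     for node_id, props in NODES.items()), reverse=True)
    return [node_id for _, node_id in scored[:k]]

def traverse(node_id):
    """Explicit step: collect every relationship touching the node."""
    return [r for r in RELATIONSHIPS if node_id in (r[0], r[2])]

def search_then_traverse(query, k=1):
    return {hit: traverse(hit) for hit in vector_search(query, k)}

if __name__ == "__main__":
    # A query vector that happens to sit close to the car node.
    print(search_then_traverse([0.0, 0.1, 1.0]))
```

The vector step finds an entry point; the traversal step then returns facts (who owns or drives the car) that similarity alone would not surface.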

By 2025, 50% of generative AI initiatives will have improved reliability and transparency by combining deep learning foundation models with knowledge graphs or other composite AI elements.
Technological Implications of Generative AI, August 2023
Impact Radar for GenAI (2024)

From RAG to GraphRAG

GraphRAG
Technique for richly
understanding text datasets
by combining text extraction,
network analysis, LLM
prompting and summarization
into a single end-to-end
system

Retrieval Augmented Generation
Evolving From RAG to GraphRAG
A Neo4j Knowledge Graph combined with LLMs obtains some unique improvements:
Accuracy: Obtain better answers compared to plain vector searches
Specificity: Domain-specific, factual knowledge on your subject
Explainability: Provide the user with more reasoning on how the results were obtained
Security: Role Based Access Control

We are not making this up…

You need a better R…

Quick quiz for SWAG

Data Science on Graphs: Graph Data Science…
Vector Search | Graph Data Science | Knowledge Graph
Bring the context of your connected data into a format that other pipelines can ingest.
● The Largest Catalog of Graph Algorithms
● Graph Vector Embeddings for Machine Learning

At an inflection point…

GraphRAG with Neo4j
Unify vector search, knowledge graph and data science capabilities to improve RAG quality and effectiveness
● Vector Search: Find similar documents and content
● Knowledge Graph: Identify entities associated to content and patterns in connected data
● Graph Data Science: Improve GenAI inferences and insights. Discover new relationships and entities

An example…
RFP Generation GenAI App

Why a KG Matters in RFP GenAI App?
Challenge: Time consuming to read previous RFP across multiple repositories
Outcome: Knowledge base to collect, store and retrieve domain-specific information
Challenge: Repetitive and manual tasks to synthesise the content
Outcome: Drive efficient, accurate, contextual and explainable way to streamline RFP responses
Challenge: Non-standard structure of RFP making it difficult to do data modelling
Outcome: Flexible storage that’s adaptable to the varying structure of an RFP

Anatomy of an RFP Document
[Figure: an RFP document drawn as a tree. Root: AWS RFP. Sections: Intro, Objectives, Proposal. Further sections: About the Company, Financial Result, Subsection 1, Subsection 1.1, Subsection 2. Each section holds Content.]

RFP Document as a Graph
[Figure: the AWS RFP tree as a graph. Section nodes (Intro, Objectives, Proposal, About the Company, Financial Result, Subsection 1, Subsection 1.1, Subsection 2) link to 11 Content Chunk nodes, and each Content Chunk has a Vector Embedding.]
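
One way to picture the document-as-graph model is a tree of sections whose leaves are content chunks, each carrying an embedding. The sketch below is a plain-Python illustration, not the loader used in the talk: the section nesting, the sample text, the fixed chunk size and `toy_embed` are all assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class Chunk:
    text: str
    embedding: list  # vector embedding stored with the chunk

@dataclass
class Section:
    title: str
    children: list = field(default_factory=list)  # nested Section objects
    chunks: list = field(default_factory=list)    # Chunk objects

def toy_embed(text, dims=4):
    """Placeholder embedding (assumption): character codes summed into dims slots."""
    vec = [0.0] * dims
    for i, ch in enumerate(text):
        vec[i % dims] += ord(ch)
    return vec

def add_content(section, text, size=40):
    """Split text into fixed-size chunks and attach each with its embedding."""
    for i in range(0, len(text), size):
        piece = text[i:i + size]
        section.chunks.append(Chunk(piece, toy_embed(piece)))

def walk(section, path=()):
    """Yield (section path, chunk) for every chunk in the tree."""
    here = path + (section.title,)
    for chunk in section.chunks:
        yield here, chunk
    for child in section.children:
        yield from walk(child, here)

def build_rfp():
    """Build a small example tree. ASSUMPTION: the nesting shown here is a guess."""
    root = Section("AWS RFP")
    intro, objectives, proposal = Section("Intro"), Section("Objectives"), Section("Proposal")
    sub1, sub11 = Section("Subsection 1"), Section("Subsection 1.1")
    sub1.children.append(sub11)
    proposal.children.append(sub1)
    root.children.extend([intro, objectives, proposal])
    add_content(intro, "ABC Company responds to the AWS RFP. " * 3)  # invented text
    add_content(sub11, "Detailed proposal text.")                    # invented text
    return root

if __name__ == "__main__":
    for path, chunk in walk(build_rfp()):
        print(" > ".join(path), "|", chunk.text)
```

Because every chunk keeps its path back to the root, a similarity hit on one chunk can be traced to the section and document it came from.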

Knowledge Graph as the Knowledge Base
[Figure: two panels, "Document in a KG" (the AWS RFP document graph of sections, Content Chunks and Vector Embeddings) and "Knowledge Graph"]

Accurate, Contextual and Explainable
User Question: Who is the main respondent of the AWS RFP?
GenAI App -> Embedding Model -> Vector Embedding -> Similarity Search using Neo4j Vector Index

Accurate, Contextual and Explainable
[Figure, built up over four slides: the AWS RFP document graph with three highlighted content chunks: "Response from ABC Company", "Signed by John Smith", "He’s the General Manager"]
● Similarity Search using Neo4j Vector Index
● Contextual Knowledge Retrieval within Neo4j KG
● Knowledge Retrieval to aid in Explainability
● Fine Grained Access Control to prevent unwarranted Knowledge Retrieval

Accurate, Contextual and Explainable
User Question: Who is the main respondent of the AWS RFP?
GenAI App -> Embedding Model -> Vector Embedding -> Similarity Search using Neo4j Vector Index
Similarity Result: "Response from ABC Company", "Signed by John Smith", "He’s the General Manager"
Similarity + Contextual Result -> Bedrock LLM
Answer: The main respondent of the AWS RFP is John Smith, General Manager XYZ Group representing ABC Company (source: RFP_AWS.pdf, page: 38)
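
The flow above (similarity search, contextual retrieval, source attribution and access control) can be sketched end to end in plain Python. This is an illustration under stated assumptions, not the application shown in the talk: the chunk texts, file name and page 38 come from the slides, while the embeddings, the section assignment, the roles, the fourth chunk and its page number are invented. The final LLM call is left out; the sketch stops at the prompt.

```python
import math

SOURCE = "RFP_AWS.pdf"  # file name as cited on the slide

# ASSUMPTION: embeddings, sections, roles, chunk c4 and page 12 are invented.
CHUNKS = [
    {"id": "c1", "section": "Intro", "page": 38, "roles": {"sales"},
     "text": "Response from ABC Company", "embedding": [0.8, 0.2, 0.1]},
    {"id": "c2", "section": "Intro", "page": 38, "roles": {"sales"},
     "text": "Signed by John Smith", "embedding": [0.9, 0.1, 0.0]},
    {"id": "c3", "section": "Intro", "page": 38, "roles": {"sales"},
     "text": "He's the General Manager", "embedding": [0.7, 0.3, 0.2]},
    {"id": "c4", "section": "Financial Result", "page": 12, "roles": {"finance"},
     "text": "Confidential figures", "embedding": [0.85, 0.1, 0.1]},
]

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def visible(role):
    """Access control: only chunks the role may read take part in retrieval."""
    return [c for c in CHUNKS if role in c["roles"]]

def similarity_search(query_vec, role, k=1):
    """Rank the visible chunks by similarity to the question embedding."""
    ranked = sorted(visible(role),
                    key=lambda c: cosine(query_vec, c["embedding"]), reverse=True)
    return ranked[:k]

def with_context(hits, role):
    """Contextual retrieval: add the visible chunks from each hit's section."""
    sections = {h["section"] for h in hits}
    return [c for c in visible(role) if c["section"] in sections]

def build_prompt(question, chunks):
    """Attach a source to every context line so the answer can cite it."""
    lines = [f"{c['text']} (source: {SOURCE}, page: {c['page']})" for c in chunks]
    return "Context:\n" + "\n".join(lines) + f"\n\nQuestion: {question}"

if __name__ == "__main__":
    question = "Who is the main respondent of the AWS RFP?"
    query_vec = [1.0, 0.0, 0.0]  # stands in for the embedded question
    hits = similarity_search(query_vec, role="sales")
    print(build_prompt(question, with_context(hits, role="sales")))
```

Filtering by role before ranking means a restricted chunk can never reach the prompt, even when it is highly similar to the question.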

How about Graph Data Science?
Enrich the measure of relevancy using graph algorithms.
● PageRank to understand the importance of parts of documents
● Link Prediction to find hidden relationships that further contextualise the results
● Community Detection to group related parts of documents for more focused knowledge retrieval
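
As an illustration of the first algorithm in the list, here is a compact power-iteration PageRank in plain Python. It is a teaching sketch and not the Neo4j Graph Data Science implementation; the links between document parts are hypothetical.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a list of (source, target) edges."""
    nodes = sorted({n for edge in edges for n in edge})
    outgoing = {n: [t for s, t in edges if s == n] for n in nodes}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Every node receives the same small "teleport" share first.
        new_rank = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for n in nodes:
            # A node with no outgoing links spreads its rank over all nodes.
            targets = outgoing[n] or nodes
            share = damping * rank[n] / len(targets)
            for t in targets:
                new_rank[t] += share
        rank = new_rank
    return rank

# Hypothetical "refers to" links between parts of a document.
LINKS = [
    ("Intro", "Proposal"),
    ("Objectives", "Proposal"),
    ("Proposal", "Financial Result"),
    ("Financial Result", "Proposal"),
]

if __name__ == "__main__":
    for part, score in sorted(pagerank(LINKS).items(), key=lambda kv: -kv[1]):
        print(f"{part}: {score:.3f}")
```

A score like this could be combined with vector similarity so that chunks from heavily referenced parts of a document rank higher at retrieval time.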

Try it yourself… - Neo4j Graphbuilder

But does it REALLY work??

Let’s Wrap up…

GraphRAG enables you:
● To leverage structural information across entities to enable more precise and comprehensive retrieval
● To perform advanced Graph analytics to enhance retrieval
● To have an accurate conversation with your data that is explainable

Thank You