Beyond Limits: How GraphRAG Revolutionises Data Interaction
neo4j
383 views
64 slides
Sep 23, 2024
About This Presentation
Presented at Big Data London 2024 by Kristof Neys, Director GDS Technology at Neo4j.
From a data perspective, an ideal scenario is one where practitioners can have a meaningful conversation with their data. In an era where data is both abundant and critical, the need for innovative methods to interact with and understand complex datasets has never been greater. Enter GraphRAG (Graph-based Retrieval-Augmented Generation), a cutting-edge approach that revolutionizes data interaction by seamlessly integrating graph theory with generative AI. GraphRAG leverages the power of a knowledge graph to represent relationships within data, enabling more intuitive navigation and retrieval of relevant information. By augmenting these capabilities with state-of-the-art generation models, GraphRAG provides users with enriched, context-aware outputs that significantly surpass traditional query-response systems. Attendees will gain insights into the underlying principles of GraphRAG, its architectural components, and practical applications across various domains, from healthcare to finance. We will demonstrate real-world use cases, showcasing how GraphRAG not only improves efficiency and accuracy in data handling but also democratizes access to complex insights, empowering users to reach their ideal state of conversing with their data. Join us to discover how GraphRAG is paving the way for the future of intelligent data interaction.
Size: 7.24 MB
Language: en
Added: Sep 23, 2024
Slides: 64 pages
Slide Content
Beyond Limits: GraphRAG
Kristof Neys, Neo4j Field Engineering
Outline
1) Brave New World…
2) Why RAG?
3) Can Knowledge Graphs help?
4) From RAG to GraphRAG
5) Improve your R
6) An example…
7) Does it work?
Neo4j Inc. All rights reserved 2023
But First - A Word from our sponsor…
Brave New World…
The State of Generative AI
The Good, The Bad, The Ugly
GenAI Alone != Right Outcomes
Challenges with GenAI: Stochastic Parrot?
● Lack of enterprise domain knowledge
● Inability to verify answers
● Hallucination
● Ethical and data bias concerns
● and more
Managing AI risk is the biggest barrier to scaling AI initiatives [1]
Skepticism: Over half of business leaders are skeptical about adopting GenAI. [2]
Explainability: Over 80% of executives worry that the non-transparent nature of GenAI could result in poor or unlawful decisions. [2]
Reliability: Inaccuracy and hallucination are two of the most-cited risks of adopting GenAI technology at all levels of an organisation. [3]
[1] Deloitte’s State of AI in the Enterprise
[2] BCG’s Digital Acceleration Index Study 2023
[3] McKinsey: The state of AI in 2023
Problem Statement
How can enterprises use domain-specific knowledge to rapidly build accurate, contextual, and explainable GenAI applications?
Why RAG?
And what is it anyway…
Retrieval Augmented Generation:
The ability to dynamically query a large text corpus to incorporate relevant factual knowledge into the responses generated by the underlying language model
RAG augments LLMs by retrieving up-to-date, contextual external data to inform responses:
Retrieve: Find documents of interest for the user question
Augment: Combine the user question with the relevant documents
Generate: Feed the enhanced prompt to an LLM and obtain an answer
Retrieval Augmented Generation
Database of Truth
RAG is becoming an industry standard
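The Retrieve / Augment / Generate steps can be sketched in a few lines of Python. This is a minimal illustration only: the bag-of-words `embed` function stands in for a real embedding model, `fake_llm` is a hypothetical placeholder for a hosted LLM, and the three documents are made-up sample text.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    cleaned = text.lower().replace("?", " ").replace(".", " ")
    return Counter(cleaned.split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, documents: list[str], k: int = 2) -> list[str]:
    """Retrieve: rank documents by similarity to the question, keep the top k."""
    q = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def augment(question: str, context: list[str]) -> str:
    """Augment: combine the user question with the retrieved documents."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"


def fake_llm(prompt: str) -> str:
    """Hypothetical placeholder; a real system would call a hosted model here."""
    return f"[LLM answer based on a prompt of {len(prompt)} characters]"


def generate(prompt: str, llm=fake_llm) -> str:
    """Generate: feed the enhanced prompt to an LLM and return its answer."""
    return llm(prompt)


# Made-up sample corpus for illustration.
DOCS = [
    "The RFP response was signed by John Smith.",
    "Graph databases store nodes and relationships.",
    "John Smith is the General Manager.",
]

question = "Who signed the RFP response?"
context = retrieve(question, DOCS, k=2)
prompt = augment(question, context)
answer = generate(prompt)
```

Swapping `embed` for a real embedding model and `fake_llm` for a real model call leaves the three-step shape unchanged.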
Why RAG With Vector Databases Falls Short
Similarity is insufficient for rich enterprise reasoning
● Only leverages a fraction of your data: Beyond simple “metadata”, vector databases alone fail to capture relationships from structured data
● Misses critical context: Struggles to capture connections across nuanced facts, making it challenging to answer multi-step, domain-specific questions
● Vector Similarity ≠ Relevance: Vector search uses an incomplete measure of similarity. Relying on it alone can result in irrelevant and duplicative results
● Lacks explainability: The black-box nature of vectors lacks transparency and explainability
Can Knowledge Graphs help?
Recap: a Knowledge Graph
A knowledge graph is a structured representation of facts, consisting of entities, relationships and semantic descriptions
ANDRE
Name: “Andre”
Born: May 29, 1970
Twitter: “@andre123”
MICA
Name: “Mica”
Born: Dec 5, 1975
CAR
Brand: “Volvo”
Model: “V70”
Description: ‘Blue external, red seats’
#days: 5/7
LOVES
LOVES
LIVES WITH
OWNS
DRIVES
Since: Jan 10, 2011
Knowledge Graphs – New & Improved! Now with Vectors!
Vectors as Node properties = Vector Search + Graph Traversal
[Figure: the Andre, Mica and Car graph from the previous slide, repeated]
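A small sketch of "vector search + graph traversal": embeddings are stored as node properties, a similarity search picks the entry node, and a traversal then pulls in connected context. The three-dimensional embeddings, the node ids and the edges are all hand-made for illustration; a real system would use a vector index and a graph query instead of these in-memory dictionaries.

```python
import math

# Nodes carry their embedding as a property (hand-made 3-d vectors, illustration only).
NODES = {
    "chunk_a": {"text": "Response from ABC Company", "embedding": [0.9, 0.1, 0.0]},
    "chunk_b": {"text": "Signed by John Smith", "embedding": [0.1, 0.9, 0.1]},
    "chunk_c": {"text": "He's the General Manager", "embedding": [0.0, 0.2, 0.9]},
}

# ASSUMPTION: hypothetical outgoing edges between chunks.
EDGES = {
    "chunk_a": ["chunk_b"],
    "chunk_b": ["chunk_c"],
    "chunk_c": [],
}


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def vector_search(query: list[float], k: int = 1) -> list[str]:
    """Implicit similarity: return the ids of the k nodes closest to the query."""
    ranked = sorted(NODES, key=lambda n: cosine(query, NODES[n]["embedding"]), reverse=True)
    return ranked[:k]


def expand(seed_ids: list[str], hops: int = 1) -> list[str]:
    """Explicit relationships: add nodes reachable within `hops` steps of the seeds."""
    seen = list(seed_ids)
    frontier = list(seed_ids)
    for _ in range(hops):
        next_frontier = []
        for node_id in frontier:
            for neighbour in EDGES.get(node_id, []):
                if neighbour not in seen:
                    seen.append(neighbour)
                    next_frontier.append(neighbour)
        frontier = next_frontier
    return seen


hits = vector_search([0.2, 1.0, 0.0], k=1)
expanded = expand(hits, hops=1)
```

Here the similarity search alone finds only the signature chunk; the one-hop traversal adds the chunk that says who the signer is.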
Neo4j - Vector Database Capabilities
Vector Search | Data Science | Knowledge Graph
● Find nodes using an implicit similarity search in the vector index* and enrich with additional explicit relationships from the knowledge graph
● Hybrid Search with text
● Create vectors of network information using node embeddings
Now a top 10 vector database on LangChain.
By 2025, 50% of generative AI initiatives will have improved reliability and transparency by combining deep learning foundation models with knowledge graphs or other composite AI elements.
Technological Implications of Generative AI, August 2023
Impact Radar for GenAI (2024)
From RAG to GraphRAG
GraphRAG
Technique for richly understanding text datasets by combining text extraction, network analysis, LLM prompting and summarization into a single end-to-end system
A Neo4j Knowledge Graph combined with LLMs yields some unique improvements:
Accuracy: Obtain better answers compared to plain vector searches
Specificity: Domain-specific, factual knowledge on your subject
Explainability: Provide the user with more reasoning on how the results were obtained
Security: Role Based Access Control
Retrieval Augmented Generation
Evolving From RAG to GraphRAG
We are not making this up…
You need a better R…
Quick quiz for SWAG
Data Science on Graphs: Graph Data Science…
Vector Search | Graph Data Science | Knowledge Graph
Bring the context of your connected data into a format that other pipelines can ingest.
The Largest Catalog of Graph Algorithms
Graph Vector Embeddings for Machine Learning
At an inflection point…
GraphRAG with Neo4j
Find similar documents and content
Identify entities associated to content and patterns in connected data
Improve GenAI inferences and insights. Discover new relationships and entities
Unify vector search, knowledge graph and data science capabilities to improve RAG quality and effectiveness
Vector Search | Graph Data Science | Knowledge Graph
An example…
RFP Generation GenAI App
Why a KG Matters in RFP GenAI App?
Challenge: Time consuming to read previous RFPs across multiple repositories
Outcome: Knowledge base to collect, store and retrieve domain-specific information
Challenge: Repetitive and manual tasks to synthesise the content
Outcome: Drive an efficient, accurate, contextual and explainable way to streamline RFP responses
Challenge: Non-standard structure of RFPs, making it difficult to do data modelling
Outcome: Flexible storage that’s adaptable to the varying structure of an RFP
Anatomy of an RFP Document
[Figure: document tree for the AWS RFP. Top-level sections: Intro, Objectives, Proposal. Further sections: About the Company, Financial Result, Subsection 1, Subsection 1.1, Subsection 2. Content blocks hang off the sections.]
RFP Document as a Graph
[Figure: the AWS RFP drawn as a graph. Section nodes: Intro, Objectives, Proposal, About the Company, Financial Result, Subsection 1, Subsection 1.1, Subsection 2. The content is split into Content Chunk nodes, each carrying a Vector Embedding.]
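One way to picture the document-as-graph idea is a small loader that walks a nested document and emits section nodes, chunk nodes with embeddings, and the edges between them. Everything specific here is an assumption for illustration: the nesting of the sections, the placeholder chunk text, the `HAS_SECTION` and `HAS_CHUNK` relationship names, and `toy_embedding`, which is a deterministic stand-in rather than a real embedding model.

```python
from itertools import count


def toy_embedding(text: str, dims: int = 4) -> list[float]:
    """Hypothetical placeholder for an embedding model: deterministic, not semantic."""
    vec = [0.0] * dims
    for i, ch in enumerate(text):
        vec[i % dims] += ord(ch) / 1000.0
    return vec


# ASSUMPTION: this nesting and the chunk text are invented; the slide only names the sections.
RFP = {
    "title": "AWS RFP",
    "sections": [
        {"title": "Intro", "chunks": ["Placeholder intro text."], "sections": [
            {"title": "About the Company", "chunks": ["Placeholder company text."]},
        ]},
        {"title": "Proposal", "sections": [
            {"title": "Subsection 1", "chunks": ["Placeholder proposal text."], "sections": [
                {"title": "Subsection 1.1", "chunks": ["Placeholder detail text."]},
            ]},
        ]},
    ],
}


def build_graph(doc: dict) -> tuple[dict, list]:
    """Turn a nested document into (nodes, edges); chunks get an embedding property."""
    ids = count(1)
    nodes: dict[int, dict] = {}
    edges: list[tuple[int, str, int]] = []

    def add(label: str, props: dict) -> int:
        node_id = next(ids)
        nodes[node_id] = {"label": label, **props}
        return node_id

    def walk(section: dict, parent_id: int) -> None:
        section_id = add("Section", {"title": section["title"]})
        edges.append((parent_id, "HAS_SECTION", section_id))
        for text in section.get("chunks", []):
            chunk_id = add("Chunk", {"text": text, "embedding": toy_embedding(text)})
            edges.append((section_id, "HAS_CHUNK", chunk_id))
        for child in section.get("sections", []):
            walk(child, section_id)

    root_id = add("Document", {"title": doc["title"]})
    for section in doc.get("sections", []):
        walk(section, root_id)
    return nodes, edges


nodes, edges = build_graph(RFP)
chunks = [n for n in nodes.values() if n["label"] == "Chunk"]
```

Because sections are just nodes and edges, an RFP with a different outline loads through the same function without a schema change.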
Knowledge Graph as the Knowledge Base
Document in a KG | Knowledge Graph
Accurate, Contextual and Explainable
User Question: Who is the main respondent of the AWS RFP?
[Diagram: User Question → GenAI App → Embedding Model → Vector Embedding → Similarity Search using Neo4j Vector Index]
[Diagram, built up over four slides on the AWS RFP graph. Highlighted content chunks: “Response from ABC Company”, “Signed by John Smith”, “He’s the General Manager”]
1) Similarity Search using Neo4j Vector Index
2) Contextual Knowledge Retrieval within Neo4j KG
3) Knowledge Retrieval to aid in Explainability
4) Fine Grained Access Control to prevent unwarranted Knowledge Retrieval
[Diagram: the Similarity Result, and then the Similarity + Contextual Result (“Response from ABC Company”, “Signed by John Smith”, “He’s the General Manager”), is passed to the Bedrock LLM]
Answer: The main respondent of the AWS RFP is John Smith, General Manager XYZ Group representing ABC Company (source: RFP_AWS.pdf, page: 38)
How about Graph Data Science?
Enrich the measure of relevancy using graph algorithms.
● Page Rank to understand the importance of parts of documents
● Link Prediction to find hidden relationships that further contextualise the results
● Community Detection to group related parts of documents for more focused knowledge retrieval
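To make the PageRank bullet concrete, here is a minimal power-iteration sketch in plain Python. It is not the Neo4j Graph Data Science implementation, and the three-node graph with its links between document parts is hypothetical; it only shows how a part that many other parts point to ends up with a higher score.

```python
def pagerank(graph: dict[str, list[str]], damping: float = 0.85,
             iterations: int = 50) -> dict[str, float]:
    """Plain power-iteration PageRank over an adjacency list of outgoing links."""
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread evenly over all nodes.
        dangling = sum(rank[node] for node in nodes if not graph[node])
        new_rank = {}
        for node in nodes:
            incoming = sum(
                rank[src] / len(graph[src]) for src in nodes if node in graph[src]
            )
            new_rank[node] = (1 - damping) / n + damping * (incoming + dangling / n)
        rank = new_rank
    return rank


# ASSUMPTION: hypothetical "refers to" links between parts of a document.
LINKS = {
    "intro": ["proposal"],
    "objectives": ["proposal"],
    "proposal": ["intro"],
}

ranks = pagerank(LINKS)
most_important = max(ranks, key=ranks.get)
```

Such a score could then be combined with the vector similarity score when ordering retrieved chunks; how to weight the two is a design choice the slides do not specify.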
Try it yourself… - Neo4j Graphbuilder
But does it REALLY work??
Let’s Wrap up…
GraphRAG enables you to:
● Leverage structural information across entities to enable more precise and comprehensive retrieval
● Perform advanced Graph analytics to enhance retrieval
● Have an accurate conversation with your data that is explainable