GraphRAG for Life Sciences: Hands-On with the Clinical Knowledge Graph

neo4j 749 views 15 slides Jun 17, 2024

About This Presentation

Tomaz Bratanic
Graph ML and GenAI Expert - Neo4j


Slide Content

© 2022 Neo4j, Inc. All rights reserved. 1
GraphRAG for Life Science


By Tomaz Bratanic

LLM-based chatbots
Using retrieval-augmented generation

LLMs are great, but:
• Knowledge cutoff date
• Hallucinations
• Lack of information source
• Lack of domain-specific or private information

Retrieval-augmented generation (in-context passing) flow
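
The flow named above can be sketched in a few lines: retrieve relevant text first, then pass it to the LLM inside the prompt. This is a minimal sketch, not code from the talk. The function names and prompt wording are assumptions, and `retrieve` and `generate` are hypothetical callables standing in for a search backend and an LLM call.

```python
def build_rag_prompt(question: str, retrieved_chunks: list[str]) -> str:
    """Place the retrieved text into the prompt so the LLM can answer from it."""
    # Number each chunk so the answer can point back to its source.
    context = "\n".join(
        f"[{i}] {chunk}" for i, chunk in enumerate(retrieved_chunks, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "Cite the bracketed source numbers you used.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


def answer(question, retrieve, generate):
    """Two steps: retrieve context for the question, then generate from it.

    retrieve(question) -> list of text chunks   (hypothetical search backend)
    generate(prompt)   -> answer text           (hypothetical LLM call)
    """
    chunks = retrieve(question)
    return generate(build_rag_prompt(question, chunks))
```

Passing the context in the prompt is what addresses the limitations listed earlier: the information can be newer than the cutoff date, private, and traceable to a source.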

Chat with your PDF - Vector similarity search
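
Vector similarity search ranks text chunks by how close their embeddings are to the embedding of the question. The sketch below uses cosine similarity over toy three-dimensional vectors. The chunk texts and vectors are made up for illustration; in practice the vectors would come from an embedding model and the search would usually run inside a vector index.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def top_k_chunks(query_vector, chunk_vectors, k=2):
    """Return the k chunk texts whose vectors are most similar to the query.

    chunk_vectors maps chunk text -> embedding vector.
    """
    scored = [
        (cosine_similarity(query_vector, vector), text)
        for text, vector in chunk_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Toy data (assumption): three PDF chunks with hand-written vectors.
chunks = {
    "chunk about gene variants": [0.9, 0.1, 0.0],
    "chunk about tissue expression": [0.1, 0.9, 0.1],
    "chunk about diet": [0.0, 0.2, 0.9],
}
```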

Knowledge graph
● Consists of nodes and relationships
● Nodes are used to represent entities, concepts, and more
● Relationships describe the context of a node and how it is connected to other nodes

Biomedical knowledge graph

Constructing a biomedical knowledge graph
Using existing biomedical ontologies combined with NLP extraction

Graph RAG
MATCH (g:Gene)
WHERE g.id = $gene
MATCH (g)-[r:VARIANT_FOUND_IN_GENE]-(v:Known_variant)
WHERE NOT v.clinical_relevance = '-' AND NOT v.disease = '-'
RETURN g.id AS gene, v.disease AS disease,
       v.pvariant_id AS variant_id, v.clinical_relevance AS relevance
LIMIT 20
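
In a Graph RAG setup, the rows returned by a query like the one above become the context passed to the LLM. The sketch below shows one way to do that step. It is an illustration under assumptions: `run_query` is a hypothetical callable standing in for the database call, and the sentence format of the context is invented here, not taken from the talk.

```python
GENE_VARIANT_QUERY = """
MATCH (g:Gene)
WHERE g.id = $gene
MATCH (g)-[r:VARIANT_FOUND_IN_GENE]-(v:Known_variant)
WHERE NOT v.clinical_relevance = '-' AND NOT v.disease = '-'
RETURN g.id AS gene, v.disease AS disease,
       v.pvariant_id AS variant_id, v.clinical_relevance AS relevance
LIMIT 20
"""


def rows_to_context(rows):
    """Turn result rows (dicts keyed by the RETURN aliases) into text lines."""
    if not rows:
        return "No clinically relevant variants found."
    return "\n".join(
        f"Gene {row['gene']}: variant {row['variant_id']} "
        f"({row['relevance']}) is linked to {row['disease']}"
        for row in rows
    )


def gene_variant_context(gene, run_query):
    """Fetch variants for one gene and format them as LLM context.

    run_query(query, params) -> list of dicts (hypothetical database call).
    The gene is passed as a query parameter, never spliced into the Cypher text.
    """
    return rows_to_context(run_query(GENE_VARIANT_QUERY, {"gene": gene}))
```

Keeping the gene as the `$gene` parameter means the Cypher text itself never changes between questions.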

Generating Cypher with an LLM lacks accuracy and robustness

Solution: Agentic tools with predefined Cypher templates
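
With predefined templates, the agent only picks a tool and supplies parameter values; it never writes Cypher. A minimal sketch of that idea follows. The registry layout, the `CypherTemplate` class, and `prepare_call` are hypothetical names introduced for illustration, and the two queries are the ones shown on the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CypherTemplate:
    description: str   # what the agent reads when choosing a tool
    query: str         # fixed Cypher text with $placeholders
    parameters: tuple  # parameter names the agent must supply


TEMPLATES = {
    "GeneVariants": CypherTemplate(
        description="find known clinically relevant variants for a gene",
        query=(
            "MATCH (g:Gene) WHERE g.id = $gene "
            "MATCH (g)-[r:VARIANT_FOUND_IN_GENE]-(v:Known_variant) "
            "WHERE NOT v.clinical_relevance = '-' AND NOT v.disease = '-' "
            "RETURN g.id AS gene, v.disease AS disease, "
            "v.pvariant_id AS variant_id, v.clinical_relevance AS relevance "
            "LIMIT 20"
        ),
        parameters=("gene",),
    ),
    "DiseaseTissue": CypherTemplate(
        description="find tissues where proteins associated with a disease are expressed",
        query=(
            "MATCH (p:Protein)-[r:ASSOCIATED_WITH]-(d:Disease) "
            "WHERE d.name CONTAINS $disease "
            "OPTIONAL MATCH (p)-[r2:ASSOCIATED_WITH]->(t:Tissue) "
            "RETURN d.name AS disease, p.name AS drug_target, "
            "t.name AS expressed_tissue LIMIT 10"
        ),
        parameters=("disease",),
    ),
}


def prepare_call(tool_name, arguments):
    """Validate the agent's choice and return (query, params) ready to execute."""
    if tool_name not in TEMPLATES:
        raise KeyError(f"Unknown tool: {tool_name}")
    template = TEMPLATES[tool_name]
    missing = [name for name in template.parameters if name not in arguments]
    if missing:
        raise ValueError(f"Missing parameters for {tool_name}: {missing}")
    # Keep only the declared parameters; anything else the agent sent is dropped.
    params = {name: arguments[name] for name in template.parameters}
    return template.query, params
```

Because the query text is fixed, a wrong answer from the agent can only be a wrong tool or a wrong parameter value, both of which are easy to check.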

Providing the agent with tools

Agent tool definition
Parameterized Cypher statement:
MATCH (p:Protein)-[r:ASSOCIATED_WITH]-(d:Disease)
WHERE d.name CONTAINS $disease
OPTIONAL MATCH (p)-[r2:ASSOCIATED_WITH]->(t:Tissue)
RETURN d.name AS disease, p.name AS drug_target,
       t.name AS expressed_tissue
LIMIT 10

Parameter description:
class DiseaseTissueInput(BaseModel):
    disease: str = Field(
        description="disease mentioned in the question"
    )

Tool description:
class DiseaseTissueTool(BaseTool):
    name = "DiseaseTissue"
    description = (
        "useful for when you need to find tissues "
        "where proteins with a specific disease are expressed"
    )
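
The tool description and the parameter description are the only parts of a tool the LLM sees when deciding what to call. The sketch below renders them into a JSON-Schema-style specification of the kind function-calling interfaces commonly accept. The `tool_spec` helper is hypothetical, and the exact format the agent framework sends to the model is an assumption, since the slides do not show it.

```python
import json


def tool_spec(name, description, parameters):
    """Build a JSON-Schema-style tool specification.

    parameters maps parameter name -> description; every parameter is
    treated as a required string in this simplified sketch.
    """
    return {
        "name": name,
        "description": description,
        "parameters": {
            "type": "object",
            "properties": {
                param: {"type": "string", "description": text}
                for param, text in parameters.items()
            },
            "required": list(parameters),
        },
    }


spec = tool_spec(
    name="DiseaseTissue",
    description=(
        "useful for when you need to find tissues "
        "where proteins with a specific disease are expressed"
    ),
    parameters={"disease": "disease mentioned in the question"},
)
print(json.dumps(spec, indent=2))
```

Clear descriptions matter more than the code behind them here, because the model chooses the tool and fills `disease` from this text alone.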

Demo time
● Which foods are associated with cancer?
● What are known clinically relevant gene variants for gene CLN3?
● In which tissues are proteins associated with prostate carcinoma expressed?

Questions
Repository: https://github.com/tomasonjo/clinical-agent
Slides: https://www.slideshare.net/slideshow/graphrag-for-life-science-to-increase-llm-accuracy/269616483
Email: [email protected]
LinkedIn: https://www.linkedin.com/in/tomaz-bratanic-a58891127/