Workshop - Architecting Innovative Graph Applications- GraphSummit Milan

neo4j 93 views 100 slides May 14, 2024

About This Presentation

Workshop 1. Architecting Innovative Graph Applications

Join this hands-on workshop for beginners led by Neo4j experts guiding you to systematically uncover contextual intelligence. Using a real-life dataset we will build step-by-step a graph solution; from building the graph data model to running q...


Slide Content

Neo4j Inc. All rights reserved 2024
May 14 2024
Milano

Architecting Innovative Graph Applications

Alfredo Rubin
[email protected]

© 2023 Neo4j, Inc. All rights reserved.

Goal of Today





Demonstrate (with YOUR HELP :-)) how to build a Graph “Application”


Agenda
1.Logistics
2.Introduction
3.Use Case Explanation
4.Building the Solution
5.Discussion: How to Improve the Use Case Further On
6.Final Q&A
7.Goodbye
BREAK (15 min) around 15:30

Logistics
First things first …


Logistics
WIFI Access: WIFI-Name: neo4j | Password: graphsummit24
Restrooms: Out of the room and to the left / the right / straight ahead, in the foyer area
Chargers: under / on top of the table in the first three rows
Material of the workshop: https://github.com/kvegter/gsummit2024

Introduction
A short overview of the Neo4j Product

Neo4j Native Graph Database - What is it?

Neo4j is a NORMAL Database (DB)
●Comparable with other DBs like Postgres, MySQL, MariaDB, Oracle, HANA DB, etc.
●Securely save, insert, update & query data
●Backup & Recovery
●Highly Scalable

Neo4j is “ACID Compliant” and transaction-safe
●ACID (Atomicity, Consistency, Isolation, Durability)
●Data is stored consistently at the transaction level, without any losses during an outage
●Most relational DBs are ACID compliant, too
* Source: Medium.com - Link to the Article

Neo4j is used 80%+ for OLTP Workloads
●More than 80% of Neo4j customers use the database as a normal OLTP data store
●Data is stored, changed, deleted and queried
●With the same or even better performance compared to relational database systems

What is the Value of Using a Native Graph Database?

How data is stored on disk
●Data is already stored as connected
●Data is efficiently placed on disk using “Index-free Adjacency”
●A GDB stores Nodes and Relationships instead of Rows & Columns
●Semantics may be built into the data!

How to query the data
●Cypher Query Language vs. SQL (ISO -> GQL)
●Simpler, fewer lines of code, easier to read and maintain
●Queries can span 100+ hops in the graph, which is comparable to SQL joins over 100+ tables!
(:Product)-[:CONTAINS]->(:Part)

Storing complex data networks & semantics
●Storing and analyzing complex relationships between data
●Analyzing data from traditionally siloed data sources
●Extendable with data science algorithms and AI

Neo4j - Part of a Complete Ecosystem
[Figure: ecosystem diagram. DATA SOURCES: Structured, Unstructured. INGEST: Apache Hop. PRODUCT COMPONENTS: Neo4j Graph Platform, Neo4j GDS Library, APOC, DRIVERS & APIs. VISUALIZE: Neo4j Bloom. USE CASES (DATA ANALYTICS and DATA MANAGEMENT): Journey Analytics, Risk Analytics, Churn Analysis, What-if Analysis, Feature Engineering & ML, Fraud, Recommendations, Data Fabric, Data Compliance, Data Governance, Data Provenance, Data Lineage, Next Best Case, Ontologies.]

This is a Graph …

This is a Graph, as well…

…and this is also a Graph! Guess, what Graph could it be?
* https://www.graphable.ai/blog/patient-journey-mapping/

Patient Journey Mapping - Digital Twin
Figure 2: A single patient’s clinical oncology journey
Figure 3: A patient graph schema supporting disease progression and cost exploration
* https://www.graphable.ai/blog/patient-journey-mapping/
Legend: Patient, Doctor's Appointment, Insurance Claim, Results (ICD-10 coded), Doctor
6 Month Journey of a Breast Cancer Grade 1 Patient

Why Graph and how does it help?
“Within the healthcare setting, a digital twin approach can help clinicians and analysts model past and current events and simulate how changes to operations, care, procedures, and programs may influence future outcomes. Digital twins traversing a patient journey map help to reveal real-world obstacles, gaps in care and processes, and other challenges faced by patients and their families.”

The 3 P’s
Pandora Papers*, Panama Papers*, Paradise Papers*

Profitability Paradise + Pandora Papers

Example Germany and other tax agencies
ROI Paradise/Pandora Papers
Pandora Papers:
•In 2021 the BKA bought the Pandora Papers. Investment: 300k euros; by July 2023 this had resulted in 38.4 million euros of additional tax income
•ROI = 6400% over a 2-year time frame (adding operations cost of 300k euros as an approximation)
Paradise Papers:
●Worldwide: $1.36 billion in additional tax income
●Germany: 75 million euros
[Source]
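
The 6400% figure can be reproduced with a quick calculation. The sketch below assumes, as the slide does, that operations cost roughly equals the 300k euro purchase price, and that "ROI" here means additional tax income divided by total cost. The variable names are illustrative only.

```python
# Figures from the slide; the operations cost is the slide's own approximation.
purchase_cost_eur = 300_000
operations_cost_eur = 300_000
additional_tax_income_eur = 38_400_000

total_cost_eur = purchase_cost_eur + operations_cost_eur

# Income divided by total cost (this reading matches the slide's 6400%).
gross_return_pct = additional_tax_income_eur / total_cost_eur * 100

# A stricter net ROI subtracts the cost from the income first.
net_roi_pct = (additional_tax_income_eur - total_cost_eur) / total_cost_eur * 100

print(f"{gross_return_pct:.0f}% gross, {net_roi_pct:.0f}% net")
```

Under the stricter net definition the result is slightly lower, which does not change the overall message of the slide.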

Labeled Property Graph Model
Nodes
•Can have Labels to classify Nodes
•Labels have native indexes
Relationships
•Connect nodes by Type and Direction
Properties
•Attributes of Nodes & Relationships
•Stored as Name/Value pairs
•Can have indexes and composite indexes

[Figure: example graph with Customer, Product and Review nodes connected by WRITES, BUYS and HAS_REVIEW relationships. Property values shown: name: “Dan”, born: 29th May, 1970, twitter: “@dan”; id: “AAJ73840”, name: Radio portable; date: 1st October; text: “I’m happy with my purchase”, rate: 5]
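
To make the model concrete, here is a minimal in-memory sketch of a labeled property graph in Python. It is only an illustration of the concepts (labels, typed and directed relationships, properties), not how Neo4j stores data. The class names, the helper `outgoing`, the relationship directions and the assignment of properties to nodes are assumptions made for this example.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    labels: set                     # e.g. {"Customer"}
    properties: dict = field(default_factory=dict)

@dataclass
class Relationship:
    start: Node
    type: str                       # every relationship has exactly one type
    end: Node
    properties: dict = field(default_factory=dict)

customer = Node({"Customer"}, {"name": "Dan", "twitter": "@dan"})
product = Node({"Product"}, {"id": "AAJ73840", "name": "Radio portable"})
review = Node({"Review"}, {"text": "I'm happy with my purchase", "rate": 5})

# Directions are assumed for illustration.
relationships = [
    Relationship(customer, "BUYS", product),
    Relationship(customer, "WRITES", review),
    Relationship(product, "HAS_REVIEW", review),
]

def outgoing(node, rel_type):
    """Follow relationships of one type that leave `node`."""
    return [r.end for r in relationships if r.start is node and r.type == rel_type]

print([n.properties["name"] for n in outgoing(customer, "BUYS")])
```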

RDBMS vs Graph Model
[Figure: RDBMS Model: tables Person, Person-Friend and Friend holding ANDREAS, DELIA, TOBIAS and MICA. Graph Model: nodes ANDREAS, TOBIAS, MICA and DELIA connected by KNOWS relationships.]

The Whiteboard Model Is the Physical Model

The Graph Problem Problem
Many organizations don’t realize that they have a graph problem

Adobe Behance
Social Network of 10M Graphic Artists

Background
●Social network of 10M graphic artists
●Peer-to-peer evaluation of art and works-in-progress
●Job sourcing site for creatives
●Massive volume: millions of updates (reads & writes) to the Activity Feed
●150 Mongos to 48 Cassandras to 3 Neo4j’s!
Business Problem
●Artists subscribe to, appreciate and curate “galleries” of works of their own and from other artists
●The Activity Feed is how everyone receives updates
●1st implementation was 150 MongoDB instances
●2nd implementation shrunk to 48 Cassandra instances, but it was still too slow and required heavy IT overhead
Solution and Benefits
●3rd implementation shrunk to 3 Neo4j instances
●Saved over $500k in annual AWS fees
●Reduced data footprint from 50TB to 40GB
●Significantly easier to introduce new features like “New projects in your Network”

Graph Data Science

Graph Algorithms Verticals
65+ graph algorithms in Neo4j

Knowledge Graphs

Graphs….Grow!
Graphs allow you to make implicit relationships… explicit
[Figure: User, Website and IPLocation nodes connected by :VISITED, :USED and :SAME_AS relationships]

Knowledge Graphs….Grow!
…and can then group similar nodes… and create a new graph from the explicit relationships…
A graph grows organically - gaining insights and enriching your data
[Figure: User, Website and IPLocation nodes connected by :VISITED, :USED and :SAME_AS relationships; User nodes carry PersonId: 1 or PersonId: 2]

What can you do with a knowledge graph?

Marketing and Recommendations
●Collaborative filtering: users who bought X also bought Y (open-ended pattern matching)
●What items make you more likely to buy additional items in subsequent transactions?
●Traverse hierarchies - what items are similar 4+ hops out?

Financial Domain
●How many flagged accounts are in the applicant’s network 4+ hops out?
●How many login / account variables are in common?
●Add these metrics to your approval process

Life Sciences
●What completes the connections from genes to diseases to targets?
●What genes can be reached 4+ hops out from a known drug target?
●What mechanisms do two drugs have in common?
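
As a rough illustration of the "flagged accounts in the applicant's network" question, the sketch below runs a hop-limited breadth-first search over a tiny, made-up account network in plain Python. The account names, the `links` structure and the function `flagged_within` are hypothetical; in Neo4j the same question would be a variable-length pattern match.

```python
from collections import deque

# Hypothetical account network (undirected links, e.g. shared devices or addresses).
links = {
    "applicant": ["a1", "a2"],
    "a1": ["applicant", "a3"],
    "a2": ["applicant"],
    "a3": ["a1", "a4"],
    "a4": ["a3", "a5"],
    "a5": ["a4"],
}
flagged = {"a3", "a5"}

def flagged_within(start, max_hops):
    """Return the flagged accounts reachable from `start` in at most `max_hops` hops."""
    seen = {start}
    queue = deque([(start, 0)])
    hits = set()
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in links[node]:
            if nxt not in seen:
                seen.add(nxt)
                if nxt in flagged:
                    hits.add(nxt)
                queue.append((nxt, hops + 1))
    return hits

print(sorted(flagged_within("applicant", 4)))
```

The size of the returned set is the kind of metric that could be fed into an approval process.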

Knowledge Graphs
[Figure labels: Structured; Unstructured; Ontologies; Graph Algorithms and Graph Queries; Semantics, Derived Relationships and Additional Context; Natural Relationships]

Large Language Models

LLMs save time and money
75% of GenAI value will come from these four areas:
●Customer Operations
●Marketing & Sales
●Software Engineering
●R&D
Source: The economic potential of generative AI: The next productivity frontier, McKinsey & Company, June 2023.

LLMs Challenges…
●Lack of Enterprise Domain Knowledge
●Outdated Public Sources
●Hallucination
●Probabilistic

Ground LLMs in Neo4j’s Knowledge Graph
●Reliability
●Improve Accuracy and Specificity
●Security and Privacy
●Explainability
●Probabilistic / Deterministic Facts

Setting the context - From a user point of view

About Indexes
Indexes are used to 'land' in the Graph as quickly as possible
Index types: Normal Index, Full Text Index, Vector Index

Vector Example
// Question input (Dutch for "Am I entitled to an annual public transport pass?")
WITH "Heb ik recht op een ov jaarkaart?" AS vraag
// Question to embedding with OpenAI
WITH genai.vector.encode(vraag, "OpenAI", {token: $openaikey, model: $model}) AS embedding
// Vector search with the question embedding as search parameter
CALL db.index.vector.queryNodes('tekst-embeddings', 5, embedding) YIELD node, score
// Based on the nodes found in the vector index, retrieve the context of those nodes
// (Wet = law, tekst = text)
MATCH p=(node)<-[*]-(:Wet)
RETURN score, node.tekst AS text, p


Data Insertion & Loading

Data Insertion & Loading into Neo4j
●neo4j-admin import command
●Cypher CSV Loader
●APOC
●ETL Tools (Apache Hop,…)
●Driver Connections
●3rd Party Platforms (Kafka, Spark)
More about data loading etc. in the Neo4j Documentation

Data Visualization

Neo4j Bloom - Neo4j Visualisation Platform

NeoDash - Dashboarding with Your Graph Data

Use Case Explanation
Digital Twin - An Overview

What is a Digital Twin?
A Digital Twin is a digital representation of an intended or actual real-world physical product, system, or process (a physical twin) that serves as the effectively indistinguishable digital counterpart of it for practical purposes, such as simulation, integration, testing, monitoring and maintenance.

EU Railway Network - Track & Trace
* It has been done before
•Challenge: Legacy technology could not track and analyze train journeys
•Solution: Neo4j Knowledge Graph
•Identify and avoid bottlenecks (DB)

Why do we need a Digital Twin? (a few reasons)

Improved efficiency
It can optimize its operations and reduce costs by simulating different scenarios and making data-driven decisions.

Enhanced safety
It can identify potential hazards and test safety measures to improve safety for passengers and employees.

Predictive maintenance
It can monitor asset condition in real time, predict maintenance needs, and increase asset lifespan.

Improved customer experience
A digital twin can simulate disruptions and help proactively address issues to enhance the customer experience and increase satisfaction.

Another reason why a digital twin might be helpful
https://www.euronews.com/travel/2023/02/21/unspeakable-botch-spain-spends-258-million-on-trains-that-are-too-big-for-its-tunnels

POI’s Included

Unleash the power of the Graph

Abstracting from this Use Case
Digital Twin and Networks are everywhere

Many other use cases are networks or Digital Twins, too!
-Supply Chain
-Mobile Phone Networks
-Power Grids / Gas Grids / Fibre Networks
-Social Networks
-Patient Journeys
-Bill of Materials (BoM)
-Variants Management Graphs (Hard- & Software BoM combined)
-etc.

Mobile Network Operations - Large Mobile Provider
●Digital Twin of Mobile Network
●Planning and operation
○Low signal strength of tower
○Moving equipment around
●Simulation of possible failure scenarios
●Repair & maintenance support
○Supply Chain Impact

Customer Case Study: Logistics and Supply Chain
Plan maritime routes based on distances, costs, and internal logic.

Results:

●Subsecond maritime route planning
●Reduce global carbon emissions by 60,000 tons
●12-16M ROI for OrbitMI customers

“We wanted to create a solution that exploits artificial intelligence, integrates current and historical AIS positions as well as multiple data feeds and APIs. Such an effort would require a world-class infrastructure. That’s why we selected Neo4j.”
David Levy, Chief Marketing Officer, OrbitMI

The Sample Dataset

Sample Data: Railway Network Digital Twin Dataset
Provided Data:
-Consists of a Railway Track Network + Operation Points (OP)
-It includes attributes like:
-Tracks called Sections,
-Stations and other OPs,
-Geo location information,
-Section length and speed,
-other parameters

-Add-on from us: Points of Interest (POI) along tracks

-Format: CSV → Generated from XML
-Origin of Data: https://data-interop.era.europa.eu/search

The Data Explained - In an Easy Way
“The data can be seen as a highway that has ramps throughout its entire path.

Each ramp can be an Operation Point (OP) like a Station, a Switch, etc.

Tracks have a length and a speed associated. Tracks have a start point and an endpoint and interconnect Operation Points.

An Operation Point has a name and is referred to by a relationship with a country code.”

Operation Points (OP) - Data Explanation (Nodes)
CSV Header titles:

●id: the internal number of the OP
●extralabel: the kind of OP we deal with, e.g. Station, Junction, Switch, etc.
●name: the name of an OP
●latitude: latitude of the OP
●longtitude: longitude of the OP
[Figure: OP kinds Station, Small Station, Switch, Border Point, Junction and Passenger Stop, with Geolocation; connecting sections carry sectionlength and speed]

Section Connection Data Explanation (Relationships)
CSV Header titles:

●source: start OP for this section
●target: end OP for this section
●sectionlength: the length in km of that section
●trackspeed: max speed allowed on that section
[Figure: a section from a Station (source) to a Switch (target), labeled with sectionlength and trackspeed; other OP kinds shown: Small Station, Border Point, Junction, Passenger Stop]
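
To show how these four columns turn into a graph, here is a small Python sketch that parses section rows into an adjacency map. It is not the workshop's loading script (that is the Cypher file in the GitHub repository). The sample rows, the OP ids and the function `load_sections` are made up, and sections are treated as traversable in both directions, which is an assumption.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows using the header names from the slide.
SECTIONS_CSV = """source,target,sectionlength,trackspeed
OP1,OP2,12.5,160
OP2,OP3,7.0,120
"""

def load_sections(text):
    """Build an adjacency map: OP id -> list of (neighbour, km, max speed)."""
    graph = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        km = float(row["sectionlength"])
        speed = float(row["trackspeed"])
        # Store the section on both endpoints (assumed bidirectional).
        graph[row["source"]].append((row["target"], km, speed))
        graph[row["target"]].append((row["source"], km, speed))
    return graph

graph = load_sections(SECTIONS_CSV)
print(graph["OP2"])
```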

Point of Interest (POI) Data Explanation
CSV Header titles:

●CITY: City the POI is in or close by
●POI_DESCRIPTION: A short description of the POI
●LINK_FOTO: link to a photo of the POI
●LINK_WEBSITE: link to a website about the POI
●LAT: Latitude of the POI
●LONG: Longitude of the POI
●SECRET: True if not well known, False if well known (e.g. Berlin)

[Figure: a POI with its Geolocation linked to a Station / OP Name (the POI can be outside the city); other OP kinds shown: Small Station, Switch, Border Point, Junction, Passenger Stop]

Data Modeling - Writing down Questions & Answers

What is graph data modeling?
A collaborative effort where the application domain is analysed by stakeholders and developers to come up with the optimal model for use with Neo4j.

Stakeholders include:
●Business analysts
●Architects
●Managers
●Project leaders
●Data Scientists

No one EVER gets it right the FIRST time

The Modeling Workflow
1. Derive the question
2. Obtain the data
3. Develop a model
4. Ingest the data
5. Query/Prove the model

Data Modeling - The Questions are the Input
1.How long is the journey in KM from station X to station Y?
2.How long in KM is an alternative route if a station on my journey is closed?
3.Do I have alternative routes if a station in my journey is down?
4.What is the shortest distance in KM between stations X and Y?
5.Can I geolocate stations in my static rail network?
6.What are the stations expecting the most traffic?
7.What POIs are along my route?
Note: Aggregations are also possible, but we focus on the “graphy” queries here!

Expected Answers
1.Your journey is 30 stations and 239km long
2.Your alternative route is 36 stations and 271km long
3.The shortest route between X and Y is XXX km
4.The geo location of your station is Lat / Long
5.The station with the most traffic is … ?
6.The following POIs are along your route …
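
The first answers (journey length, alternative route when a station is closed) boil down to a weighted shortest-path search. The Python sketch below uses Dijkstra's algorithm on a tiny invented network to show the idea. The station names, distances and the function `shortest_route` are hypothetical, and this is not how the workshop computes the answers (that is done with Cypher).

```python
import heapq

# Hypothetical network: station -> {neighbour: section length in km}.
network = {
    "X": {"A": 10, "B": 25},
    "A": {"X": 10, "Y": 15},
    "B": {"X": 25, "Y": 20},
    "Y": {"A": 15, "B": 20},
}

def shortest_route(start, goal, closed=frozenset()):
    """Dijkstra: return (km, stations), or None if no route avoids `closed`."""
    queue = [(0, start, [start])]
    done = set()
    while queue:
        km, station, route = heapq.heappop(queue)
        if station == goal:
            return km, route
        if station in done:
            continue
        done.add(station)
        for nxt, length in network[station].items():
            if nxt not in done and nxt not in closed:
                heapq.heappush(queue, (km + length, nxt, route + [nxt]))
    return None

print(shortest_route("X", "Y"))                # normal journey
print(shortest_route("X", "Y", closed={"A"}))  # alternative if station A is closed
```

Passing a set of closed stations is one simple way to model questions 2 and 3: the search skips those stations and reports the next-best route, or `None` if there is none.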

Data Modeling
Building the Model

Tools for Graph Data Modeling
Arrows Tool → https://arrows.app

Data Modeling Using the Questions as Input
Remember those questions?
1.How long is the journey in KM from station X to station Y?
2.How long in KM is an alternative route if a station on my journey is closed?
3.Do I have alternative routes if a station in my journey is down?
4.What is the shortest distance in KM between stations X and Y?
5.Can I geolocate stations in my static rail network?
6.What are the stations expecting the most traffic?
7.What POIs are along my route?
Blue → Nodes / Properties? Orange → Relationship Types

Building a First Data Model 1/6
-Nouns in your questions are the nodes, verbs are the relationships
-Nodes have a Label, Relationships MUST have a Type
-Properties should help to uniquely identify Nodes and/or answer questions

Building a First Data Model 6/6
We will add more Labels to this node, e.g. Switch, BorderPoint, etc.

You DON’T have to do it again and again!
-Use Case Template repository* that is growing
-https://neo4j.com/developer/industry-use-cases/finserv/
* And we have more templates to contribute

Workshop Time
Let’s build the solution together …

Let’s Do it Together …
1.Create a Neo4j Graph instance via any of:
a.Neo4j Aura
i.You can spin up a free Aura instance here (https://console.neo4j.io/). If you already have an Aura Free database in use and don't want to clear it, you can use one of the following options for this workshop
b.Neo4j Sandbox: use a "Blank Sandbox"
c.Neo4j Desktop
i.If you are using Neo4j Desktop, you will need to ensure that APOC is added to any graph you create. Installation instructions can be found here.

2.Open the GitHub page at: https://github.com/kvegter/gsummit2024
a.Open up the file: cypher/load-all-data.cypher
b.Copy and paste the file contents into your Neo4j browser.

(Basic) Cypher Hands-On
See cypher/all_queries.cypher for a cheat-sheet of all the queries used in this workshop.

Cypher: Powerful and Expressive Query Language
MATCH (:Person {name: "Dan"})-[:LOVES]->(:Person {name: "Ann"})
[Figure: the pattern above annotated with NODE, LABEL, PROPERTY and RELATIONSHIP markers and the keyword CREATE; Dan -LOVES-> Ann]
MATCH <PATTERN> RETURN <DATA>

Cypher Check Up
// What is this Query Doing?
MATCH (a:Person)-[:FRIEND_OF]->(b:Person)<-[:FRIEND_OF]-(c:Person)
WHERE a.name = "Niels de Jong"
AND c.name = "Marco De Luca"
RETURN b.name AS Name, b.id AS FriendID


Cypher Hands-on Questions
// Find 50 Operation Points, ordered by Neo4j internal ids
MATCH (op:OperationalPoint)
RETURN op
ORDER BY id(op)
LIMIT 50;

// Find all Operation Point names in the city of Amsterdam
MATCH (op:OperationalPointName)
WHERE op.name CONTAINS 'Amsterdam'
RETURN op.name;


Cypher Hands-on Questions
// Find the Station ids for Berlin and Amsterdam
// HINT: Use the pattern (OperationalPointName)<-[:NAMED]-(Station)
MATCH (n:OperationalPointName)<-[:NAMED]-(station:Station)
WHERE n.name STARTS WITH 'Berlin'
RETURN n.name, station.id;

MATCH (n:OperationalPointName)<-[:NAMED]-(station:Station)
WHERE n.name STARTS WITH 'Amsterdam'
RETURN n.name, station.id;

Cypher Hands-on Questions
// Find the shortest path between Frankfurt and Amsterdam
// (Frankfurt-id: DE000FF, Amsterdam-id: NLASD)
// TIP: use shortestPath
MATCH (op1:OperationalPoint WHERE op1.id = 'NLASD')
MATCH (op2:OperationalPoint WHERE op2.id = 'DE000FF')
MATCH sp=shortestPath((op1)-[:SECTION*]-(op2))
RETURN sp;
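
Note that Cypher's shortestPath returns the path with the fewest relationships (hops), not the smallest total sectionlength. The Python sketch below illustrates that difference with a breadth-first search on invented data. Only the ids NLASD and DE000FF come from the slide; the intermediate OP ids, the distances and the function `fewest_hops` are made up.

```python
from collections import deque

# Hypothetical sections: (from, to, km).
sections = [
    ("NLASD", "OP_A", 5), ("OP_A", "OP_B", 5), ("OP_B", "DE000FF", 5),
    ("NLASD", "OP_C", 40), ("OP_C", "DE000FF", 40),
]

# Undirected neighbour lists, mirroring the undirected -[:SECTION*]- pattern.
neighbours = {}
for a, b, _km in sections:
    neighbours.setdefault(a, []).append(b)
    neighbours.setdefault(b, []).append(a)

def fewest_hops(start, goal):
    """Breadth-first search: the first path that reaches `goal` has the fewest hops."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in neighbours[path[-1]]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

print(fewest_hops("NLASD", "DE000FF"))
```

In this toy network the two-hop route is 80 km long while the three-hop route is only 15 km, so answering a "shortest distance in KM" question needs a weighted search on sectionlength rather than a hop count.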

Cypher Hands-on Questions
// Find the shortest path between Amsterdam (POI) and Berlin (POI)
MATCH (a:POI), (b:POI)
WHERE toLower(a.city) CONTAINS "amsterdam"
AND toLower(b.city) CONTAINS "berlin"
WITH a, b
MATCH p = shortestPath((a)-[*]-(b))
RETURN p;

Build your own Graph Solution with NeoDash

Go to the GitHub page here and follow instructions:

NeoDash
Open http://neodash.graphapp.io/ when you have a local database

Open https://neodash.graphapp.io/ when you have a sandbox or Aura db

Import the dashboard from GitHub

How Could this “Mini Digital Twin” be Enhanced?
Extending the Graph is easy, e.g. with further track information:
-For example with possible pace on track, track quality metrics, mobile network coverage, etc.
-Sensor information about switches to quickly find problems and re-route traffic
-The more related information is added, the more use cases could benefit from the digital twin

Going even further:
-Additional information about trains running on the tracks enables additional benefits:
-Track and schedule optimization
-Simulation of new track (to grow traffic)

Q & A

Thanks for Graphing with me!
Contact us:
[email protected]

Additional Information

Neo4j Graph Data Platform Overview
[Figure: platform diagram. Components: Graph Transactions, Graph Analytics & Data Science, Data Integration, Drivers & APIs, Discovery & Visualization (Bloom), Dev. & Admin Tooling, Analytics Tooling. Users: Developers, Admins, Applications, Business Users, Data Analysts, Data Scientists.]

Cypher
MATCH (u:Customer {customer_id: 'customer-one'})-[:BOUGHT]->(p:Product)<-[:BOUGHT]-(peer:Customer)-[:BOUGHT]->(reco:Product)
WHERE NOT (u)-[:BOUGHT]->(reco)
RETURN reco AS Recommendation, count(*) AS Frequency
ORDER BY Frequency DESC LIMIT 5;

SQL
SELECT product.product_name AS Recommendation,
       count(1) AS Frequency
FROM product, customer_product_mapping, (
    SELECT cpm3.product_id, cpm3.customer_id
    FROM customer_product_mapping cpm,
         customer_product_mapping cpm2,
         customer_product_mapping cpm3
    WHERE cpm.customer_id = 'customer-one'
      AND cpm.product_id = cpm2.product_id
      AND cpm2.customer_id != 'customer-one'
      AND cpm3.customer_id = cpm2.customer_id
      AND cpm3.product_id NOT IN (
          SELECT DISTINCT product_id
          FROM customer_product_mapping cpm
          WHERE cpm.customer_id = 'customer-one')
) recommended_products
WHERE customer_product_mapping.product_id = product.product_id
  AND customer_product_mapping.product_id = recommended_products.product_id
  AND customer_product_mapping.customer_id = recommended_products.customer_id
GROUP BY product.product_name
ORDER BY Frequency DESC
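
For readers who want to check what the recommendation query computes, the Python sketch below applies the same logic to a tiny invented dataset: for every peer who shares a product with the customer, each product the customer does not own yet is counted once per shared product (one count per matched path, as in the Cypher query). The purchase data and the function `recommend` are hypothetical.

```python
from collections import Counter

# Hypothetical purchase data: customer -> set of products bought.
bought = {
    "customer-one": {"radio", "lamp"},
    "peer-a": {"radio", "battery", "cable"},
    "peer-b": {"lamp", "battery"},
    "peer-c": {"desk"},
}

def recommend(customer, limit=5):
    """Rank products bought by peers who share at least one product with `customer`."""
    mine = bought[customer]
    counts = Counter()
    for peer, items in bought.items():
        if peer == customer:
            continue
        shared = mine & items
        # One count per shared product, mirroring one row per matched path.
        for reco in items - mine:
            counts[reco] += len(shared)
    ranked = [(product, n) for product, n in counts.most_common() if n > 0]
    return ranked[:limit]

print(recommend("customer-one"))
```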

What Does the Sample Data Look Like?
●Operation Points Data*:
●Tracks & Speed Data*:
●Point-of-Interest Data:
○https://github.com/neo4j-field/gsummit2023/tree/main/data

* Translation of shortcuts and certain Operational Units into English done by the Neo4j team!
Origin of data:
○https://data-interop.era.europa.eu/search

The data set is publicly available and can be used under the license and conditions mentioned by OpenDB. We have translated the data to make it easier to understand for everyone outside Germany.

Building Our Solution

High Level Approach Building a Graph Solution
Graph Solution - First Phase

1. Domain
Understand the domain you try to model

2. Sample Data
Get accurate sample data you understand

3. Q & A from Business
Define Questions & Answers the Business wants to understand

4. Identify entities & connections
Find entities & connections that are part of your data model

5. First Data Model
Build your first data model with all stakeholders involved and load sample data

Go to next steps

High Level Approach Building a Graph Solution
Graph Solution - Second Phase
Coming from the previous steps

6. Test questions
Test your questions against your model and data by writing Cypher queries

7. Refine Data Model
Refine your data model if that improves the answers

8. Scalability
If possible, test scalability. If not, make sure your data model can scale.

9. Interactive Components
Build dashboards, Bloom perspectives, Jupyter notebooks, or other interactive components to demonstrate your graph data.

Stakeholders Recommended to Build a Graph Solution

Domain Experts
●Provide questions they want to ask
●Add domain knowledge
●Know what is missing today
●Provide answers and rating for results to above questions
●Help to refine data model objects like labels, relationships, etc.

Consultants / Developers
●Build the graph
●Translate questions into queries / scripts
●Build and operate data loading (ETL process)
●Build UIs, Dashboards, etc.
●Maintain / extend graph