Unleashing the Power of Connections: Neo4j Mastery for Enterprise-Scale Graph Solutions
Oct 18, 2024
Bryan Lee, Professional Services Architect
Agenda
●Introduction to property graph
●(Introduction)-[:to]->(Cypher)
●Graph best practices
﹣Graph modelling workflow
﹣Graph modelling for performance
●Data integration
●Visualization
●Q & A
Neo4j Inc. All rights reserved 2024
Introduction
What is a property graph?
About me
name: Bryan Lee, age: 30
WORKS_AT → name: Neo4j, role: Consulting Engineer
LIVES_IN → country: Singapore, town: Tampines
PLAYS → sports: Badminton
STUDIED → course: Materials Engineering
You only need to know 4 things
Graph components
Node (Vertex)
•The main data element from which graphs are constructed
Relationship (Edge)
•A link between two nodes
•Direction
•Type
Example: (Keanu Reeves)-[ACTED_IN]->(The Matrix)
A node without relationships is permitted; a relationship without nodes is not.
Property graph database
Node (Vertex)
Relationship (Edge)
Label
•Define node role (optional)
•Can have more than one
Properties
•Enrich nodes and relationships
Example:
•Person node with name: Keanu Reeves
•Movie node (also labelled Action) with title: The Matrix, released: 1999, tagline: Welcome…
•ACTED_IN relationship with role: Neo
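The four components map directly onto Cypher syntax. A minimal sketch using the values from the slide; treating Action as a second label on the movie node is an assumption about the diagram, and the tagline is left out because the slide truncates it.

```cypher
// Two nodes, one typed and directed relationship, and properties on all three.
// Assumption: Action is a second label on the same node as Movie.
CREATE (keanu:Person {name: 'Keanu Reeves'})
CREATE (matrix:Movie:Action {title: 'The Matrix', released: 1999})
CREATE (keanu)-[:ACTED_IN {role: 'Neo'}]->(matrix)
RETURN labels(keanu) AS personLabels, labels(matrix) AS movieLabels
```

The RETURN clause only echoes the labels back, which is a quick way to confirm that a node can carry more than one.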
Conceptual Mapping Relational → Graph
Rows → Nodes
Joins → Relationships
Table Names → Labels
Columns → Properties
(Intro)-[to]->(Cypher)
Overview of Cypher
Cypher Example:
(:Person {name: 'Ann'}) -[:OWNS]-> (:Car {brand: 'Volvo'})
1. ASCII Art code styling
2. It’s a Declarative language
3. Similar to SQL, but not SQL!
Simplest possible pattern: a single node
(:Person {name:'Alice'})
No upper limit on complexity
()--()--()--()--()
Cypher: Powerful and expressive query language
CREATE (:Person {name: 'Jane'})-[:LOVES]->(:Person {name: 'Arlo'})
●(:Person {name: 'Jane'}) is a NODE with Label Person and Property name: 'Jane'
●-[:LOVES]-> is a RELATIONSHIP of type LOVES
●(:Person {name: 'Arlo'}) is a NODE with Label Person and Property name: 'Arlo'
Cypher: Matching
MATCH (p:Person {name: 'Jane'})-[:MARRIED_TO]->(spouse:Person)
RETURN p, spouse
●(p:Person {name: 'Jane'}) is a NODE with Variable p, Label Person and Property name: 'Jane'
●-[:MARRIED_TO]-> is a RELATIONSHIP of type MARRIED_TO
●(spouse:Person) is a NODE with Variable spouse and Label Person
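The matching pattern can be tried end to end with a small script. This is a sketch only: the MARRIED_TO relationship between Jane and Arlo is sample data invented for the illustration, and the statements are separated by semicolons as Neo4j Browser and cypher-shell expect.

```cypher
// Sample data (hypothetical): Jane is married to Arlo.
CREATE (:Person {name: 'Jane'})-[:MARRIED_TO]->(:Person {name: 'Arlo'});

// Bind the start node to p, follow the outgoing MARRIED_TO relationship,
// and bind the node at the other end to spouse.
MATCH (p:Person {name: 'Jane'})-[:MARRIED_TO]->(spouse:Person)
RETURN p.name AS person, spouse.name AS spouse;
```

Returning properties instead of whole nodes gives a plain table, which is often easier to read outside a graph visualization.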
GQL: The ISO Standard for Graphs has Arrived
Why do you need to know about Graph Database?
1980+ Relational Database (SQL)
●Structured, schema-based store/retrieve
●Optimized for real-time transactional processing/reporting
2005+ NoSQL Database (App Based)
●Scale-out, simple model store/retrieve
●Optimized for application-driven, volume throughput
2013+ Graph Database (GQL)
●Flexible, scale-out store/retrieve + analytics
●Optimized for data connections, real-time transactions and analytics
●Natively store & query relationships
●Queries run 1000x faster at scale, up to 1000+ hops
GQL: First database language standard in 40 years.
Graph Best Practices
Model & Query
What is graph data modeling?
A collaborative effort where the application domain is analysed by
stakeholders and developers to come up with the optimal model for use
with Neo4j.
Stakeholders include:
●Business analysts
●Architects
●Managers
●Project leaders
●Data Scientists
The Modeling Workflow
1. Derive the question
2. Obtain the data
3. Develop a model
4. Ingest the data
5. Query/Prove the model
No one EVER gets it right FIRST time
Digital Twin In Action
Use Case Explanation
What is a Digital Twin?
A Digital Twin is a digital
representation of a […] real-world
physical product, system, or process
[…] that serves as the effectively
indistinguishable digital counterpart of
it for practical purposes, such as
simulation, integration, testing,
monitoring and maintenance.
Graph Modelling
Derive the question
Digital Twin
●AI Breakthroughs
●Oil & Gas Production Vessels
●Maintenance Records
●Alerts
●Oil Wells
●Monitoring Sensors
●Equipment
●Oil & Gas Pipelines
●Batteries/Collection Point
●Components
Data Modeling – Example Domain Questions
1.Which Equipment and Vessels are related to the Wells with the top 5 production yields?
﹣ Wells with high production rates may place more strain on separators and compressors, requiring
more frequent maintenance
2.Which alternative Wells could be used to maintain production levels if the production
capacity of a Pipeline is reduced?
﹣ A Well is under maintenance and we need to maintain production levels
3.How many components are affected if I need to upgrade a Vessel?
﹣ Which assets will be affected if we take a particular well or pipeline offline for maintenance?
Graph Modelling
Identifying entities and connections
Identify Entities from Questions
Entities are the nouns in the domain questions:
1. What ingredients are used in a recipe?
2. Who is married to this person?
●The generic nouns often become labels in the model
●Use domain knowledge to decide how to further group or differentiate entities
Example labels: Person, Recipe, Ingredient
Identify Connections between Entities
Connections are the verbs in the domain questions:
●What ingredients are used in a recipe?
●Who is married to this person?
(Recipe)-[USES]->(Ingredient)
(Person)-[MARRIED]->(Person)
Using question 1 …
1.Which Equipment and Vessels are located at Battery Fusion 1-A?
﹣ Understanding which equipment and vessels are present at a specific battery (e.g., Battery Fusion 1-A) is critical for
effective operational management.
﹣ If Battery Fusion 1-A has a high production rate, this can place additional strain on separators, compressors, and
other equipment.
Model so far:
●Labels: Equipment, Battery, Vessel
●Relationships: LOCATED_AT (Equipment to Battery, Vessel to Battery)
●Example nodes: Fusion 1-A, Free Water Knock Out, Oil Compressor
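Question 1 can be checked against this first version of the model with a few lines of Cypher. A sketch under assumptions: the `name` property is not shown on the slide, and the Oil Compressor is taken to be Equipment and the Free Water Knock Out a Vessel.

```cypher
// Assumed property: name. Labels and LOCATED_AT come from the slide.
CREATE (b:Battery {name: 'Fusion 1-A'})
CREATE (:Equipment {name: 'Oil Compressor'})-[:LOCATED_AT]->(b)
CREATE (:Vessel {name: 'Free Water Knock Out'})-[:LOCATED_AT]->(b);

// Question 1: which Equipment and Vessels are located at Battery Fusion 1-A?
MATCH (asset)-[:LOCATED_AT]->(:Battery {name: 'Fusion 1-A'})
WHERE asset:Equipment OR asset:Vessel
RETURN labels(asset) AS labels, asset.name AS name;
```

The query starts from the single Battery node and follows LOCATED_AT inwards, so the question reads almost word for word as a pattern.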
Using question 2 …
2.What are the maintenance records for Equipment and Vessels?
﹣ Planning future maintenance activities becomes more efficient, especially if there’s a need to balance production
demands with maintenance downtimes.
Model so far:
●Labels: Equipment, Battery, Vessel, Maintenance Record
●Relationships: LOCATED_AT (Equipment to Battery, Vessel to Battery), HAS_MAINTENANCE_RECORD (Equipment to Maintenance Record, Vessel to Maintenance Record)
●Example nodes: Fusion 1-A, Free Water Knock Out, Oil Compressor
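A possible query for question 2, again as a sketch. The label spelling `MaintenanceRecord`, the `name` and `type` properties, and the two record values are assumptions made for the example; only HAS_MAINTENANCE_RECORD and the asset labels come from the slide.

```cypher
// Hypothetical sample data; MaintenanceRecord and its type property are assumed names.
CREATE (e:Equipment {name: 'Oil Compressor'})
CREATE (v:Vessel {name: 'Free Water Knock Out'})
CREATE (e)-[:HAS_MAINTENANCE_RECORD]->(:MaintenanceRecord {type: 'Inspection'})
CREATE (v)-[:HAS_MAINTENANCE_RECORD]->(:MaintenanceRecord {type: 'Periodic Maintenance'});

// Question 2: maintenance records for each piece of Equipment and each Vessel.
MATCH (asset)-[:HAS_MAINTENANCE_RECORD]->(r:MaintenanceRecord)
RETURN asset.name AS asset, collect(r.type) AS records;
```

Because both Equipment and Vessel use the same relationship type, one pattern covers both without a UNION.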
Using question 3 …
3.What are all the alerts for each Battery?
﹣ Prioritize alerts that affect production rates or safety, ensuring that high-risk issues are addressed promptly.
﹣ Use predictive maintenance insights to schedule repairs before issues escalate, using alerts as indicators of
equipment health.
Model so far:
●Labels: Equipment, Battery, Vessel, Maintenance Record, Alert
●Relationships: LOCATED_AT, HAS_MAINTENANCE_RECORD, HAS_ALERT
●Example nodes: Fusion 1-A, Free Water Knock Out, Oil Compressor, Methane Alert, Inspection, Periodic Maintenance
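Question 3 needs a two-hop pattern from Battery to Alert. The flattened diagram does not show which node HAS_ALERT starts from, so the sketch below assumes alerts hang off Equipment (or Vessel) nodes; the `name` property is also an assumption.

```cypher
// Assumptions: alerts are attached to Equipment/Vessel nodes, and nodes carry a name property.
CREATE (b:Battery {name: 'Fusion 1-A'})
CREATE (e:Equipment {name: 'Oil Compressor'})-[:LOCATED_AT]->(b)
CREATE (e)-[:HAS_ALERT]->(:Alert {name: 'Methane Alert'});

// Question 3: all alerts for each Battery, reached through the assets located there.
MATCH (b:Battery)<-[:LOCATED_AT]-(asset)-[:HAS_ALERT]->(a:Alert)
RETURN b.name AS battery, collect(a.name) AS alerts;
```

If alerts were attached directly to the Battery instead, the pattern would shrink to a single hop; the question itself would stay the same.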
Graph Modelling
Develop a model for performance
Best Practices
1.Avoid Dense nodes
2.Create Indexes/Constraints
3.Avoid Cartesian product
Dense nodes
Avoid high density nodes using intermediate nodes
Why are super nodes bad?
•Blowing up your traversals
MATCH (bob:TwitterUser { name: "bob" })
WITH bob
MATCH p=(bob)-[:FOLLOWS*]-(:TwitterUser { name: "karla" })
RETURN p
•Slowing down writes
MERGE on a relationship locks the nodes on both sides
•Memory issues
Out-of-memory errors & long GC cycles
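The traversal above blows up because `[:FOLLOWS*]` has no upper bound and asks for every path. One common mitigation, sketched here, is to cap the path length and ask for a single shortest path. The 6-hop limit is an arbitrary illustrative value, and the result is one path rather than all paths, so this answers a narrower question than the original query.

```cypher
// Bounded variant: at most 6 hops (arbitrary limit) and only one shortest path,
// instead of enumerating every FOLLOWS path between the two users.
MATCH p = shortestPath(
  (bob:TwitterUser {name: 'bob'})-[:FOLLOWS*..6]-(karla:TwitterUser {name: 'karla'})
)
RETURN p
```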
So what do we do?
•Relationship directionality
(:User)-[:FOLLOWS]-(:User { name: "ladygaga" }) (80 million + relationships)
(:User)<-[:FOLLOWS]-(:User { name: "ladygaga" }) (121,000 relationships)
•Join hints
MATCH (me:Person {name:'Me'})-[:FRIENDS_WITH]-(friend)-[:LIKES]->(a:Artist)<-[:LIKES]-(me)
USING JOIN ON a
RETURN a, count(a) AS likesInCommon
•Label and relationship segregation
(:TwitterUser) for the regular users
(:Celebrity) or (:VerifiedUser) for the mega-accounts
•Super node refactoring
(:Airport)-[:HAS_DAY]->(:AirportDay)-[:FLIGHT]->(:Destination)
(user:TwitterUser)-[:FAN_OF]->(gaga:TwitterUser)
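The Airport refactoring can be sketched as a one-off migration. The starting shape (flights attached directly to the Airport node), the `code`, `number` and `date` properties, and the sample values are all assumptions made for the illustration; only the target pattern comes from the slide.

```cypher
// Hypothetical starting point: every flight hangs directly off one Airport node.
CREATE (sin:Airport {code: 'SIN'})
CREATE (kul:Destination {code: 'KUL'})
CREATE (sin)-[:FLIGHT {number: 'F1', date: date('2024-10-18')}]->(kul)
CREATE (sin)-[:FLIGHT {number: 'F2', date: date('2024-10-19')}]->(kul);

// Refactor: one AirportDay node per airport and date, with each flight moved onto it.
MATCH (a:Airport)-[f:FLIGHT]->(d:Destination)
MERGE (a)-[:HAS_DAY]->(day:AirportDay {date: f.date})
CREATE (day)-[:FLIGHT {number: f.number}]->(d)
DELETE f;
```

After the migration the Airport node holds one relationship per day instead of one per flight, so a query for a given date touches only that day's flights.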
How are indexes used in Neo4j?
Indexes are only used to find the starting points for queries.
●Relational: Use index scans to look up rows in tables and join them with rows from other tables
●Graph: Use indexes to find the starting points for a query
Indexes in Neo4j
Faster Lookups
Indexes help Neo4j locate data more efficiently by directing queries to specific parts of the graph, avoiding full scans.
Optimized Query Execution
Indexes allow Neo4j to bypass irrelevant nodes when looking for specific labels or properties, speeding up execution.
Reduced Search Time
Indexes drastically cut down the number of nodes or relationships scanned, retrieving data directly from indexed fields.
Lower Resource Consumption
Faster lookups reduce CPU and memory usage and improve scalability, especially for large graphs and complex queries.
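A short sketch of creating an index and checking that a query uses it. The index name `person_name` is an arbitrary choice, and the query reuses the Person/MARRIED_TO pattern from the Cypher introduction; the plan operators mentioned in the comment are what one would typically expect, not a guaranteed outcome.

```cypher
// Index on the property used to find the starting node.
// The index name person_name is an arbitrary choice.
CREATE INDEX person_name IF NOT EXISTS
FOR (p:Person) ON (p.name);

// PROFILE shows the executed plan. With the index online, the starting node
// should be located with a NodeIndexSeek rather than a NodeByLabelScan.
PROFILE
MATCH (p:Person {name: 'Jane'})-[:MARRIED_TO]->(spouse:Person)
RETURN spouse.name;
```

Only the lookup of Jane goes through the index; the hop to the spouse is a relationship traversal from that starting point.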
Constraints in Neo4j
Unique Constraints
Ensure unique property values across nodes or relationships for a specific label/type. Can apply to a single property or multiple properties.
Key Constraints
Ensure that specific properties exist on all nodes or relationships, and that their combination is unique.
Existence Constraints
Guarantee that required properties exist on all nodes or relationships with a given label or type.
Type Constraints
Ensure that properties on nodes or relationships have the required data type.
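One example per constraint kind, as a sketch that assumes Neo4j 5 syntax. The constraint names and the choice of properties are illustrative; key, existence and type constraints require Enterprise Edition, and type constraints need a recent 5.x release.

```cypher
// Unique: no two Person nodes share the same name.
CREATE CONSTRAINT person_name_unique IF NOT EXISTS
FOR (p:Person) REQUIRE p.name IS UNIQUE;

// Key: title and released must both exist and be unique in combination.
CREATE CONSTRAINT movie_key IF NOT EXISTS
FOR (m:Movie) REQUIRE (m.title, m.released) IS NODE KEY;

// Existence: every ACTED_IN relationship must have a role.
CREATE CONSTRAINT acted_in_role_exists IF NOT EXISTS
FOR ()-[r:ACTED_IN]-() REQUIRE r.role IS NOT NULL;

// Type: released must be stored as an integer.
CREATE CONSTRAINT movie_released_type IF NOT EXISTS
FOR (m:Movie) REQUIRE m.released IS :: INTEGER;
```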
Controlling Cardinalities
●Input rows into an operator
○= times the operator is executed
○= multiplies the number of db-hits and individual output rows
●Getting output rows down reduces input rows for the next operator
○e.g. if you're only interested in distinct results, not the number of paths, get the rows
down to distinct in-between values
○Or reduce the number of times an index lookup is done
○Or use the label or index with the higher selectivity to drive the query
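The "distinct in-between values" advice can be illustrated with the movie graph used earlier. This is a sketch; the query itself is an invented example, not one from the deck.

```cypher
// Without the intermediate DISTINCT, a co-actor who shares several movies with
// Keanu Reeves appears once per shared movie, and the second MATCH runs once
// for each of those duplicate rows.
MATCH (:Person {name: 'Keanu Reeves'})-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(coActor:Person)
WITH DISTINCT coActor
MATCH (coActor)-[:ACTED_IN]->(m:Movie)
RETURN coActor.name AS coActor, count(m) AS movies
```

Collapsing to distinct co-actors also keeps the final count honest, since duplicate input rows would otherwise inflate it.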
Controlling Cardinalities
Avoid Cross Products like this
MATCH (m:Movie), (p:Person)
RETURN count(distinct m), count(distinct p)
Instead do one MATCH at a time
MATCH (m:Movie) WITH count(*) AS movies
MATCH (p:Person) RETURN movies, count(*) AS people
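The row arithmetic behind this advice is easy to see on a tiny data set. A sketch with invented sample data; the row counts in the comments assume the database contains nothing else.

```cypher
// Hypothetical sample data: 2 movies and 3 people.
CREATE (:Movie {title: 'A'}), (:Movie {title: 'B'}),
       (:Person {name: 'X'}), (:Person {name: 'Y'}), (:Person {name: 'Z'});

// Cross product: the two patterns are unconnected, so every movie is paired
// with every person. On an otherwise empty database that is 2 * 3 = 6 rows.
MATCH (m:Movie), (p:Person)
RETURN count(*) AS pairs;

// One MATCH at a time: the first count collapses to a single row before the
// second MATCH runs, so only 1 * 3 rows are produced.
MATCH (m:Movie) WITH count(*) AS movies
MATCH (p:Person)
RETURN movies, count(*) AS people;
```

With thousands of nodes per label the gap between the product and the sum is what makes the first form expensive.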
Visualization
Neo4j Bloom
Data Visualization with Neo4j Bloom
Neo4j’s user-friendly graph database visualization, exploration and collaboration
tool.
●Visually explore graphs
●Prototype faster
●Visualize and discover
●Easy for non-technical users
Who is Bloom for?
Graphs make sense and are useful to more people
Analyst
Developer
Data Scientist
Neo4j Bloom Deployments
●Workspace - Explore powered by Bloom
﹣Bloom Basic features available from the Explore tab in the free and
convenient Workspace application on Aura Free and Professional
●Aura
﹣Alternatively, use Bloom Basic features included with all Aura tiers as a
standalone application available within the Aura console (bloom.neo4j.io)
﹣Bloom Enterprise is available to Aura Enterprise customers under a free EAP
●Self-managed
﹣With a paid license, choose to host Bloom with all its Enterprise features on
your own web server or access it directly from the Bloom Enterprise plugin
Perspectives
●A business view of the graph
●Perspective contains:
﹣Node & Relationship categories
﹣Default Styling
﹣Saved Cypher
Search in Bloom
●Empty search box suggests that the user can enter a category or relationship type
●Subtle hints (Tab to autocomplete or Return to run)
●Array of suggestions when you type, with a typeahead hint
●UI tweaks to better signal what each suggestion means
●Proactive pattern match suggestions appear when autocompleting on a graph element or value
●Pattern appears as-is in search box (WYSIWYG)
Layouts in Bloom
●Hierarchical layout
﹣Vertical and horizontal options for top-down, process or dependency graphs
●Standard layout
﹣Standard force-directed layout, efficient for displaying graphs of all sizes and complexities
Search Phrases
●Create search phrases that users can type into the search bar or invoke via a deep link to quickly execute custom Cypher
●Can include parameters
●Stored in the Perspective
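A search phrase pairs a piece of text containing a parameter with a Cypher query that uses it. A sketch based on the digital twin model: the phrase text, the `$battery` parameter name and the `name` property are hypothetical.

```cypher
// Search phrase text (hypothetical): Assets at $battery
// Bloom substitutes what the user types for $battery and draws the returned
// nodes and relationships in the scene.
MATCH (asset)-[r:LOCATED_AT]->(b:Battery {name: $battery})
RETURN asset, r, b
```

Returning the relationship as well as both nodes lets the scene show how the assets connect to the battery, not just the nodes on their own.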
Scene Actions
●A different way to interact with your graph (DevRel Blog on Medium)
●Think Search phrases, but using selections as input
○Invoked from context menu
○Uses the implicit parameters $nodes and $relationships: Bloom passes the values of the parameters as Lists of ids
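A Scene Action query looks like a search phrase query that filters on the selection. A sketch, assuming the digital twin model: HAS_ALERT and the Alert label are used only as an illustration, and `id()` is used because the slide says the parameters hold lists of ids.

```cypher
// $nodes is supplied by Bloom as a list of ids for the nodes currently selected
// in the scene. Expanding to alerts is only an example of what an action might do.
MATCH (n)-[r:HAS_ALERT]->(a:Alert)
WHERE id(n) IN $nodes
RETURN n, r, a
```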
Date/Time & Timezones
●Editing, Filtering, Rule-based styling
●Translate timezones in the scene to any chosen offset
Deep Linking
●Link into Bloom from another application
●Pass in context to prime the search suggestions (pattern or complex search phrase) or to run a query
●Control the perspective shown
View in Bloom
Scene Saving & Sharing - Bloom Enterprise
●Save multiple scenes
●Continue work later
●Share them with colleagues - enhanced productivity
Graph Data Science Integration
●Run select GDS algorithms from within Bloom
●Results are not written to the DB (stored in scenes if using Full Access)
●Automatically apply styling (color & size)
Visualization
NeoDash
What is NeoDash?
NeoDash is an open-source, low-code Dashboard Builder for Neo4j.
NeoDash is set to become an officially supported product.
Build interactive dashboards with tables, graphs, bar/line/pie charts, maps and more,
in minutes.
●>1000 active users
●>100 countries
●37 contributors
●Neo4j Labs Project*
User Quotes
“This is super! NeoDash lets us build a web
interface within minutes, compared to weeks with
our existing tooling.”
“We love the concept! We’ve built a number of
dashboards to visualize our internal business
structure as a graph, and we deployed it to 100
users on AWS.”
“NeoDash enables us to make our complex data
model visible so quickly. The ability to link into
Bloom is a great value add and immensely
accelerates our project.”
“Neo4j PS helped us customize the application to
our organization’s style, and add custom
visualizations for our use-case.”
Where does NeoDash fit in?
vs Neo4j Browser or Bloom
Development / Administration: Browser
For developers / DBAs
- Ad-hoc queries
- Small graph views
- Testing
- Simple styling
Exploration: Neo4j Bloom
Graph exploration by Data Analysts
- No-code (search UI)
- Larger graphs
- Powerful graph styling
- Interactive drilldowns
Reporting: NeoDash
Curated graph reports for domain experts
- Low-code dashboards
- Tailored views
- High customizability
Dashboard Editor
NeoDash cuts down time-to-value by being a true low-code application:
●Use Cypher queries to populate reports
●Drag-and-drop interface
●Build and publish dashboards in a matter of minutes
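Each report in a dashboard is driven by a Cypher query that returns the columns to plot. A sketch of a query that could feed a bar chart of alerts per battery, based on the digital twin model; the `name` property and the placement of HAS_ALERT on assets are assumptions.

```cypher
// Candidate query for a bar chart report: number of alerts per battery.
// The name property and the HAS_ALERT placement are assumptions.
MATCH (b:Battery)<-[:LOCATED_AT]-(asset)-[:HAS_ALERT]->(a:Alert)
RETURN b.name AS battery, count(a) AS alerts
ORDER BY alerts DESC
```

The first returned column would serve as the category and the second as the value.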
Example Use Cases
Explainable AI / Graph Data Science
Data Analysis
Monitoring / Root-Cause Analysis
Data Validation