Representing Seismic Metadata with Relational Knowledge Graphs

w1davis 10 views 32 slides Sep 01, 2024

Representing Seismic Metadata with Relational Knowledge Graphs
William Davis* & Cassandra Hunt†

* Institute of Geophysics and Planetary Physics, Scripps Institution of Oceanography, University of California, San Diego
† RelationalAI, USA
9th November 2022
Contacts:
[email protected]
[email protected]

Motivation
Difficulties working with seismic data
Seismologists, has this ever happened to you?
•Expectation: Eager to investigate scientific questions using
seismic data
•Reality: End up spending most of your time data wrangling
•Dealing with lots of different file formats, catalogues, data
sources
•Data providers offer only coarse-grained queries: download more than
you need, then filter what’s useful on your PC
Motivation
W. Davis & C. Hunt — Representing Seismic Metadata with Relational Knowledge Graphs 1

Motivation
Difficulties working with seismic data
The problems with seismic data
Or: The mess we’ve gotten ourselves into
•Seismic data is rarely simple to work with
•So many file formats
•SAC
•SEED
•ASDF
•SeisGram (ASCII)
•StationXML
•NDK
•Lots of data wrangling before any “science” can be done
•Routine workflows might be fine; exploratory data analysis
suffers

Motivation
Difficult data and complex questions
Hypothetical example
•Half the station metadata is StationXML, half is dataless
SEED
•CMT data split over multiple catalogues
•Want to get specific data, e.g.,
•Find large magnitude earthquakes that have many small
aftershocks
•Find repeating earthquakes
•Specific combinations of instrumentation
•How can one effectively & efficiently approach these questions
with diverse data sources?

Motivation
Universal seismic data standards?
This motivates a modern, interoperable, flexible approach

Proposed Solution
Approach by RelationalAI
A modern approach by RelationalAI
•Accept the data diversity
•Tools developed for the modern data stack
•RKGMS: Relational Knowledge Graph Management System
•Extract data and integrate into a common representation
•Define rules from domain logic; automatically check for
correctness
•Write complex queries with ease
•Get to the data analysis (or “real science”) as quickly and
painlessly as possible
•Implemented with Rel: an expressive, composable, declarative
database language

Proposed Solution
My internship at RelationalAI
What I attempted during my internship
•Build two relational knowledge graphs for two kinds of seismic
data
•Station metadata (StationXML)
•Earthquake source data (various CSV formats, from USGS,
NCEDC, etc.)
•Applying domain logic & checking correctness
•Querying the graphs to get desired data

Modelling Seismic Knowledge
Knowledge graphs
Knowledge graphs
•A model for representing structure and relationships
•Objects are nodes, relations are connections (e.g., BHZ)
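
As a rough illustration of "objects are nodes, relations are connections", the sketch below stores a tiny graph as (subject, relation, object) triples in Python. It is not the Rel model from the talk: the network code XX, the station code STA1 and the relation names are hypothetical, and only the channel code BHZ comes from the slide.

```python
from collections import defaultdict

# A knowledge graph as a set of (subject, relation, object) triples.
# "XX", "STA1" and the relation names are hypothetical placeholders.
triples = {
    ("network:XX", "manages", "station:STA1"),
    ("station:STA1", "has_channel", "channel:BHZ"),
    ("station:STA1", "has_channel", "channel:BHN"),
}


def neighbours(graph, node):
    """Return {relation: sorted objects} for every edge leaving `node`."""
    out = defaultdict(list)
    for subj, rel, obj in graph:
        if subj == node:
            out[rel].append(obj)
    return {rel: sorted(objs) for rel, objs in out.items()}


station_edges = neighbours(triples, "station:STA1")
print(station_edges)
```

Keeping every fact as one triple means a new kind of relation can be added without changing the structure that holds the existing ones.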

Modelling Seismic Knowledge
Integrity constraints
What do I mean by “domain logic” or
constraints?
Rules constraining allowed/intended structure and relationships.
From prior knowledge, or specifications (e.g., FDSN’s StationXML
documentation)
•Networks manage stations
•Every station is managed by a network
•Stations have ID codes
•Within a network, stations have unique ID codes***
Implemented in Rel with integrity constraints.
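
The rules listed above can be sketched as plain checks that return the offending records. This is a Python stand-in, not Rel integrity-constraint syntax; the station identifiers, codes and network names are made up for illustration.

```python
from collections import Counter

# Hypothetical data: station entity -> ID code, and station entity -> network.
station_code = {"s1": "STA1", "s2": "STA2", "s3": "STA1", "s4": "STA1", "s5": None}
managed_by = {"s1": "XX", "s2": "XX", "s3": "YY", "s4": "XX"}  # s5 has no network


def unmanaged_stations(codes, networks):
    """Violations of: every station is managed by a network."""
    return sorted(s for s in codes if s not in networks)


def stations_without_code(codes):
    """Violations of: stations have ID codes."""
    return sorted(s for s, code in codes.items() if not code)


def duplicate_codes(codes, networks):
    """Violations of: within a network, stations have unique ID codes."""
    counts = Counter(
        (networks[s], codes[s]) for s in networks if codes.get(s)
    )
    return sorted(pair for pair, n in counts.items() if n > 1)


violations = {
    "unmanaged": unmanaged_stations(station_code, managed_by),
    "no_code": stations_without_code(station_code),
    "duplicate_code": duplicate_codes(station_code, managed_by),
}
print(violations)
```

Note that the code STA1 appearing in both XX and YY is allowed; only the repeat inside XX is reported.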

Object-Role Modelling
Networks, stations, etc.
Knowledge graphs and logical constraints can be conveyed with
Object-Role Model (ORM) diagrams (Halpin, 2015), with:
•Entities (things that exist) and relations
•Values (attributes of entities)
•Constraints (mandatory/unique/external)

Object-Role Modelling
Catalogues and events

Application
Loading station and event data
Applying this to real data
•Knowledge graphs:
•Station metadata: StationXML from IRIS
•Earthquake source data: USGS (CSV), NCEDC (CSV), &
GCMT (NDK)
•Extract and transform data with Rel
•Station metadata: seis
•Source data: source
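
To give a feel for the extract-and-transform step, here is a minimal Python sketch that maps two differently laid-out CSV catalogues onto one common event record. The talk did this in Rel; the column names and sample rows below are assumptions made for illustration and may not match the real USGS or NCEDC exports.

```python
import csv
import io

# Made-up sample rows; the header names are assumed, not authoritative.
USGS_LIKE = (
    "time,latitude,longitude,depth,mag\n"
    "2020-01-01T00:00:00Z,35.0,-120.0,8.5,4.2\n"
)
NCEDC_LIKE = (
    "DateTime,Latitude,Longitude,Depth,Magnitude\n"
    "2020/01/01 00:00:05,35.1,-120.1,7.0,1.9\n"
)

# Per-catalogue mapping from source column to common field name.
COLUMN_MAPS = {
    "usgs": {"time": "time", "latitude": "lat", "longitude": "lon",
             "depth": "depth_km", "mag": "magnitude"},
    "ncedc": {"DateTime": "time", "Latitude": "lat", "Longitude": "lon",
              "Depth": "depth_km", "Magnitude": "magnitude"},
}


def load_events(text, catalogue):
    """Parse CSV text and rename its columns to the common field names."""
    mapping = COLUMN_MAPS[catalogue]
    events = []
    for row in csv.DictReader(io.StringIO(text)):
        event = {"catalogue": catalogue}
        for src, dst in mapping.items():
            # Times are kept as raw strings here; everything else is numeric.
            event[dst] = row[src] if dst == "time" else float(row[src])
        events.append(event)
    return events


events = load_events(USGS_LIKE, "usgs") + load_events(NCEDC_LIKE, "ncedc")
print(events)
```

Supporting another catalogue then only needs a new entry in the mapping table, not new parsing code.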

Application
Station and event data
Visualizing loaded data

Application
Integrity constraints
Validated the expected structure of the data with Rel’s integrity
constraints.
•Networks manage stations
•Every station is managed by a network
•Stations have ID codes
•Within a network, stations have unique ID codes*** (more
complex)

Application
Composable queries
Build a toolbox of composable queries
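
One way to picture a toolbox of composable queries is a set of small predicates that combine into larger questions, such as finding large earthquakes followed by several small ones. The Python sketch below is only an analogy for the Rel queries in the talk: the event records, thresholds and function names are hypothetical, and it ignores location entirely.

```python
# Hypothetical catalogue: origin time in seconds and magnitude per event.
events = [
    {"id": "e1", "time_s": 0.0, "magnitude": 6.5},
    {"id": "e2", "time_s": 100.0, "magnitude": 2.0},
    {"id": "e3", "time_s": 200.0, "magnitude": 1.5},
    {"id": "e4", "time_s": 10_000.0, "magnitude": 6.1},
]


def large(event, min_mag=6.0):
    return event["magnitude"] >= min_mag


def small(event, max_mag=3.0):
    return event["magnitude"] <= max_mag


def follows_within(later, earlier, window_s=3600.0):
    """True if `later` occurs after `earlier`, within `window_s` seconds."""
    dt = later["time_s"] - earlier["time_s"]
    return 0.0 < dt <= window_s


def aftershocks(mainshock, catalogue):
    """Small events shortly after `mainshock` (time only, no distance test)."""
    return [e for e in catalogue if small(e) and follows_within(e, mainshock)]


def mainshocks_with_aftershocks(catalogue, min_count=2):
    """Compose the pieces: large events with at least `min_count` aftershocks."""
    return [
        m["id"]
        for m in catalogue
        if large(m) and len(aftershocks(m, catalogue)) >= min_count
    ]


result = mainshocks_with_aftershocks(events)
print(result)
```

Each helper can be reused on its own, so a new question is usually a new combination rather than a new query written from scratch.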

Application
Querying events

Application
Querying links between stations and events

Summary
Key points
•Able to integrate multiple sources of data into one flexible
framework
•Check for correctness with integrity constraints
•Compose complex queries with ease
•(Anecdotally) good performance and scaling
•Tested on ∼500k events
•Quantitative benchmarking coming soon
•Presenting this work at AGU
•Session IN16A: Knowledge Graph, Machine Learning, and
Artificial Intelligence in Geosciences (Monday)