OpenMetadata Spotlight - OpenMetadata @ Loggi by Erica Bertan

openmetadatacollate 99 views 21 slides Sep 19, 2024

About This Presentation

The OpenMetadata Community Meeting was held on September 18th, 2024. In the Community Spotlight, Erica Bertan from Loggi (https://www.loggi.com.br/en/), a cloud-native logistics company, explains Loggi's data governance best practices. Discover how their OpenMetadata adoption led to signifi...


Slide Content

September 2024 / Data Management
Erica Bertan
How Open Metadata helps us with Data Governance at Loggi

8+ years of experience in data and software engineering teams

Software Engineering: microservices, unit tests, Spark, Docker

Data: analytics, visualization tools, data modelling, data quality

Strategy, communication and leadership


Erica Bertan
Analytics Engineering Manager

about Loggi

Brazilian logistics company

About Loggi
● 10+ years of experience delivering packages
● 300,000 packages/day
● Hubs and distribution centers in the 27 federal states of the country
● US$ 1 billion in investments over 7 years (SoftBank, Microsoft, GGV Capital, Monashees, Kaszek and others)
● 2019: a Brazilian unicorn

Challenges
● Continental-sized country, several ways of delivering packages
● Complex logistics chain
● More than 2 thousand workers spread across the country
● Every federal state has its particularities

Goals
● how we are doing Data Governance at Loggi at the moment
● what works for us
● how Open Metadata fits in the Data Governance marathon

Not goals
● deep dive into tools
● technical aspects of our infrastructure
● the pros and cons of the stack

the problems

Problem 1: Communication and definition of responsibilities

"Who can I ask about the business context of this model?"

Problem 2: Data organization
"Where can I find the correct data in order to produce insights?"

"How can I even start?"

Problem 3: Data reporting inconsistencies
"Which metric is the correct one?"

● 18,000 dashboards and Looks
● 50 Looker models

Problem 4: Complex structures
"Which table should I use?"

● package_events versus package_register

data: our big numbers

● ~100 ETL jobs
● The midnight job processes almost 500 tables in 8 hours
● 776 Looker users
● ~1.8k Looker dashboards
● 2.5 hours/day average daily usage
● Storage: 42 TB DL, 100 TB DW
● 200 GB of new data daily
● 9 million new package-tracking records/day
● 9.4k tables

how Open Metadata fits?

Definition of Ownership

"Who can I ask about the business context of this model?"

how Open Metadata fits?
Data Lineage

"What's the impact of this model's new release?"

how Open Metadata fits?
Deletion of outdated dashboards

"This dashboard is not used anymore" - from 18 thousand to 1.5 thousand
The process occurred in the following sequence:
1. We listed all the dashboards.
2. We informed the company.
3. People marked what they used.
4. We deleted everything that wasn't marked.
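
The four steps above amount to a set difference between the full dashboard inventory and the dashboards people marked as in use. A minimal sketch follows, assuming the inventory and the marked set are plain lists of dashboard names. The function name and the sample names are hypothetical, and the slides do not say how Loggi actually implemented this step.

```python
def dashboards_to_delete(all_dashboards, marked_in_use):
    """Return the dashboards nobody marked as in use, in listing order."""
    marked = set(marked_in_use)
    return [name for name in all_dashboards if name not in marked]


# Hypothetical inventory (step 1) and the names people marked (step 3).
all_dashboards = ["daily_deliveries", "hub_capacity", "old_kpi_test", "tmp_copy_2021"]
marked_in_use = {"daily_deliveries", "hub_capacity"}

# Step 4: everything that was not marked becomes a deletion candidate.
to_delete = dashboards_to_delete(all_dashboards, marked_in_use)
print(to_delete)

# Relative reduction reported on the slide: from 18 thousand to 1.5 thousand.
reduction = 1 - 1_500 / 18_000
print(f"{reduction:.1%}")  # 91.7%
```

Keeping the candidate list separate from the deletion itself leaves room for a review step before anything is removed.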

how Open Metadata fits?
Catalog

"What's the meaning of this table/column?"

how Open Metadata fits?
Data Quality

"Can we add robustness to these models?"

how Open Metadata fits?
Alerts

Proactiveness and observability building trust

timeline
Jun-Dec 2023: Cleaning the house
✅ Ownerships
✅ model refactoring: midnight job 30% faster
✅ deletion of unused/outdated dashboards: from 18 thousand to 1.5 thousand

Jan-Jun 2024: Gold metrics and data quality
✅ building trust: developing 17+ data quality test cases on top of important models
✅ catalog: documenting 250+ of our data sources

Jun-Dec 2024: Governance
✅ Deletion of unused/outdated tables: recovery of US$ 2,000/month
[in progress] Organization of models and permissions on Looker
[in progress] Organization: ownerships, metadata, data quality

Jan-Jun 2025
work in progress

Thank you!
Obrigada!
[email protected]
loggi.com