Choosing between Fabric, Synapse and Databricks (Data Left Unattended 2023)

Cathrine Wilhelmsen 407 views 52 slides Apr 01, 2024

About This Presentation

Choosing between Microsoft Fabric, Azure Synapse Analytics and Azure Databricks (Presented at Data Left Unattended on December 7th, 2023)


Slide Content

Hook 1
You are a
data engineer

Hook 2
You are starting
a new project

Hook 3
Which tool
do you use?

Hook 4
Azure Databricks?
Azure Synapse Analytics?
Microsoft Fabric?

Hook 5
It Depends ™

Hook 6
What does it
depend on?

Hook 7
Let's take
a look!

Choosing between
Azure Databricks,
Azure Synapse Analytics
and Microsoft Fabric
Marthe Moengen, Emilie Rønning, Cathrine Wilhelmsen

Cathrine Wilhelmsen (Evidi): [email protected], cathrinew.net, @cathrinew
Marthe Moengen (Sopra Steria): [email protected], data-ascend.com, @mmoengen
Emilie Rønning (Quarks): [email protected], linkedin.com/in/emilie-ronning, @emilieronning
Solutions Architect, Managing Data Analyst, Senior Data Engineer
The GOAT in Norway, Beyoncé in Tech, SnæckQueen

Azure
Databricks

What is Azure Databricks?
“Azure Databricks is a unified, open analytics platform for building, deploying, sharing, and maintaining enterprise-grade data, analytics, and AI solutions at scale”
•Data Sharing
•Data Warehousing
•Data Engineering
•Artificial Intelligence
•Data Science
•Marketplace
•Real-time Streaming
•Data Governance

What is Azure Databricks?
Data Engineering
“Easily ingest and transform batch and streaming data on the Databricks Lakehouse Platform.”
Use the lakehouse platform to store your structured, semi-structured and unstructured data.
Write your data transformations as code in notebooks, in four pre-defined languages (Python, SQL, Scala and R).
Orchestrate your workflows using low-code/no-code workflows.

What is Azure Databricks?
Data Warehousing
“Databricks SQL is a SQL editor and dashboarding tools,
allowing team members to collaborate with other Databricks
users directly in the workspace.”
Ad-hoc Querying
Dashboards
Scheduled Querying
Alerts based on Query output

Azure
Synapse
Analytics

What is Azure Synapse Analytics?
Unified analytics platform:
•Data Integration
•Data Lake
•Data Warehousing
•Big Data Analytics
•Time-Series Analytics
•Data Science

What is Azure Synapse Analytics?
Three engines:
•SQL Pools (Dedicated, Serverless)
•Spark
•Data Explorer

What is Azure Synapse Analytics?
Pick and choose which
components to use:
•Enterprise-scale
with a big budget?
•Single / simple solution
with a limited budget?

Dedicated SQL (Big budget)
Massively Parallel
Processing (MPP)
Data Warehousing:
•Distribution
•Partitioning
•Indexing
•Workloads
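
As a rough illustration of the Distribution bullet: an MPP warehouse spreads the rows of a table across a fixed number of distributions by hashing a chosen column, so each compute node works on its own slice. The sketch below is plain Python, not Synapse code. The count of 60 matches the number of distributions a dedicated SQL pool uses; the hash function, the `customer_id` column and the sample rows are assumptions made up for the example.

```python
import hashlib
from collections import defaultdict

DISTRIBUTIONS = 60  # a dedicated SQL pool splits each table into 60 distributions


def distribution_for(key: str, distributions: int = DISTRIBUTIONS) -> int:
    """Map a distribution-column value to a bucket (illustrative hash, not Synapse's)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % distributions


def distribute(rows, column):
    """Group rows by the bucket their distribution-column value hashes to."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[distribution_for(str(row[column]))].append(row)
    return buckets


# Hypothetical sample data: 1000 sales rows spread over 100 customers.
rows = [{"customer_id": f"C{i % 100:03d}", "amount": i} for i in range(1000)]
buckets = distribute(rows, "customer_id")
```

All rows for the same customer land in the same bucket, which is why the choice of distribution column matters: joins and aggregations on that column can run without moving data between nodes.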

Serverless SQL (Limited budget)
Query files using SQL and
create virtual databases:
•CSV Files
•Parquet Files
•JSON Files
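
To make "query files using SQL" concrete, here is a minimal sketch of a helper that builds the kind of `OPENROWSET` query a serverless SQL pool runs against files in a data lake. The helper name `openrowset_query` and the storage URL are hypothetical, and the sketch only covers Parquet and CSV; JSON files are left out.

```python
def openrowset_query(path: str, file_format: str, top: int = 10) -> str:
    """Build a T-SQL query that reads files in place with OPENROWSET."""
    fmt = file_format.upper()
    if fmt == "PARQUET":
        options = "FORMAT = 'PARQUET'"
    elif fmt == "CSV":
        # Parser version 2.0 with a header row is a common choice for CSV files.
        options = "FORMAT = 'CSV', PARSER_VERSION = '2.0', HEADER_ROW = TRUE"
    else:
        raise ValueError(f"unsupported format in this sketch: {file_format}")
    safe_path = path.replace("'", "''")  # escape single quotes inside the SQL literal
    return (
        f"SELECT TOP {int(top)} *\n"
        f"FROM OPENROWSET(BULK '{safe_path}', {options}) AS [result]"
    )


# Hypothetical storage account, container and folder.
query = openrowset_query(
    "https://examplelake.dfs.core.windows.net/raw/sales/*.parquet", "parquet"
)
print(query)
```

The query text can then be run from any client connected to the serverless SQL endpoint; nothing is loaded into a database first, which is what keeps this option cheap for small workloads.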

Synapse Overview

Microsoft
Fabric

What is Microsoft Fabric?
[Diagram, AS IS before Fabric: Landing Zone, Refined Zone and Consumption Zone, each on its own Storage; labels: ingestion, transformations, orchestration, compute, VM, AAS]

What is Microsoft Fabric?
[Diagram, AS IS before Fabric: Landing Zone, Refined Zone and Consumption Zone with three separate Storage boxes, now marked as OneLake; labels: ingestion, transformations, orchestration, compute, VM, AAS]

What is Microsoft Fabric?
Software as a Service (SaaS)
All-in-one analytics solution
[Diagram: OneLake storage beneath the workloads Data Engineering, Data Factory, Data Warehousing, Real-Time Analytics, Data Science, Business Intelligence and Data Activator; AI-Assisted; One Security; Governance and Administration]
•Data movement
•Data storage
•Data science
•Real-Time Analytics
•Business intelligence
•Data Governance
•Data Security
•Data Warehouse
•Data Lakehouse
•Semantic model

Let’s compare!

Architectures
Where do the services shine?

Data Warehouse
“Centralized repository of structured
and often historical data, enabling
businesses to analyze and make
informed decisions.”

Data Warehouse
Data Warehouse

Data Warehouse

Data Lakehouse
“Combines the benefits of data lakes
and data warehouses, providing a
unified platform for storing, processing,
and analyzing data at scale.”

Data Lakehouse
Bronze Layer Silver Layer Gold Layer
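
The three layers can be sketched in a few lines of plain Python. This assumes the usual reading of the layer names (bronze holds raw data as it arrived, silver holds cleaned and deduplicated data, gold holds business-level aggregates); the sample orders and the function names `to_silver` and `to_gold` are made up for the example.

```python
from collections import defaultdict

# Bronze: raw records exactly as they arrived, including bad and duplicate rows.
bronze = [
    {"order_id": "1", "country": "NO", "amount": "100.0"},
    {"order_id": "2", "country": "no", "amount": "50.5"},
    {"order_id": "2", "country": "no", "amount": "50.5"},          # duplicate
    {"order_id": "3", "country": "SE", "amount": "not-a-number"},  # invalid amount
]


def to_silver(records):
    """Clean the raw data: cast types, normalize values, drop bad rows and duplicates."""
    seen, cleaned = set(), []
    for record in records:
        try:
            amount = float(record["amount"])
        except ValueError:
            continue  # skip rows whose amount cannot be parsed
        if record["order_id"] in seen:
            continue  # skip rows already seen
        seen.add(record["order_id"])
        cleaned.append({
            "order_id": int(record["order_id"]),
            "country": record["country"].upper(),
            "amount": amount,
        })
    return cleaned


def to_gold(records):
    """Aggregate the cleaned data into a reporting-ready shape: revenue per country."""
    totals = defaultdict(float)
    for record in records:
        totals[record["country"]] += record["amount"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)
```

In a real lakehouse each layer would be a set of tables in the lake rather than an in-memory list, but the flow from raw to cleaned to aggregated is the same.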

Data Lakehouse

Data Mesh
“Architectural pattern for
implementing enterprise data
platforms in large and complex
organizations.”

Data Mesh
[Diagram: Self-Serve Data Platform and Federated Governance spanning Domain A, Domain B, Domain C and Domain E, with Product 1 to Product 7 spread across the domains]

Data Mesh

Decision factors
How do you choose?

What kind of architecture?
Data Warehouse
Data Lakehouse
Data Mesh

Who are the users?
Business Users
Power Users
IT Professionals

What are their existing skills?
No-code / Low-code
SQL
Python

Pricing
How much will it cost?

Azure Databricks
Pay-as-you-go:
Engineering Compute* (Standard/Premium)
•All-Purpose
•Job
•Job Light
Warehousing Compute* (Premium)
•SQL
•SQL Pro
•Serverless
❗ In addition to VMs, Azure Databricks also bills for managed disks, blob storage and public IP addresses.
*1-year and 3-year pre-purchase plans available!

Azure Synapse Analytics
Pay-as-you-go:
•Data Integration
•Spark Pools
•Data Explorer Pools
•Dedicated SQL Pools
•Serverless SQL Pools
Pre-purchase plans available!

Microsoft Fabric
Azure SKUs
•Pay-per-second
•Scale / Pause / Resume
Microsoft 365 SKUs
•Pay monthly or yearly
(monthly commitment)
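
The effect of pay-per-second billing with pause and resume is easy to see with a little arithmetic. The sketch below assumes a capacity is only billed while it is running; the hourly price and the running intervals are hypothetical numbers chosen for the example, not actual Fabric prices.

```python
from datetime import datetime

# Hypothetical price, not an actual Fabric rate.
HYPOTHETICAL_PRICE_PER_HOUR = 2.0


def capacity_cost(running_intervals, price_per_hour):
    """Sum the seconds the capacity was running and bill them at a per-second rate."""
    seconds = sum((end - start).total_seconds() for start, end in running_intervals)
    return seconds * price_per_hour / 3600


# The capacity runs in the morning, is paused over lunch, then resumed.
intervals = [
    (datetime(2024, 1, 8, 8, 0), datetime(2024, 1, 8, 12, 0)),    # 4 hours
    (datetime(2024, 1, 8, 13, 0), datetime(2024, 1, 8, 16, 30)),  # 3.5 hours
]

cost = capacity_cost(intervals, HYPOTHETICAL_PRICE_PER_HOUR)
always_on = 24 * HYPOTHETICAL_PRICE_PER_HOUR
print(f"paused when idle: {cost:.2f}, running all day: {always_on:.2f}")
```

With these made-up numbers, 7.5 hours of running time costs 15.00 against 48.00 for leaving the capacity on all day, which is the trade-off to weigh against a fixed monthly or yearly commitment.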

Which one do you choose?
Data Intelligence Platform
Analytics Platform
From a Technical Perspective
All-in-one Service
From a Business Perspective

Which one do you choose?

Want to learn more about Fabric?
www.fabricfebruary.com
Join us February 8th, 2024 • Oslo, Norway
Fabric February 2024

Thank you!
Marthe Moengen, Emilie Rønning, Cathrine Wilhelmsen