snowflake_databricks: how to integrate Snowflake with Databricks

aianand98 6 views 8 slides May 14, 2025

About This Presentation

Snowflake and Databricks integration


Slide Content

Data Access Speed, Scalability
Data access speed: Snowflake uses automatic parallel execution to speed up data loading; for example, this involves breaking up a single 10 GB file into 100 files of 100 MB each, which can then load in parallel. Databricks provides fast performance when working with large datasets and tables because it uses the Spark architecture underneath to parallelize and partition the data.
Scalability: Snowflake automatically adds or resumes additional clusters (up to the maximum number defined by the user) as soon as the workload increases. Databricks clusters spin up and scale to process massive amounts of data when needed, and spin down when not in use.
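The file-splitting example above (one 10 GB file into 100 pieces of 100 MB) can be sketched in Python. This is only an illustration of the chunking idea, not Snowflake's own loader; the function names `expected_chunk_count` and `chunk_lines` are hypothetical, and the sizes use decimal units (1 GB = 10^9 bytes).

```python
from typing import Iterable, Iterator, List


def expected_chunk_count(total_bytes: int, chunk_bytes: int) -> int:
    """Number of chunks needed to cover total_bytes (ceiling division)."""
    return -(-total_bytes // chunk_bytes)


def chunk_lines(lines: Iterable[bytes], max_chunk_bytes: int) -> Iterator[List[bytes]]:
    """Group whole lines into chunks of at most max_chunk_bytes.

    Lines are never split, so each chunk is a valid file on its own.
    A single line larger than the limit is emitted as its own chunk.
    """
    chunk: List[bytes] = []
    size = 0
    for line in lines:
        # Close the current chunk if adding this line would exceed the limit.
        if chunk and size + len(line) > max_chunk_bytes:
            yield chunk
            chunk, size = [], 0
        chunk.append(line)
        size += len(line)
    if chunk:
        yield chunk


if __name__ == "__main__":
    # 10 GB split into 100 MB pieces, as in the slide's example.
    print(expected_chunk_count(10 * 10**9, 100 * 10**6))
```

Each chunk could then be written to its own file and staged for loading, so that several pieces are processed at the same time.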

As Close to a Live System - Multiple Syncs

Databricks Live Streaming

Connection with 100+ SQL Databases across different servers

Ease of Integration with the ETL Process
Snowflake supports transformation both during loading (ETL) and after loading (ELT). Snowflake works with a wide range of data integration tools, including Informatica, Talend, Alteryx and others. Databricks has been built to support ETL and is equipped with efficient tools to handle the data.
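One common way to move data between the two systems in an ETL or ELT flow is the Snowflake connector for Spark, used from a Databricks notebook. The sketch below is a minimal illustration, assuming the connector is available on the cluster under the short format name "snowflake" (outside Databricks the full name "net.snowflake.spark.snowflake" is used). The helper names and all connection values are hypothetical placeholders, and credentials would normally come from a secret store rather than plain strings.

```python
from typing import Dict


def snowflake_options(url: str, user: str, password: str,
                      database: str, schema: str, warehouse: str) -> Dict[str, str]:
    """Build the connection options expected by the Snowflake Spark connector."""
    return {
        "sfUrl": url,
        "sfUser": user,
        "sfPassword": password,
        "sfDatabase": database,
        "sfSchema": schema,
        "sfWarehouse": warehouse,
    }


def read_table(spark, options: Dict[str, str], table: str):
    """Load a Snowflake table into a Spark DataFrame (extract step)."""
    return (spark.read.format("snowflake")
            .options(**options)
            .option("dbtable", table)
            .load())


def write_table(df, options: Dict[str, str], table: str, mode: str = "append") -> None:
    """Write a Spark DataFrame back to a Snowflake table (load step)."""
    (df.write.format("snowflake")
       .options(**options)
       .option("dbtable", table)
       .mode(mode)
       .save())
```

In a notebook, the flow would be `read_table(spark, opts, "SOURCE_TABLE")`, a transformation in Spark, then `write_table(result, opts, "TARGET_TABLE")`; the table names here are placeholders.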

Ease of Expandability - after the initial setup, what happens when the schema changes in the source? How hard is it to update the setup?
Snowflake: A warehouse provides the required resources, such as CPU, memory, and temporary storage, to run queries and DML operations in a Snowflake session. It can scale its compute automatically.
Databricks: A Databricks cluster is a set of computation resources and configurations on which you run data engineering and data analytics workloads. Node types and worker counts are assigned by the user to suit different sizes of data.
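To make the difference in setup concrete, here is a rough sketch that builds a Snowflake warehouse definition and a Databricks cluster specification side by side. The helper names are hypothetical. The SQL keywords are standard `CREATE WAREHOUSE` parameters (multi-cluster settings depend on the Snowflake edition), and the dictionary keys follow the Databricks Clusters API; the Spark version and node type passed in are placeholders to replace with values valid in your workspace.

```python
from typing import Any, Dict, Optional, Tuple


def snowflake_warehouse_sql(name: str, size: str = "XSMALL",
                            min_clusters: int = 1, max_clusters: int = 3,
                            auto_suspend_s: int = 60) -> str:
    """Return a CREATE WAREHOUSE statement with multi-cluster auto-scaling."""
    return (
        f"CREATE WAREHOUSE IF NOT EXISTS {name} "
        f"WAREHOUSE_SIZE = '{size}' "
        f"MIN_CLUSTER_COUNT = {min_clusters} "
        f"MAX_CLUSTER_COUNT = {max_clusters} "
        f"AUTO_SUSPEND = {auto_suspend_s} "
        f"AUTO_RESUME = TRUE"
    )


def databricks_cluster_spec(name: str, spark_version: str, node_type_id: str,
                            num_workers: Optional[int] = None,
                            autoscale: Optional[Tuple[int, int]] = None,
                            autotermination_minutes: int = 30) -> Dict[str, Any]:
    """Return a cluster spec with either a fixed worker count or an autoscale range."""
    if (num_workers is None) == (autoscale is None):
        raise ValueError("pass exactly one of num_workers or autoscale")
    spec: Dict[str, Any] = {
        "cluster_name": name,
        "spark_version": spark_version,
        "node_type_id": node_type_id,
        "autotermination_minutes": autotermination_minutes,
    }
    if num_workers is not None:
        spec["num_workers"] = num_workers  # manually sized cluster
    else:
        low, high = autoscale
        spec["autoscale"] = {"min_workers": low, "max_workers": high}
    return spec
```

The warehouse statement fixes only a size and a cluster range, while the cluster spec asks the user to pick a node type and worker count (or range) up front.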

CI/CD
A CI/CD pipeline combines code building, testing, and deployment into one continuous process, ensuring that all changes to the main branch code are releasable to production. An automated CI/CD pipeline prevents manual errors, provides standardized feedback loops to developers, and enables quick software iterations.
Snowflake: CI/CD pipelines for source code are created with Jenkins or other third-party tools.
Databricks: CI/CD pipelines are created with Azure DevOps.
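The build, test, deploy sequence described above can be illustrated with a small tool-agnostic sketch. It is not how Jenkins or Azure DevOps are configured; `run_pipeline` and the stage callables are hypothetical stand-ins for real build steps, used only to show that a failing stage stops later stages from running.

```python
from typing import Callable, List, Tuple

# A stage is a name plus a callable that returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[Tuple[str, str]]:
    """Run stages in order; after the first failure, mark the rest as skipped."""
    results: List[Tuple[str, str]] = []
    failed = False
    for name, step in stages:
        if failed:
            results.append((name, "skipped"))
            continue
        ok = step()
        results.append((name, "passed" if ok else "failed"))
        failed = not ok
    return results


if __name__ == "__main__":
    pipeline = [
        ("build", lambda: True),
        ("test", lambda: True),
        ("deploy", lambda: True),
    ]
    for name, status in run_pipeline(pipeline):
        print(f"{name}: {status}")
```

Because deployment only runs when every earlier stage has passed, a change that breaks the tests never reaches production.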