snowflake_databricks: How to integrate Snowflake with Databricks
aianand98
8 slides
May 14, 2025
About This Presentation
Snowflake and Databricks integration
Slide Content
Data Access Speed and Scalability: Snowflake uses automatic parallel execution to speed up data loading; for example, a single 10 GB file can be broken up into 100 x 100 MB files that load in parallel. Databricks provides fast performance when working with large datasets and tables because it uses the Spark architecture underneath to partition and parallelize the data. Scalability: Snowflake automatically adds or resumes additional clusters (up to the maximum number defined by the user) as soon as the workload increases. Databricks clusters spin up and scale out to process massive amounts of data when needed and spin down when not in use.
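A minimal PySpark sketch of the Databricks side of this point, assuming a hypothetical file path and running on a Databricks (or Delta-enabled) cluster: Spark reads the input as many splits and the explicit repartition spreads the work across roughly 100 parallel tasks, analogous to splitting one 10 GB file into 100 x 100 MB chunks.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("parallel-load-sketch").getOrCreate()

# Hypothetical large CSV; Spark reads it as many smaller splits in parallel.
df = spark.read.option("header", "true").csv("/mnt/raw/large_sales_file.csv")

# Repartition so downstream transforms run as ~100 parallel tasks across workers.
df = df.repartition(100)

# Persist the result as a Delta table (Delta is available on Databricks clusters).
df.write.mode("overwrite").format("delta").save("/mnt/bronze/sales")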
As Close to a Live System: multiple syncs
Databricks live streaming (see the sketch below)
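A hedged Structured Streaming sketch of the "close to live" idea; the paths and schema are assumptions, not values from the deck. Databricks can continuously pick up new files from a landing zone and append them to a Delta table as micro-batches, instead of relying on periodic full reloads.

from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("live-sync-sketch").getOrCreate()

schema = (StructType()
          .add("order_id", StringType())
          .add("amount", DoubleType())
          .add("event_time", TimestampType()))

# Continuously pick up new JSON files dropped by the source system.
stream = (spark.readStream
          .schema(schema)
          .json("/mnt/landing/orders/"))

# Append each micro-batch to a Delta table that stays near real time.
query = (stream.writeStream
         .format("delta")
         .option("checkpointLocation", "/mnt/checkpoints/orders")
         .outputMode("append")
         .start("/mnt/silver/orders"))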
Connections to 100+ SQL databases across different servers
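One common way to make such a connection from Databricks is Spark's built-in JDBC reader. This is a sketch under assumptions: the host, database, table, credentials, and partition bounds are placeholders, and the matching JDBC driver must be available on the cluster.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("jdbc-sketch").getOrCreate()

jdbc_url = "jdbc:sqlserver://sql-server-01.example.com:1433;databaseName=sales"

customers = (spark.read
             .format("jdbc")
             .option("url", jdbc_url)
             .option("dbtable", "dbo.customers")
             .option("user", "etl_user")
             .option("password", "<secret>")           # use a secret scope in practice
             .option("numPartitions", 8)               # read the table in parallel
             .option("partitionColumn", "customer_id") # numeric column to split on
             .option("lowerBound", 1)
             .option("upperBound", 1_000_000)
             .load())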
Ease of Integration with ETL Processes: Snowflake supports transformation both during loading (ETL) and after loading (ELT). Snowflake works with a wide range of data integration tools, including Informatica, Talend, Alteryx, and others. Databricks was built to support ETL and is equipped with efficient tools for handling the data.
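A minimal ELT sketch using the Snowflake Python connector; the account, stage, and table names are assumptions. It illustrates the "transform after loading" pattern the slide mentions: raw files are copied into Snowflake first (a load that Snowflake parallelizes), and the transformation then runs inside the warehouse.

import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account", user="etl_user", password="<secret>",
    warehouse="LOAD_WH", database="ANALYTICS", schema="RAW",
)
cur = conn.cursor()

# Extract/Load: copy staged files into a raw table.
cur.execute("COPY INTO raw_orders FROM @orders_stage FILE_FORMAT = (TYPE = CSV)")

# Transform: reshape the raw data after loading, inside the warehouse (ELT).
cur.execute("""
    CREATE OR REPLACE TABLE curated_orders AS
    SELECT order_id, customer_id, TO_DATE(order_ts) AS order_date, amount
    FROM raw_orders
    WHERE amount IS NOT NULL
""")
cur.close()
conn.close()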
Ease of Expandability: after the initial setup, how hard is it to update the setup when the source schema changes? Snowflake: a warehouse provides the required resources, such as CPU, memory, and temporary storage, to perform operations in a Snowflake session, and it automatically scales its nodes and performance. Databricks: a cluster is a set of computation resources and configurations on which you run data engineering and data analytics workloads, where nodes and workers are assigned manually for different sizes of data.
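A hedged sketch of how capacity is changed on each side; the warehouse name, workspace URL, access token, Spark version, and node type are placeholders, not values from the deck. Snowflake resizes a virtual warehouse with a single statement, while Databricks sizes a cluster explicitly (here via an autoscaling range) when the cluster is created.

import requests
import snowflake.connector

# Snowflake: resize the warehouse; compute scales without touching schemas or data.
sf = snowflake.connector.connect(account="my_account", user="admin_user", password="<secret>")
sf.cursor().execute("ALTER WAREHOUSE transform_wh SET WAREHOUSE_SIZE = 'LARGE'")
sf.close()

# Databricks: declare worker counts (or an autoscaling range) when creating a cluster.
resp = requests.post(
    "https://adb-1234567890.0.azuredatabricks.net/api/2.0/clusters/create",
    headers={"Authorization": "Bearer <personal-access-token>"},
    json={
        "cluster_name": "etl-cluster",
        "spark_version": "14.3.x-scala2.12",
        "node_type_id": "Standard_DS3_v2",
        "autoscale": {"min_workers": 2, "max_workers": 8},
    },
)
resp.raise_for_status()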
CI/CD: A CI/CD pipeline combines code building, testing, and deployment into one continuous process, ensuring that all changes to the main branch are releasable to production. An automated CI/CD pipeline prevents manual errors, provides standardized feedback loops to developers, and enables quick software iterations. Snowflake: uses Jenkins or other third-party tools to create CI/CD pipelines for source code. Databricks: typically uses Azure DevOps to create CI/CD pipelines.
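A minimal sketch of a deployment step that a CI/CD pipeline (Jenkins, Azure DevOps, or similar) could run after tests pass: apply versioned SQL migration scripts to Snowflake in order. The migrations/ folder layout, environment variables, and connection details are assumptions for illustration only.

import glob
import os
import snowflake.connector

conn = snowflake.connector.connect(
    account=os.environ["SNOWFLAKE_ACCOUNT"],
    user=os.environ["SNOWFLAKE_USER"],
    password=os.environ["SNOWFLAKE_PASSWORD"],
    database="ANALYTICS",
    schema="PUBLIC",
)

# Apply scripts such as migrations/V001__create_orders.sql in version order;
# execute_string runs all statements contained in each file.
for path in sorted(glob.glob("migrations/V*.sql")):
    with open(path) as f:
        conn.execute_string(f.read())
    print(f"applied {path}")

conn.close()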