In this engaging 20-minute talk, we'll explore how modern cloud architectures and advanced data catalog tools can revolutionize the way technical and business units collaborate. Participants will gain valuable insights into leveraging cutting-edge technologies to enhance data-driven decision-making and operational efficiency.
Key Topics: introduction to modern cloud architectures and their impact on data management; understanding the role of data catalogs in organizing and democratizing data assets; deep dive into dbt (Data Build Tool) and its data transformation capabilities; exploring advanced data catalog features for data governance and collaboration; practical strategies for implementing these tools in your organization.
This talk is designed for data professionals, business analysts, and decision-makers who want to stay ahead in the rapidly evolving world of data science. Attendees will leave with actionable insights on how to bridge the gap between technical expertise and business acumen, ultimately driving innovation and growth in their organization.
Size: 13.7 MB
Language: en
Added: Sep 21, 2024
Slides: 25 pages
Slide Content
Bridging the Technical-Business Divide with Modern Cloud Architectures and Data Catalogs www.scaletechplatforms.com DSC DACH 2024
Table of CONTENTS
01 About
02 Modern technologies
03 Problem definition
04 Data catalogs
05 Snowflake
06 dbt
07 Atlan
08 Summary
Presenting
Boris Perkovic, CEO & Solution Architect
Professional Expertise: designing and optimising data pipelines; leading cloud-based architecture projects to enhance data accessibility and insights.
Skills Snapshot:
Technologies: Snowflake, AWS, dbt, Airflow, Terraform
Programming: SQL, Python
Specialties: Cloud Services, Data Warehouse, ETL & Orchestration, Data Governance
ABOUT
ScaleTech Platforms brings together specialised skills in cloud architecture, data engineering, and analytics.
Cloud Services: cloud engineering and data services; focusing on optimising data infrastructure and leveraging advanced cloud technologies.
Core Expertise: data infrastructure optimisation; cloud architecture design; development of analytical applications to drive business insights.
Key Technologies: Snowflake; AWS services; dbt Labs.
Modern TECHNOLOGIES
Rapid adoption and growth
Emergence of advanced services
Focus on security and compliance
Rise of multi-cloud and hybrid cloud strategies
Shift towards serverless and edge computing
Problem DEFINITION
Tech-Business Gap: data structure != business terms; data understanding.
Data silos: limited data access; inconsistent data terminology; inconsistent reporting.
Organizational Efficiency: increased load on tech teams; long time to get insights on business teams.
Data CATALOGS
Common language and terminology: standardised definitions; cross-functional alignment; contextual information; hierarchical relationships.
Self-service data discovery: user-friendly search; metadata-rich results; data previews; recommendation.
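To make the idea of a common language plus user-friendly search more concrete, here is a minimal sketch in Python. It is not taken from the slides: the `GlossaryTerm` class, the `GLOSSARY` entries, and the column paths are all hypothetical, and a real catalog would store this as managed metadata rather than in code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GlossaryTerm:
    """One business term and the physical column it maps to (all names hypothetical)."""
    name: str
    definition: str
    physical_column: str


# Hypothetical glossary entries linking business terms to technical column names.
GLOSSARY = [
    GlossaryTerm(
        "Active customer",
        "Customer with at least one order in the last 90 days",
        "crm.cust_master.actv_flg",
    ),
    GlossaryTerm(
        "Net revenue",
        "Invoiced revenue minus refunds and discounts",
        "fin.inv_fact.net_amt",
    ),
]


def search(query: str) -> list[GlossaryTerm]:
    """Case-insensitive substring match on the term name or its definition."""
    q = query.lower()
    return [t for t in GLOSSARY if q in t.name.lower() or q in t.definition.lower()]
```

A business user could call `search("revenue")` and get back the agreed definition together with the column a data engineer would recognise, which is the kind of mapping the "data structure != business terms" gap calls for.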
Data CATALOGS
Data Context and Lineage: origin tracking; impact analysis; quality metrics; usage statistics.
Collaboration Features: commenting and discussions; tagging and annotations; rating and reviews; knowledge sharing.
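Origin tracking and impact analysis can both be read as walks over a lineage graph. The following is a small illustrative sketch, assuming lineage is available as a mapping from each asset to the assets built directly from it; the asset names and the `LINEAGE` structure are hypothetical, not the data model of any particular catalog.

```python
from collections import deque

# Hypothetical lineage edges: each asset maps to the assets built directly from it.
LINEAGE = {
    "raw.orders": ["staging.orders"],
    "raw.customers": ["staging.customers"],
    "staging.orders": ["marts.revenue"],
    "staging.customers": ["marts.revenue", "marts.churn"],
    "marts.revenue": ["dashboard.sales_overview"],
    "marts.churn": [],
    "dashboard.sales_overview": [],
}


def downstream(asset: str) -> list[str]:
    """Impact analysis: breadth-first walk over everything derived from `asset`."""
    seen, order, queue = {asset}, [], deque([asset])
    while queue:
        for child in LINEAGE.get(queue.popleft(), []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


def upstream(asset: str) -> list[str]:
    """Origin tracking: every asset that `asset` is ultimately derived from."""
    return sorted(src for src in LINEAGE if asset in downstream(src))
```

Calling `downstream("raw.customers")` lists the staging table, marts, and dashboard that a schema change could affect, while `upstream("marts.churn")` traces a report back to its sources.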
Types
Advanced: comprehensive support for diverse data ecosystems; advanced AI/ML capabilities for metadata management and data discovery; robust data lineage, impact analysis, and governance features; high level of customisation and scalability; integration with broader data management and analytics tools; suitable for large enterprises with complex data landscapes.
Intermediate: support for diverse data sources and platforms; enhanced metadata management and data discovery features; basic data lineage and governance capabilities; moderate level of customisation and scalability; suitable for medium-sized organisations or those beginning to scale their data operations.
Basic: platform native; limited to a single platform or ecosystem; basic metadata management and search capabilities; often included as part of a larger data management suite; typically easier to set up and use within the native environment.
Snowflake
Cloud-based data platform that provides a unified solution for storing, processing, and analysing large volumes of structured and semi-structured data. Its architecture separates compute and storage, allowing for independent scaling of resources, while offering features like instant elasticity, secure data sharing, and support for diverse workloads including data warehousing, data lakes, and data science.
Cloud-native architecture: public cloud hosted; multi-cloud support; seamless integration with other cloud services; automatic updates and maintenance; global availability and data residency options.
Optimised storage: adaptive caching; micro-partitions; auto clustering; time travel; zero copy cloning.
Elastic multi-cluster compute: super-sized virtual warehouses (compute); scaling up and out; search optimisation service; SQL, Python, Java.
Advanced features out of the box: data governance; advanced security options; native ML capabilities; containerisation; Jupyter notebooks support.
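The idea behind zero copy cloning over immutable micro-partitions can be illustrated with a toy model. This is a conceptual sketch only, not Snowflake's implementation or API: the `Table` class and its methods are hypothetical, and the point is simply that a clone shares references to existing partitions and only diverges when new data is written.

```python
class Table:
    """Toy table: an ordered list of references to immutable partitions."""

    def __init__(self, partitions=None):
        self.partitions = list(partitions or [])

    def insert(self, rows):
        # New data becomes a new immutable partition; existing ones are never modified.
        self.partitions.append(tuple(rows))

    def clone(self):
        # "Zero copy": only the list of references is copied, not the row data.
        return Table(self.partitions)

    def rows(self):
        return [row for part in self.partitions for row in part]


orders = Table()
orders.insert([("o1", 120), ("o2", 80)])

dev_copy = orders.clone()
dev_copy.insert([("o3", 45)])  # only the clone sees this new partition
```

After these calls the first partition is the same object in both tables, `orders` still holds two rows, and `dev_copy` holds three, which is why a clone of this kind can be created quickly and costs extra storage only for data that changes afterwards.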
dbt Labs
Software that enables data teams to transform data in their warehouses using SQL-based models. It adds a Semantic Layer for centralised metric definitions. The Semantic Layer allows teams to define, document, and version control their key business metrics alongside their data models. It brings consistency across analytics tools and promotes a unified understanding of data.
Centralised metric definitions: consistency across the organisation; reduced redundancy (DRY); simpler maintenance and updates of metric logic; improved data governance with a single source of truth for metrics.
SQL-based transformations: easy adoption because transformations are written in SQL; complex data transformations within the data warehouse; integrates easily with modern data warehouses.
Integrated testing and version control: write and run tests on your transformations; version control of SQL transformations; collaboration by tracking changes and allowing for code reviews; reliability and reproducibility.
Enhanced collaboration: shared environment for modelling and transformation; documentation features; code reuse through modular design; efficiency by standardising processes.
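The modular design mentioned above rests on models referring to each other with `{{ ref('...') }}`, from which a build order can be derived. Here is a simplified Python sketch of that idea. It is not dbt's own code: the model names and SQL in `MODELS` are hypothetical, and a regular expression stands in for the Jinja parsing that dbt actually performs.

```python
import re
from graphlib import TopologicalSorter

# Hypothetical model files: model name -> SQL containing dbt-style ref() calls.
MODELS = {
    "stg_orders": "select * from raw.orders",
    "stg_customers": "select * from raw.customers",
    "fct_revenue": (
        "select c.customer_id, sum(o.amount) as revenue "
        "from {{ ref('stg_orders') }} o "
        "join {{ ref('stg_customers') }} c using (customer_id) "
        "group by 1"
    ),
    "rpt_top_customers": (
        "select * from {{ ref('fct_revenue') }} order by revenue desc limit 10"
    ),
}

# Simplified stand-in for Jinja parsing: matches {{ ref('model_name') }}.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*'([^']+)'\s*\)\s*\}\}")


def dependencies(sql: str) -> set[str]:
    """Names of the models that this SQL refers to via ref()."""
    return set(REF_PATTERN.findall(sql))


def build_order(models: dict[str, str]) -> list[str]:
    """Order models so that every model comes after the models it depends on."""
    graph = {name: dependencies(sql) for name, sql in models.items()}
    return list(TopologicalSorter(graph).static_order())
```

`build_order(MODELS)` places both staging models before `fct_revenue` and the report last. Because each model declares its inputs by name, the same definitions can be reused, reviewed, and versioned like any other code.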
Atlan
A modern data catalog and governance platform that centralises data management. It enables intuitive data discovery and automated metadata management. It features end-to-end data lineage and collaborative tools, helping teams find, understand, and trust their data.
Intuitive data discovery and search: natural language search; AI-powered recommendations; rich metadata; visual data catalog.
Active metadata management: real-time insights; proactive alerts; automated metadata collection; metadata versioning.
End-to-end data lineage: column-level lineage; cross-tool lineage; impact analysis; lineage visualization.
Embedded collaboration: in-platform discussions; data asset sharing; knowledge management; role-based access control.
CONCLUSION
            Main Focus             Hosting             Data catalog maturity
Snowflake   Data platform          SaaS                Basic
dbt         Transformation layer   SaaS/self-hosting   Intermediate
Atlan       Catalog                SaaS                High
CONTACT www.scaletechplatforms.com [email protected] https://www.linkedin.com/company/scaletechplatforms Vienna, Austria THANK YOU! Q&A