introduction and use of apache_kylo.pptx

ChIbrarAhmed1 14 views 9 slides May 07, 2024

About This Presentation

Kylo is an open-source, enterprise-ready data lake management software platform for self-service data ingest and data preparation with integrated metadata


Slide Content

Apache Kylo
2022-06-24 15:00 (UTC+5)
Recorded by: Ibrar Ahmed
Organized by: Rida Zahid

Apache Kylo
Ibrar Ahmed, Staff Software Engineer

Agenda

Apache Kylo
Kylo is an open-source, enterprise-ready data lake management software platform for self-service data ingest and data preparation with integrated governance, security management, and best practices inspired by Think Big's 150+ big data implementation projects.
Kylo service components:

Apache Kylo
Key Features:
- Ingest: Self-service data ingest with data cleansing, validation, and automatic profiling.
- Prepare: Wrangle data with visual SQL and an interactive transform through a simple user interface.
- Discover: Search and explore data and metadata, view lineage, and profile statistics.
- Monitor: Monitor health of feeds and services in the data lake. Track SLAs and troubleshoot performance.
- Design: Design batch or streaming pipeline templates in Apache NiFi and register with Kylo to enable user self-service.
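
To make the Ingest feature concrete, here is a minimal sketch of what cleansing, validation, and profiling mean for a batch of records. It is plain Python and does not use Kylo's API. The function names, the `id` and `age` columns, and the validation rule are all hypothetical, chosen only for illustration.

```python
from collections import Counter


def cleanse(row):
    """Trim whitespace from string values and turn empty strings into None."""
    cleaned = {}
    for key, value in row.items():
        if isinstance(value, str):
            value = value.strip()
            if value == "":
                value = None
        cleaned[key] = value
    return cleaned


def is_valid(row):
    """Hypothetical rule: 'id' must be present and 'age' must be a string of digits."""
    if row.get("id") is None:
        return False
    age = row.get("age")
    return isinstance(age, str) and age.isdigit()


def profile(rows):
    """Count nulls and distinct non-null values per column."""
    stats = {}
    for row in rows:
        for key, value in row.items():
            col = stats.setdefault(key, {"nulls": 0, "values": Counter()})
            if value is None:
                col["nulls"] += 1
            else:
                col["values"][value] += 1
    return {
        key: {"nulls": col["nulls"], "distinct": len(col["values"])}
        for key, col in stats.items()
    }


def ingest(raw_rows):
    """Cleanse every row, split into valid and invalid, and profile the valid rows."""
    cleaned = [cleanse(r) for r in raw_rows]
    valid = [r for r in cleaned if is_valid(r)]
    invalid = [r for r in cleaned if not is_valid(r)]
    return valid, invalid, profile(valid)
```

Calling `ingest` on a list of dictionaries returns the rows that passed validation, the rows that failed, and per-column statistics for the valid rows. Keeping the invalid rows, rather than dropping them, lets them be inspected later.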

Apache Kylo

Challenges faced by a Platform Team
- Minimize data engineers' effort
- Provided solution must be simple and easy to use
- Shortage of experienced software engineers and administrators
- Solution must be self-service

PowerSchool Analytics Platform Architecture

Apache Kylo Demo/Training