Added: May 07, 2024
Apache Kylo 2022-06-24 15:00 (UTC+5) Recorded by Organized by Ibrar Ahmed Rida Zahid
Apache Kylo Ibrar Ahmed Staff Software Engineer
Agenda
Apache Kylo
Kylo is an open source, enterprise-ready data lake management software platform for self-service data ingest and data preparation, with integrated governance, security management, and best practices inspired by Think Big's 150+ big data implementation projects.
Kylo service components:
Apache Kylo
Key Features:
- Ingest: Self-service data ingest with data cleansing, validation, and automatic profiling.
- Prepare: Wrangle data with visual SQL and interactive transforms through a simple user interface.
- Discover: Search and explore data and metadata, view lineage, and profile statistics.
- Monitor: Monitor the health of feeds and services in the data lake. Track SLAs and troubleshoot performance.
- Design: Design batch or streaming pipeline templates in Apache NiFi and register them with Kylo to enable user self-service.
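To make the Ingest feature concrete, here is a minimal sketch of what "cleansing, validation, and automatic profiling" could look like for a small batch of records. It is plain Python, not Kylo's API: the function names (`cleanse`, `validate`, `profile`, `ingest`), the field names (`id`, `age`), and the rules are all hypothetical and chosen only for illustration.

```python
def cleanse(record):
    """Standardize a record: trim strings and turn empty strings into None."""
    return {
        key: (value.strip() or None) if isinstance(value, str) else value
        for key, value in record.items()
    }


def validate(record):
    """Return a list of rule violations; an empty list means the record is valid."""
    errors = []
    if record.get("id") is None:
        errors.append("id is required")
    age = record.get("age")
    if age is not None and not str(age).isdigit():
        errors.append("age must be a non-negative integer")
    return errors


def profile(records):
    """Compute simple per-column statistics: null count and distinct count."""
    columns = sorted({key for record in records for key in record})
    stats = {}
    for column in columns:
        values = [record.get(column) for record in records]
        non_null = [value for value in values if value is not None]
        stats[column] = {
            "nulls": len(values) - len(non_null),
            "distinct": len(set(non_null)),
        }
    return stats


def ingest(raw_records):
    """Cleanse each record, split valid from invalid, and profile the valid set."""
    valid, invalid = [], []
    for raw in raw_records:
        record = cleanse(raw)
        errors = validate(record)
        if errors:
            invalid.append((record, errors))
        else:
            valid.append(record)
    return valid, invalid, profile(valid)


rows = [
    {"id": " 1 ", "age": "30"},
    {"id": "", "age": "x"},
    {"id": "2", "age": ""},
]
valid, invalid, stats = ingest(rows)
print(len(valid), "valid,", len(invalid), "invalid")
print(stats)
```

Keeping rejected records together with their error messages, rather than dropping them, mirrors the self-service goal: a user can see why a row failed without asking an engineer.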
Apache Kylo
Challenges faced by a Platform Team
- Minimize data engineers' effort
- The solution provided must be simple and easy to use
- Shortage of experienced software engineers and administrators
- The solution must be self-service