Apache Samza

mikejf12 81 views 13 slides May 26, 2020

About This Presentation

This presentation gives an overview of the Apache Samza project. It explains Samza's stream processing capabilities as well as its architecture, users, and use cases.

Links for further information and connecting

http://www.amazon.com/Michael-Frampton/e/B00NIQDOOM/

https://nz.linkedin.com/pu...

Slide Content

What Is Apache Samza?
●An asynchronous computational framework
●For distributed, sub-second stream processing
●Provides fault tolerance, isolation and stateful processing
●Open source / Apache 2.0 license
●Developed in Java and Scala
●Runs stand-alone or on YARN

Samza Use Cases
●Applications that require millisecond-to-second response times
–Streaming analytics
–DDoS attack detection
–Fraud detection
–Metric anomaly detection
–System notifications
–Performance monitoring

Samza Users

Samza Partitioned Stream
●Samza processes data as streams
●A stream is an ordered collection of immutable messages
●Each message is a key-value pair
●Each stream is sharded into partitions
●Partitioning allows the architecture to scale
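
As a rough illustration of the partitioned stream idea, the sketch below models a stream as a list of append-only partitions and routes each immutable key-value message by a stable hash of its key. It is plain Python, not Samza code: the `Message` type, the `partition_for` and `append` helpers, and the CRC32 hashing scheme are all assumptions made for the example.

```python
import zlib
from collections import namedtuple

# Immutable key-value message (hypothetical type, not a Samza class).
Message = namedtuple("Message", ["key", "value"])


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition with a stable hash (assumed scheme)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def append(stream: list, message: Message) -> int:
    """Append a message to its partition and return the partition index."""
    index = partition_for(message.key, len(stream))
    stream[index].append(message)
    return index


# A stream sharded into three partitions, each an append-only ordered log.
stream = [[] for _ in range(3)]
for key, value in [("user-1", "login"), ("user-2", "login"), ("user-1", "logout")]:
    append(stream, Message(key, value))
```

Because the partition depends only on the key, all messages for `user-1` land in the same partition in arrival order, which is what lets separate partitions be processed in parallel.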

Samza APIs
●High Level Streams API (Java)
–Stream based processing API
●Low Level Task API (Java)
–Message based processing API
●Table API
–Random access by key data sources
●Testing Samza
–Samza's integration testing framework
●Samza SQL
–Stream processing via SQL and UDFs
●Apache Beam
–Samza provides a Beam runner for application execution
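
To make the difference between the stream-based and message-based styles concrete, here is a small conceptual sketch in plain Python. It does not use Samza's Java APIs; the function names and the event fields (`page`, `status`) are hypothetical, and the explicit loop only stands in for whatever dispatch a framework would perform.

```python
from typing import Iterable, List


def low_level_task(message: dict, output: List[dict]) -> None:
    """Message-based style: called once per incoming message."""
    if message["status"] >= 500:
        output.append({"page": message["page"], "alert": True})


def high_level_pipeline(messages: Iterable[dict]) -> List[dict]:
    """Stream-based style: describe the whole stream as chained operators."""
    errors = filter(lambda m: m["status"] >= 500, messages)
    alerts = map(lambda m: {"page": m["page"], "alert": True}, errors)
    return list(alerts)


events = [
    {"page": "/home", "status": 200},
    {"page": "/pay", "status": 503},
]

# Low-level: an explicit loop stands in for the framework's dispatch.
collected: List[dict] = []
for event in events:
    low_level_task(event, collected)

# High-level: one declarative expression over the stream.
piped = high_level_pipeline(events)
```

Both styles produce the same alerts here; the trade-off is between per-message control and a more declarative description of the pipeline.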

Samza Architecture

Samza Architecture
●Applications are broken down into tasks
●Each task consumes data from a stream partition
●Tasks are executed within containers
●A coordinator assigns tasks to containers
●Each task checkpoints the offset of the last message it processed
●Each task has its own state store for state management
●Samza replicates changes to each local store to a separate stream (a changelog)
●This allows later recovery of local stores
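
The checkpoint and state-recovery points above can be sketched as a toy simulation. This is plain Python under simplifying assumptions, not Samza's implementation: the `Task` class, its methods, and the per-message checkpoint are hypothetical, and the changelog is just an in-memory list.

```python
class Task:
    """One task bound to one stream partition (hypothetical, not Samza's API)."""

    def __init__(self, partition, changelog, checkpoint=-1):
        self.partition = partition      # ordered list of (key, value) messages
        self.changelog = changelog      # separate stream recording store writes
        self.checkpoint = checkpoint    # offset of the last processed message
        self.store = {}                 # local state: message count per key

    def restore(self):
        """Rebuild the local store by replaying the changelog in order."""
        for key, count in self.changelog:
            self.store[key] = count

    def run(self, max_messages=None):
        """Process messages after the checkpoint, replicating every write."""
        start = self.checkpoint + 1
        end = len(self.partition) if max_messages is None else start + max_messages
        for offset in range(start, min(end, len(self.partition))):
            key, _value = self.partition[offset]
            self.store[key] = self.store.get(key, 0) + 1
            self.changelog.append((key, self.store[key]))
            self.checkpoint = offset


partition = [("a", 1), ("b", 1), ("a", 1), ("a", 1)]
changelog = []

first = Task(partition, changelog)
first.run(max_messages=3)          # the container "fails" after three messages

# A replacement task restores state, then resumes from the saved offset.
second = Task(partition, changelog, checkpoint=first.checkpoint)
second.restore()
second.run()
```

The replacement task ends with the same counts it would have had without the failure, because the changelog rebuilds the store and the checkpoint tells it where to resume. A real system checkpoints far less often than once per message, so some messages may be reprocessed after a failure.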

Samza Architecture
●Task container coordination
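
One simple way a coordinator could map tasks to containers is round-robin assignment, recomputed when the set of containers changes. The sketch below is plain Python and only illustrates the idea; the `assign_tasks` function, the naming of tasks and containers, and the round-robin policy itself are assumptions, not a description of Samza's actual grouping logic.

```python
from collections import defaultdict


def assign_tasks(task_ids, container_ids):
    """Spread tasks over containers round-robin (assumed policy)."""
    assignment = defaultdict(list)
    for index, task_id in enumerate(sorted(task_ids)):
        container = container_ids[index % len(container_ids)]
        assignment[container].append(task_id)
    return dict(assignment)


tasks = [f"partition-{n}" for n in range(5)]
before = assign_tasks(tasks, ["container-0", "container-1"])
# If container-1 is lost, the coordinator recomputes with what remains.
after = assign_tasks(tasks, ["container-0"])
```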

Samza Architecture
●Fault tolerance of state

Samza Architecture
●Incremental checkpointing

Samza Architecture
●State management

Available Books
● See “Big Data Made Easy”
–Apress Jan 2015
● See “Mastering Apache Spark”
–Packt Oct 2015
● See “Complete Guide to Open Source Big Data Stack”
–Apress Jan 2018
● Find the author on Amazon
–www.amazon.com/Michael-Frampton/e/B00NIQDOOM/
● Connect on LinkedIn
–www.linkedin.com/in/mike-frampton-38563020

Connect
● Feel free to connect on LinkedIn
– www.linkedin.com/in/mike-frampton-38563020
● See my open source blog at
–open-source-systems.blogspot.com/
● I am always interested in
–New technology
–Opportunities
–Technology-based issues
–Big data integration
