Building a Replicated Logging System with Apache Kafka

GuozhangWang 3,710 views 78 slides Sep 07, 2015
Slide 1
Slide 1 of 78
Slide 1
1
Slide 2
2
Slide 3
3
Slide 4
4
Slide 5
5
Slide 6
6
Slide 7
7
Slide 8
8
Slide 9
9
Slide 10
10
Slide 11
11
Slide 12
12
Slide 13
13
Slide 14
14
Slide 15
15
Slide 16
16
Slide 17
17
Slide 18
18
Slide 19
19
Slide 20
20
Slide 21
21
Slide 22
22
Slide 23
23
Slide 24
24
Slide 25
25
Slide 26
26
Slide 27
27
Slide 28
28
Slide 29
29
Slide 30
30
Slide 31
31
Slide 32
32
Slide 33
33
Slide 34
34
Slide 35
35
Slide 36
36
Slide 37
37
Slide 38
38
Slide 39
39
Slide 40
40
Slide 41
41
Slide 42
42
Slide 43
43
Slide 44
44
Slide 45
45
Slide 46
46
Slide 47
47
Slide 48
48
Slide 49
49
Slide 50
50
Slide 51
51
Slide 52
52
Slide 53
53
Slide 54
54
Slide 55
55
Slide 56
56
Slide 57
57
Slide 58
58
Slide 59
59
Slide 60
60
Slide 61
61
Slide 62
62
Slide 63
63
Slide 64
64
Slide 65
65
Slide 66
66
Slide 67
67
Slide 68
68
Slide 69
69
Slide 70
70
Slide 71
71
Slide 72
72
Slide 73
73
Slide 74
74
Slide 75
75
Slide 76
76
Slide 77
77
Slide 78
78

About This Presentation

Apache Kafka is a scalable publish-subscribe messaging system
with its core architecture as a distributed commit log.
It was originally built as its centralized event
pipelining platform for online data integration tasks. Over
the past years developing and operating Kafka, we extend
its log-structur...


Slide Content

Building a Replicated Logging System with Apache Kafka Guozhang Wang, Joel Koshy, Sriram Subramanian, Kartik Paramasivam Mammad Zadeh, Neha Narkhede, Jun Rao, Jay Kreps, Joe Stein Body Level Two Body Level Three Body Level Four Body Level Five

We All Love Logs!

Apache Kafka A distributed messaging system ..that store messages as a log!

Example: LinkedIn back in 2010 Point-to-Point Pipelines What We Want: A Centralized Data Pipeline

Log-centric Data Flow Logical Ordering Persistent Buffering “ Source-of-Truth”

Store Messages as a Log 4 5 5 7 8 9 10 11 12 ... Producer Write Consumer1 Reads (offset 7) Consumer2 Reads (offset 10) Messages 3

Partition the Log across Machines Topic 1 Topic 2 Partitions Producers Producers Consumers Consumers Brokers

Apache Kafka Example: Kafka at LinkedIn

“Source-of-Truth” should not be lost even when..

Replicas and Layout Logs Broker-1 topic1-part1 topic1-part3 topic1-part2 Logs topic1-part2 topic1-part1 topic1-part3 Logs topic1-part3 topic1-part2 topic1-part1 Broker-2 Broker-3

Consensus for Log Replication Logs Broker-1 Logs Logs Broker-2 Broker-3 Write Consensus Protocol Consensus Protocol

Key Idea Separate membership configuration from data replication

Primary-backup Replication Logs Broker-1 * Logs Logs Broker-2 Broker-3 Write

Conventional Quorum Commits Logs Broker-1 * Logs Logs Broker-2 Broker-3 Write

Conventional Quorum Commits Logs Broker-1 * Logs Logs Broker-2 Broker-3 Write

Conventional Quorum Commits Logs Broker-1 * Logs Logs Broker-2 Broker-3

Conventional Quorum Commits Logs Broker-1 * Logs Logs Broker-2 Broker-3

Leader maintains in-sync-replicas (ISR) Failed / slow follower => drop from ISR Caught-up follower => re-join ISR Producer specifies required ACK based on ISR Configurable ISR Commits

Example: ACK with all ISRs Logs Broker-1 * Logs Logs Broker-2 Broker-3 Write (ack=“all”) ISR { 1, 2, 3 }

Example: ACK with all ISRs Logs Broker-1 * Logs Logs Broker-2 Broker-3 Write (ack=“all”) ISR { 1, 2, 3 }

Example: ACK with all ISRs Logs Broker-1 * Logs Logs Broker-2 Broker-3 Write (ack=“all”) ISR { 1, 2, 3 }

Example: ACK with all ISRs Logs Broker-1 * Logs Logs Broker-2 Broker-3 Write (ack=“all”) ISR { 1, 2, 3 }

Example: ACK with all ISRs Logs Broker-1 * Logs Logs Broker-2 Broker-3 Write (ack=“all”) ISR { 1, 2, 3 }

Example: ACK with all ISRs Logs Broker-1 * Logs Logs Broker-2 Broker-3 Write (ack=“all”) ISR { 1, 2, 3 }

Example: ACK with all ISRs Logs Broker-1 * Logs Logs Broker-2 Broker-3 Write (ack=“all”) ISR { 1, 2, 3 }

Example: ACK with Leader-only Logs Broker-1 * Logs Logs Broker-2 Broker-3 Write (ack=“leader”) ISR { 1, 2, 3 }

Example: ACK with Leader-only Logs Broker-1 * Logs Logs Broker-2 Broker-3 Write (ack=“leader”) ISR { 1, 2, 3 }

Example: ACK with Leader-only Logs Broker-1 * Logs Logs Broker-2 Broker-3 Write (ack=“leader”) ISR { 1, 2, 3 }

Example: ACK with Leader-only Logs Broker-1 * Logs Logs Broker-2 Broker-3 Write (ack=“leader”) ISR { 1, 2, 3 }

Example: ACK with Leader-only Logs Broker-1 * Logs Logs Broker-2 Broker-3 Write (ack=“leader”) ISR { 1, 2, 3 }

Example: ACK with Leader-only Logs Broker-1 * Logs Logs Broker-2 Broker-3 Write (ack=“leader”) ISR { 1, 2, 3 }

Example: Slow Follower Logs Broker-1 * Logs Logs Broker-2 Broker-3 Write (ack=“all”) ISR { 1, 2, 3 }

Example: Slow Follower Logs Broker-1 * Logs Logs Broker-2 Broker-3 Write (ack=“all”) ISR { 1, 2, 3 }

Example: Slow Follower Logs Broker-1 * Logs Logs Broker-2 Broker-3 Write (ack=“all”) ISR { 1, 2, 3 }

Example: Slow Follower Logs Broker-1 * Logs Logs Broker-2 Broker-3 Write (ack=“all”) ISR { 1, 2, 3 }

Example: Slow Follower Logs Broker-1 * Logs Logs Broker-2 Broker-3 Write (ack=“all”) ISR { 1, 2, 3 }

Example: Slow Follower Logs Broker-1 * Logs Logs Broker-2 Broker-3 Write (ack=“all”) ISR { 1, 2 }

Example: Slow Follower Logs Broker-1 * Logs Logs Broker-2 Broker-3 Write (ack=“all”) ISR { 1, 2 }

Configurable ISR Commits ACK mode Latency On Failures “no" no network delay some data loss “leader" 1 network roundtrip a few data loss “all" ~2 network roundtrips no data loss

Use an embedded controller Detect broker failure via ZooKeeper Leader failure => elect new leader from ISR Leader and ISR persisted in Zookeeper For Controller fail-over Membership Management

Example: Broker Failure Logs Broker-1 * Logs Logs Broker-2 Broker-3 ISR {1, 2}

Example: Broker Failure Logs Broker-1 Logs Logs Broker-2 Broker-3

Example: Broker Failure Logs Broker-1 Logs Logs Broker-2 Broker-3

Example: Broker Failure Logs Broker-1 Logs Logs Broker-2 Broker-3

Example: Broker Failure Logs Broker-1 Logs Logs Broker-2 Broker-3 ISR {2 }

Example: Broker Failure Logs Broker-1 Logs Logs Broker-2 Broker-3 ISR { 2 }

Example: Broker Failure Logs Broker-1 Logs Logs Broker-2 Broker-3 ISR { 2 }

Example: Broker Failure Logs Broker-1 Logs Logs Broker-2 * Broker-3 ISR {2}

Example: Broker Failure Logs Broker-1 Logs Logs Broker-2 * Broker-3 ISR {2}

Example: Broker Failure Logs Broker-1 Logs Logs Broker-2 * Broker-3 ISR {2}

Example: Broker Failure Logs Broker-1 Logs Logs Broker-2 * Broker-3 ISR {2, 3}

Overview: Logs and Kafka Log Replication in Kafka Kafka Usage at LinkedIn Conclusion Agenda

Change Log Replication

Apache Kafka Example: Kafka at LinkedIn

Example: Espresso A distributed document store Primary online data serving platform at LI Member profile, homepage, InMail, etc [SIGMOD 2013]

Old Espresso Replication Data Center-1 Storage Node Storage Node MySQL Replication MySQL MySQL Search Index Hadoop … … Databus Cross-DC Replicator Data Center-1 Storage Node Storage Node MySQL Replication MySQL MySQL Search Index Hadoop … Databus Cross-DC Replicator

Problems with MySQL Replication Master Storage Node P1 Slave Storage Node P2 P3 P4 P5 P6 P1 P2 P3 P4 P5 P6 Binary Log Shipping

Replicate Logs with Kafka Storage Node Kafka Logs P1 Storage Node P2 P3 P4 P5 P6 P1 P2 P3 P4 P5 P6 Kafka Producer Kafka Consumer Kafka Consumer Kafka Producer

Key-based Log Compaction ... Partition Messages Segment-3 Segment-4 Segment-6 *

Key-based Log Compaction d: 3 f: 8 b: 0 c: null ... Partition Messages c: 3 a: 5 a: 6 a: 5 f: 9 ... Segment-3 Segment-4 b: 2 d: 4 a: 1

Key-based Log Compaction ... d: 3 f: 8 b: 0 c: null a: 5 f: 9 ... Segment-3 Segment-4 c: 3 a: 5 a: 6 b: 2 d: 4 a: 1 c: 3 a: 5 a: 6 b: 2 d: 4 a: 1 d: 3 f: 8 b: 0 a: 5 f: 9 New Segment Partition Messages

Key-based Log Compaction ... d: 3 f: 8 b: 0 c: null a: 5 f: 9 ... Segment-3 Segment-4 c: 3 a: 5 a: 6 b: 2 d: 4 a: 1 c: 3 a: 6 d: 3 f: 8 b: 0 c: null a: 5 f: 9 New Segment Partition Messages

Key-based Log Compaction ... d: 3 f: 8 b: 0 c: null a: 5 f: 9 ... Segment-3 Segment-4 c: 3 a: 5 a: 6 b: 2 d: 4 a: 1 d: 3 b: 0 a: 5 f: 9 New Segment Partition Messages

Key-based Log Compaction ... d: 3 f: 8 b: 0 c: null a: 5 f: 9 ... Segment-3 Segment-4 c: 3 a: 5 a: 6 b: 2 d: 4 a: 1 d: 3 b: 0 a: 5 f: 9 New Segment Partition Messages

New Espresso Replication Data Center-1 Storage Node Storage Node Storage Node Kafka Logs MySQL MySQL MySQL Data Center-n Storage Node Storage Node Storage Node Kafka Logs MySQL MySQL MySQL Kafka MirrorMaker Search Index Hadoop … … Search Index Hadoop … * In Progress

Stream Processing

Apache Kafka Example: Kafka at LinkedIn

Data flow streaming on Kafka and YARN Stateful processing Re-processing Failure Recovery Example: Samza [CIDR 2015]

Kafka Kafka Samza State Process Protocol State Process Protocol State Process Protocol Samza Processing

Kafka Kafka Samza State Process Protocol State Process Protocol State Process Protocol Samza Processing Kafka Changelog

Kafka Kafka Samza State Process Protocol State Process Protocol State Process Protocol Samza Processing Kafka Changlog

Kafka Kafka Samza State Process Protocol State Process Protocol State Process Protocol Samza Processing Kafka Changlog State Process Protocol

Take-aways Log-centric data flow helps scaling your systems Kafka: replicated log streams for real-time platforms

We are Hiring

Take-aways Log-centric data flow helps scaling your systems Kafka: replicated log streams for real-time platforms THANKS!