Zookeeper In Simple Words

fujohnwang, Jan 11, 2011

About This Presentation

An overview of ZooKeeper, without too much detail.


Slide Content

ZOOKEEPER IN MY WORDS
fujohnwang@Pluto
2011.01.11
Wednesday, January 12, 2011

WHAT’S ZK?
“ZooKeeper is a centralized service for maintaining configuration
information, naming, providing distributed synchronization, and providing
group services. All of these kinds of services are used in some form or another
by distributed applications. Each time they are implemented there is a lot of
work that goes into fixing the bugs and race conditions that are inevitable.
Because of the difficulty of implementing these kinds of services, applications
initially usually skimp on them, which make them brittle in the presence of
change and difficult to manage. Even when done correctly, different
implementations of these services lead to management complexity when the
applications are deployed.”

WHAT THE FUCK
DOES THAT MEAN?
Simply Put: Coordination Service For Distributed Systems

WHAT CAN ZK DO?
Name Service
Configuration
Group Membership
Distributed Synchronization

WHAT DOES ZK LOOK LIKE?
File System Model
Hierarchical Namespace
each node can have both data and children
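
The file-system-like model can be pictured with a small in-memory sketch. This is a toy for illustration only, not the ZooKeeper client API: the `ZNode` class and the `create` and `get` helpers are hypothetical names invented here.

```python
class ZNode:
    """Toy znode: unlike a file or a directory, it holds data AND children."""

    def __init__(self, data=b""):
        self.data = data
        self.children = {}  # child name -> ZNode


def create(root, path, data=b""):
    """Create the node at an absolute path; the parent must already exist."""
    *parents, name = path.strip("/").split("/")
    node = root
    for part in parents:
        node = node.children[part]  # KeyError if a parent is missing
    if name in node.children:
        raise ValueError("node already exists: " + path)
    node.children[name] = ZNode(data)


def get(root, path):
    """Return the node at an absolute path."""
    node = root
    for part in path.strip("/").split("/"):
        node = node.children[part]
    return node


root = ZNode()
create(root, "/app", b"app-level data")
create(root, "/app/config", b"timeout=30")

# /app carries its own data and also has a child, which a plain
# file system would not allow for a single entry.
app = get(root, "/app")
print(app.data, sorted(app.children))
```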

HOW DOES ZK WORK?
Zab (ZooKeeper Atomic Broadcast), the protocol that keeps the replicas in sync

HOW TO DEPLOY ZK?
single server
quorum cluster
majority quorum (2n + 1 servers tolerate n failures)
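
The 2n + 1 rule is plain majority arithmetic, and a few lines make it concrete. The function names below are hypothetical helpers written for this sketch, not part of ZooKeeper.

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """Servers that can fail while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


for servers in (3, 4, 5):
    print(servers, quorum_size(servers), tolerated_failures(servers))
```

Under this arithmetic a 4-server cluster tolerates no more failures than a 3-server one, which is why odd sizes are the usual choice.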

ZK DEPLOYMENT IN DETAILS
configuration
data location
transaction log location (dataLogDir; optional in the config, but worth setting in practice)
cluster members
myid file (under data location directory)

CONFIGURATION FILE SAMPLE
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
dataDir=/Users/fujohnwang/Downloads/zookeeper-3.3.2/data
# the port at which the clients will connect
clientPort=2181
server.1=10.16.200.14:2888:3888
server.2=10.20.135.206:2888:3888
server.3=10.20.130.233:2888:3888
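
To show how the `server.N` lines relate to the cluster members and the myid file, here is a small sketch that reads them back out of the sample. `parse_servers` is a hypothetical helper written for this example. It assumes the first port is the one followers use to reach the leader and the second is the leader-election port.

```python
SAMPLE = """\
tickTime=2000
initLimit=10
syncLimit=5
clientPort=2181
server.1=10.16.200.14:2888:3888
server.2=10.20.135.206:2888:3888
server.3=10.20.130.233:2888:3888
"""


def parse_servers(text):
    """Map each server id to (host, quorum_port, election_port)."""
    servers = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        if key.startswith("server."):
            server_id = int(key.split(".", 1)[1])
            host, quorum_port, election_port = value.split(":")
            servers[server_id] = (host, int(quorum_port), int(election_port))
    return servers


servers = parse_servers(SAMPLE)
for server_id, (host, _, _) in sorted(servers.items()):
    # The myid file on that host is expected to contain just this number.
    print(server_id, host)
```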

HOW TO INTERACT WITH ZK?
Command Line Tool
Zookeeper API Bindings
Java Bindings
C Bindings

ZOOKEEPER CLI
bin/zkCli.sh -server 127.0.0.1:2181
[zk: 10.20.130.233:2181(CONNECTED) 0] help
ZooKeeper -server host:port cmd args
    printwatches on|off
    ls2 path [watch]
    listquota path
    close
    get path [watch]
    sync path
    delete path [version]
    quit
    addauth scheme auth
    getAcl path
    setAcl path acl
    setquota -n|-b val path
    redo cmdno
    create [-s] [-e] path data acl
    set path data [version]
    stat path [watch]
    connect host:port
    ls path [watch]
    history
    delquota [-n|-b] path

ZK API = SIMPLE API
create
delete
exists
set data
get data
get children
sync

WHAT ARE THE FEATURES OF ZK?
Notification
Ordering
Atomicity
Versioned Write
Ephemeral Nodes
Sequential Nodes
High Availability
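
Of the features listed, "Versioned Write" is the easiest to misread, so here is a rough sketch of the idea as a compare-and-set. It is an in-memory toy, not the ZooKeeper client: `VersionedNode` and `BadVersion` are hypothetical names. The only detail borrowed from ZooKeeper is the convention that a version of -1 skips the check.

```python
class BadVersion(Exception):
    """Raised when the caller's expected version is stale."""


class VersionedNode:
    def __init__(self, data):
        self.data = data
        self.version = 0

    def set_data(self, data, expected_version):
        # -1 means "skip the check", mirroring ZooKeeper's convention.
        if expected_version != -1 and expected_version != self.version:
            raise BadVersion(
                f"expected {expected_version}, node is at {self.version}"
            )
        self.data = data
        self.version += 1
        return self.version


node = VersionedNode("v0")
seen = node.version            # two clients both read version 0
node.set_data("from A", seen)  # client A wins; the node moves to version 1
try:
    node.set_data("from B", seen)  # client B still holds version 0
except BadVersion as err:
    print("rejected:", err)
```

The losing client would normally re-read the node and retry with the fresh version.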

WATCH - NOTIFICATION
Concept
a client sets a watch on a node and is notified when the node changes
Use case
dynamic configuration push
other (semi)real-time information notification
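
A watch fires once and then has to be set again, which matters for the configuration-push use case. The sketch below models that behaviour in memory. `WatchedNode` and `on_change` are hypothetical names, not the real client API. In ZooKeeper itself the notification only says that something changed, so the callback re-reads the data, and the sketch does the same.

```python
class WatchedNode:
    def __init__(self, data):
        self.data = data
        self._watchers = []

    def get_data(self, watcher=None):
        """Read the data and optionally leave a one-time watch."""
        if watcher is not None:
            self._watchers.append(watcher)
        return self.data

    def set_data(self, data):
        self.data = data
        # Watches fire once: take the current list and clear it before notifying.
        fired, self._watchers = self._watchers, []
        for watcher in fired:
            watcher(self)


events = []


def on_change(node):
    # Re-reading with the same callback re-registers the watch.
    events.append(node.get_data(watcher=on_change))


config = WatchedNode("timeout=30")
config.get_data(watcher=on_change)
config.set_data("timeout=60")
config.set_data("timeout=90")
print(events)
```

If `on_change` read the data without passing itself as the watcher again, only the first update would be seen.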

EPHEMERAL NODES
Concept
lifecycle tied to the session of the client that created it
cannot have children
Use case
group membership
leader election
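
Group membership follows directly from the lifecycle rule: a member is present exactly as long as its session is alive. A minimal in-memory sketch, with a hypothetical `Group` class and made-up session ids standing in for real client sessions:

```python
class Group:
    """Toy group: each member is an ephemeral entry owned by a session."""

    def __init__(self):
        self._members = {}  # member name -> owning session id

    def join(self, session_id, name):
        self._members[name] = session_id

    def session_closed(self, session_id):
        # Ephemeral entries disappear together with the session that made them.
        self._members = {
            name: owner
            for name, owner in self._members.items()
            if owner != session_id
        }

    def members(self):
        return sorted(self._members)


group = Group()
group.join(session_id=1, name="worker-a")
group.join(session_id=2, name="worker-b")
group.session_closed(1)  # worker-a crashes or disconnects
print(group.members())
```

No explicit "leave" call is needed, so a crashed member cannot linger in the list.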

SEQUENTIAL NODES
Concept
ZooKeeper appends an increasing counter to the node name
Use case
queue
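
The queue use case works because the appended counter gives the children a total order. Below is a toy model of that idea. `SequentialParent` is a hypothetical class. The zero-padded 10-digit suffix imitates how ZooKeeper names sequential nodes, and the single-process `take_first` ignores the coordination a real distributed consumer would need.

```python
class SequentialParent:
    """Toy parent node that numbers its sequential children."""

    def __init__(self):
        self._counter = 0
        self.children = {}  # child name -> data

    def create_sequential(self, prefix, data):
        # A zero-padded, increasing counter is appended to the name.
        name = f"{prefix}{self._counter:010d}"
        self._counter += 1
        self.children[name] = data
        return name

    def take_first(self):
        """Remove and return the data of the lowest-numbered child (FIFO)."""
        name = min(self.children)
        return self.children.pop(name)


queue = SequentialParent()
queue.create_sequential("item-", "first job")
queue.create_sequential("item-", "second job")
print(queue.take_first(), "|", sorted(queue.children))
```

Zero padding keeps string order and numeric order the same, so `min` picks the oldest item.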

GUARANTEES OF ZK
Sequential Consistency - Updates from a client will be applied in the order that they were sent.
Atomicity - Updates either succeed or fail. No partial results.
Single System Image - A client will see the same view of the service regardless of the server that it connects to.
Reliability - Once an update has been applied, it will persist from that time forward until a client overwrites the update.
Timeliness - The client's view of the system is guaranteed to be up-to-date within a certain time bound.

CAUTIONS TO TAKE
keep the data in each node small (less than 1 MB)
watches are one-time triggers (set them again after each event if you need more)
the whole dataset must fit in memory

nicholsong
QUESTIONS?