ZooKeeper

Dec 13, 2013

Roy Campbell

Motivation

Centralized service
Maintains configuration information, naming, distributed synchronization, and group services
Avoids synchronization problems and race conditions

ZooKeeper Properties

Wait-free
Per-client guarantee of FIFO execution of requests
Linearizability for all requests that change the ZooKeeper state
Built using ZAB, a totally ordered broadcast protocol (based on Paxos)

What is ZooKeeper?

A highly available, scalable, distributed configuration, consensus, group membership, leader election, naming, and coordination service

Why use ZooKeeper?


Difficulty of implementing these kinds of services reliably
brittle in the presence of change
difficult to manage
different implementations lead to management complexity when the applications are deployed

Visualizing Paxos

CS5412 Spring 2012 (Cloud Computing: Birman)

The proposer requests that the Paxos system accept some command. Paxos is like a “postal system”
It thinks about the letter for a while (replicating the data and picking a delivery order)
Once these are “decided” the learners can execute the command

[Figure: a proposer, a coordinator, three acceptors, and learners R1, R2, R3]

Paxos in Failure-Free Synchronous Runs

[Figure: message diagram over processes 1, 2, ..., n, showing the messages (“prepare”, (1,1)), (“ack”, (1,1), (0,0), ⊥), and (“accept”, (1,1), v1), followed by “decide v1”]

Simple Paxos implementation always trusts process 1
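
The prepare / ack / accept / decide exchange on this slide can be sketched as a small in-memory simulation. This is a minimal sketch under stated assumptions, not a full Paxos implementation: there are no failures, no competing proposers, and no network. The names `Acceptor` and `failure_free_run` are hypothetical, and reading the ballot (1,1) as (round, process id) is an assumption about the slide's notation.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Ballot = Tuple[int, int]  # assumed to mean (round, process id); compared lexicographically


@dataclass
class Acceptor:
    promised: Ballot = (0, 0)          # highest ballot this acceptor has acknowledged
    accepted_ballot: Ballot = (0, 0)   # ballot of the last value it accepted
    accepted_value: Optional[str] = None

    def on_prepare(self, ballot: Ballot):
        """Reply with an ack if the ballot is newer than anything promised so far."""
        if ballot > self.promised:
            self.promised = ballot
            return ("ack", ballot, self.accepted_ballot, self.accepted_value)
        return None

    def on_accept(self, ballot: Ballot, value: str) -> bool:
        """Accept the value unless a higher ballot has been promised."""
        if ballot >= self.promised:
            self.promised = ballot
            self.accepted_ballot = ballot
            self.accepted_value = value
            return True
        return False


def failure_free_run(n: int, proposal: str) -> Optional[str]:
    """Process 1 leads one round among n acceptors and returns the decided value."""
    acceptors: List[Acceptor] = [Acceptor() for _ in range(n)]
    ballot: Ballot = (1, 1)
    majority = n // 2 + 1

    # Phase 1: send ("prepare", ballot) and collect the acks.
    acks = [m for m in (a.on_prepare(ballot) for a in acceptors) if m is not None]
    if len(acks) < majority:
        return None

    # Adopt the value accepted under the highest earlier ballot, if there is one.
    _, _, _, previous_value = max(acks, key=lambda m: m[2])
    value = previous_value if previous_value is not None else proposal

    # Phase 2: send ("accept", ballot, value); decide once a majority accepts.
    accepted = sum(a.on_accept(ballot, value) for a in acceptors)
    return value if accepted >= majority else None
```

In a fresh run every ack carries the initial ballot (0,0) and no value, which matches the (“ack”, 1,1, 0,0, ⊥) message in the diagram, so the leader's own proposal is decided.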

What is ZooKeeper again?

File API without partial reads/writes
Simple wait-free data objects organized hierarchically as in file systems
No renames
Ordered updates and strong persistence guarantees
Conditional updates (version)
Watches for data changes
Ephemeral nodes
Generated file names
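
Two of the features listed above, conditional updates and generated file names, can be illustrated with an in-memory stand-in. This is a minimal sketch, not a ZooKeeper client: `TinyStore` and `BadVersionError` are hypothetical names. It assumes the behavior of the real service, where a version of -1 matches any version and generated names get a zero-padded 10-digit counter kept per parent node.

```python
class BadVersionError(Exception):
    """Raised when expected_version does not match the node's current version."""


class TinyStore:
    """Hypothetical in-memory stand-in for a set of znodes (not a ZooKeeper client)."""

    def __init__(self):
        self._nodes = {}     # path -> (data, version)
        self._counters = {}  # parent path -> next sequence number

    def create(self, path, data, sequential=False):
        """Create a node; with sequential=True, append a generated suffix to the name."""
        if sequential:
            parent = path.rsplit("/", 1)[0] or "/"
            n = self._counters.get(parent, 0)
            self._counters[parent] = n + 1
            path = f"{path}{n:010d}"
        if path in self._nodes:
            raise KeyError(f"node exists: {path}")
        self._nodes[path] = (data, 0)
        return path

    def get_data(self, path):
        """Return (data, version) for the node."""
        return self._nodes[path]

    def set_data(self, path, data, expected_version):
        """Replace the data only if the caller saw the current version (-1 skips the check)."""
        _, version = self._nodes[path]
        if expected_version != -1 and expected_version != version:
            raise BadVersionError(path)
        self._nodes[path] = (data, version + 1)
        return version + 1
```

A client that reads `(data, version)`, computes a new value, and writes it back with that version gets an optimistic compare-and-set: if another client wrote in between, the write fails with `BadVersionError` and the client can re-read and retry.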

Any Guarantees?

1. Clients will never detect old data: once a client has seen a newer state, it will not see an older one.
2. Clients will get notified of a change to data they are watching within a bounded period of time.
3. All requests from a client will be processed in order.
4. All results received by a client will be consistent with results received by all other clients.


Data Model

Hierarchical namespace
Each znode has data and children
Data is read and written in its entirety

[Figure: example znode tree rooted at /, with nodes named services, users, apps, locks, servers, YaView, read-1, morestupidity, and stupidname]
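
The data model above can be sketched as a tree in which every node carries both a data value and named children. This is a minimal in-memory illustration, not ZooKeeper's implementation. `ZNode` and `Namespace` are hypothetical names, and the path /services/YaView used in the usage note is an assumed placement of node names taken from the figure.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ZNode:
    data: bytes = b""                                             # read and written whole
    children: Dict[str, "ZNode"] = field(default_factory=dict)    # child name -> node


class Namespace:
    """Hypothetical in-memory hierarchical namespace rooted at "/"."""

    def __init__(self):
        self.root = ZNode()

    def _walk(self, path: str) -> ZNode:
        """Follow the path components from the root; KeyError if one is missing."""
        node = self.root
        for part in [p for p in path.split("/") if p]:
            node = node.children[part]
        return node

    def create(self, path: str, data: bytes = b"") -> None:
        """Create a node; its parent must already exist."""
        parent_path, _, name = path.rpartition("/")
        if not path.startswith("/") or not name:
            raise ValueError(f"invalid path: {path!r}")
        parent = self._walk(parent_path)
        if name in parent.children:
            raise KeyError(f"node exists: {path}")
        parent.children[name] = ZNode(data)

    def get_data(self, path: str) -> bytes:
        return self._walk(path).data

    def set_data(self, path: str, data: bytes) -> None:
        # The whole value is replaced; there is no partial write.
        self._walk(path).data = data

    def get_children(self, path: str) -> List[str]:
        return sorted(self._walk(path).children)
```

For example, after `create("/services")` and `create("/services/YaView", b"cfg")`, the node /services has its own (empty) data and one child, which is the difference from a file system where a path is either a file or a directory.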

ZooKeeper API

String create(path, data, acl, flags)
void delete(path, expectedVersion)
Stat setData(path, data, expectedVersion)
(data, Stat) getData(path, watch)
Stat exists(path, watch)
String[] getChildren(path, watch)
void sync(path)

(watch events complete with updated information.)
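
The `watch` argument on the read calls above registers a one-time trigger: the watch fires on the next change and must be set again by another read. The sketch below models only that behavior in memory. `WatchedStore` is a hypothetical name, and passing a callback as the `watch` argument is a simplification for illustration, not the signature of any real binding.

```python
class WatchedStore:
    """Hypothetical in-memory model of one-shot data watches."""

    def __init__(self):
        self._data = {}     # path -> data
        self._watches = {}  # path -> callbacks waiting for the next change

    def get_data(self, path, watch=None):
        """Read the data and optionally leave a one-shot watch on the path."""
        if watch is not None:
            self._watches.setdefault(path, []).append(watch)
        return self._data.get(path)

    def set_data(self, path, data):
        """Write the data, then fire and discard every watch set on the path."""
        self._data[path] = data
        callbacks = self._watches.pop(path, [])  # removed before firing: one-shot
        for callback in callbacks:
            callback(path)
```

Because the watch is consumed when it fires, a client that wants to follow every change typically re-reads the node with `watch` set again inside its callback.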

ZooKeeper Service


All servers store a copy of the data (in memory)



A leader is elected at startup


Followers service clients; all updates go through the leader


Update responses are sent when a majority of servers have persisted the change
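
The majority rule above determines how many servers must persist an update and how many failures an ensemble survives. A short arithmetic sketch follows; the function names are hypothetical.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Servers that can fail while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


def can_respond(persisted_acks: int, ensemble_size: int) -> bool:
    """True once enough servers have persisted the change to answer the client."""
    return persisted_acks >= quorum_size(ensemble_size)
```

With 5 servers the quorum is 3 and 2 failures are tolerated. With 6 servers the quorum is 4 and still only 2 failures are tolerated, which is why ensembles usually have an odd number of servers.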


[Figure: ZooKeeper Service, showing an ensemble of servers (one labeled Leader) and their clients]

Use cases inside of Yahoo!

» Leader Election
» Group Membership
» Work Queues
» Configuration Management
» Cluster Management
» Load Balancing
» Sharding




Use of ZooKeeper in HBase

Leader Election
  Ensure there is at most 1 active master at any time
Configuration Management
  Store the bootstrap location
Group Membership
  Discover tablet servers and finalize tablet server death
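
Group membership of this kind is commonly built on ephemeral nodes: each server registers a node tied to its session, and the node disappears when the session ends, so listing the nodes yields the live members. The sketch below models that idea in memory. `MembershipRegistry` and its method names are hypothetical, and it is not a description of how HBase itself is implemented.

```python
class MembershipRegistry:
    """Hypothetical in-memory model of group membership via ephemeral nodes."""

    def __init__(self):
        self._members = {}  # member name -> id of the session that owns the node

    def join(self, session_id: int, name: str) -> None:
        """Register a member; the entry lives only as long as the session."""
        self._members[name] = session_id

    def session_closed(self, session_id: int) -> None:
        """Drop every entry owned by the session, as happens to ephemeral nodes."""
        owned = [n for n, s in self._members.items() if s == session_id]
        for name in owned:
            del self._members[name]

    def members(self):
        """List the live members, analogous to getChildren on the group node."""
        return sorted(self._members)
```

A master that watches the member list learns of a server's death when that server's entry vanishes, without the server having to send any explicit sign-off.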


Leader Election

1. getData("/servers/leader", true)
2. if successful, follow the leader described in the data and exit
3. create("/servers/leader", hostname, EPHEMERAL)
4. if successful, lead and exit
5. goto step 1
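
The five steps above can be traced with an in-memory stand-in for the service. This is a minimal single-threaded sketch, not a ZooKeeper client: `EphemeralStore`, `NodeExistsError`, and `elect` are hypothetical names, and closing a session is modeled by an explicit call.

```python
class NodeExistsError(Exception):
    """Raised when create is called on a path that already exists."""


class EphemeralStore:
    """Hypothetical in-memory store whose nodes vanish with their owner's session."""

    def __init__(self):
        self._nodes = {}  # path -> (data, owning session id)

    def get_data(self, path):
        """Return the node's data, or None if the node does not exist."""
        entry = self._nodes.get(path)
        return None if entry is None else entry[0]

    def create_ephemeral(self, path, data, session):
        if path in self._nodes:
            raise NodeExistsError(path)
        self._nodes[path] = (data, session)

    def close_session(self, session):
        """Remove every node owned by the session."""
        self._nodes = {p: e for p, e in self._nodes.items() if e[1] != session}


def elect(store, hostname, session, path="/servers/leader"):
    """Return the hostname of the leader, which may be this caller."""
    while True:
        leader = store.get_data(path)                          # step 1
        if leader is not None:
            return leader                                      # step 2: follow
        try:
            store.create_ephemeral(path, hostname, session)    # step 3
            return hostname                                    # step 4: lead
        except NodeExistsError:
            continue                                           # step 5: retry
```

Because the leader node is ephemeral, a crashed leader's node is removed when its session ends, and the next client to run the steps takes over.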

Leader Election in Perl

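use Net::ZooKeeper qw(:node_flags :acls);

my $zkh = Net::ZooKeeper->new('localhost:7000');
my $req_path = '/app/leader';
my $stat = $zkh->stat();
my $watch = $zkh->watch();

# Try to read the current leader's address, leaving a watch on the node.
my $data = $zkh->get($req_path, 'stat' => $stat, 'watch' => $watch);
if (defined $data) {
    # someone else is the leader
    # parse the string $data that contains the leader address
} else {
    # No leader yet: try to claim leadership with an ephemeral node.
    my $path = $zkh->create($req_path, 'hostname:info',
                            'flags' => ZOO_EPHEMERAL,
                            'acl'   => ZOO_OPEN_ACL_UNSAFE);
    if (defined $path) {
        # we are the leader, continue leading
    } else {
        # Another client created the node first, so read its data.
        $data = $zkh->get($req_path, 'stat' => $stat, 'watch' => $watch);
        # someone else is the leader
        # parse the string $data that contains the leader address
    }
}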


Leader Election in Python

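import zookeeper  # zkpython bindings

# Open ACL, defined by hand as in the zkpython examples.
ZOO_OPEN_ACL_UNSAFE = {"perms": zookeeper.PERM_ALL, "scheme": "world", "id": "anyone"}

def my_connection_watcher(handle, event_type, state, path):
    pass  # handle session events here

def leader_watcher(handle, event_type, state, path):
    pass  # the leader node changed; re-run the election here

handle = zookeeper.init("localhost:2181", my_connection_watcher, 10000)

try:
    # get raises NoNodeException when there is no leader node yet
    (data, stat) = zookeeper.get(handle, "/app/leader", leader_watcher)
    # someone else is the leader
    # parse the string data that contains the leader address
except zookeeper.NoNodeException:
    try:
        path = zookeeper.create(handle, "/app/leader", "hostname:info",
                                [ZOO_OPEN_ACL_UNSAFE], zookeeper.EPHEMERAL)
        # we are the leader, continue leading
    except zookeeper.NodeExistsException:
        # another client created the node first, so read its data
        (data, stat) = zookeeper.get(handle, "/app/leader", leader_watcher)
        # someone else is the leader
        # parse the string data that contains the leader address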









Performance Numbers.

Where are we?


Multi Tenant (Quotas, connection management, chroot)
Recipes
  Reusable code libraries
Bindings
  Java, C, Perl, Python, REST



Where are we?


BookKeeper (a contrib project)
System to reliably log streams of records
Ongoing work to use BookKeeper optionally as the edits log in the NameNode (HADOOP-5189)
Using BookKeeper and ZooKeeper for a pub sub system

Q&A


Questions?








Links:

http://hadoop.apache.org/zookeeper/