Distributed Hash Table

payberah 3,060 views 77 slides Jan 26, 2015
Slide 1
Slide 1 of 77
Slide 1
1
Slide 2
2
Slide 3
3
Slide 4
4
Slide 5
5
Slide 6
6
Slide 7
7
Slide 8
8
Slide 9
9
Slide 10
10
Slide 11
11
Slide 12
12
Slide 13
13
Slide 14
14
Slide 15
15
Slide 16
16
Slide 17
17
Slide 18
18
Slide 19
19
Slide 20
20
Slide 21
21
Slide 22
22
Slide 23
23
Slide 24
24
Slide 25
25
Slide 26
26
Slide 27
27
Slide 28
28
Slide 29
29
Slide 30
30
Slide 31
31
Slide 32
32
Slide 33
33
Slide 34
34
Slide 35
35
Slide 36
36
Slide 37
37
Slide 38
38
Slide 39
39
Slide 40
40
Slide 41
41
Slide 42
42
Slide 43
43
Slide 44
44
Slide 45
45
Slide 46
46
Slide 47
47
Slide 48
48
Slide 49
49
Slide 50
50
Slide 51
51
Slide 52
52
Slide 53
53
Slide 54
54
Slide 55
55
Slide 56
56
Slide 57
57
Slide 58
58
Slide 59
59
Slide 60
60
Slide 61
61
Slide 62
62
Slide 63
63
Slide 64
64
Slide 65
65
Slide 66
66
Slide 67
67
Slide 68
68
Slide 69
69
Slide 70
70
Slide 71
71
Slide 72
72
Slide 73
73
Slide 74
74
Slide 75
75
Slide 76
76
Slide 77
77

About This Presentation

Distributed Hash Table


Slide Content

Distributed Hash Tables
Amir H. Payberah
[email protected]
Amirkabir University of Technology
(Tehran Polytechnic)
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 1 / 62

What is the Problem?
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 2 / 62

What is the Problem?
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 3 / 62

Possible Solutions (1/3)
ICentral directory
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 4 / 62

Possible Solutions (2/3)
IFlooding
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 5 / 62

Possible Solutions (3/3)
IDistributed Hash Table (DHT)
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 6 / 62

Distributed Hash Table
(DHT)
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 7 / 62

Distributed Hash Table
IAn ordinary, which is ...
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 8 / 62

Distributed Hash Table
IAn ordinary, which is.
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 9 / 62

Steps to Build a DHT
IStep 1: decide on.
IStep 2:IStep 3: make a strategy for.
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 10 / 62

Steps to Build a DHT
IStep 1: decide on.
IStep 2:IStep 3: make a strategy for.
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 10 / 62

Steps to Build a DHT
IStep 1: decide on.
IStep 2:IStep 3: make a strategy for.
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 10 / 62

Chord: an Example of a DHT
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 11 / 62

Construct Chord - Step 1
IUse a, called the, consisting of identiers
f0;1;2; ; N1g.
IId space is a N.IEvery node picks a random id though.
IExample:
SpaceN= 16f0; ;15g
Five nodesa,b,c,d,e.
H(a) = 6
H(b) = 5
H(c) = 0
H(d) = 11
H(e) = 2
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 12 / 62

Construct Chord - Step 1
IUse a, called the, consisting of identiers
f0;1;2; ; N1g.
IId space is a N.IEvery node picks a random id though.
IExample:
SpaceN= 16f0; ;15g
Five nodesa,b,c,d,e.
H(a) = 6
H(b) = 5
H(c) = 0
H(d) = 11
H(e) = 2
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 12 / 62

Construct Chord - Step 1
IUse a, called the, consisting of identiers
f0;1;2; ; N1g.
IId space is a N.IEvery node picks a random id though.
IExample:
SpaceN= 16f0; ;15g
Five nodesa,b,c,d,e.
H(a) = 6
H(b) = 5
H(c) = 0
H(d) = 11
H(e) = 2
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 12 / 62

Construct Chord - Step 1
IUse a, called the, consisting of identiers
f0;1;2; ; N1g.
IId space is a N.IEvery node picks a random id though.
IExample:
SpaceN= 16f0; ;15g
Five nodesa,b,c,d,e.
H(a) = 6
H(b) = 5
H(c) = 0
H(d) = 11
H(e) = 2
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 12 / 62

Construct Chord - Step 2 (1/2)
IThe
direction starting at the id.
Isucc(x): is the
x.
succ(12) = 0
succ(1) = 2
succ(6) = 6
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 13 / 62

Construct Chord - Step 2 (2/2)
IEach node points to its.
IThe successor of a nodenissucc(n+ 1).
0's successor issucc(1) = 2.
2's successor issucc(3) = 5.
11's successor issucc(12) = 0.
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 14 / 62

Construct Chord - Step 3
IWhere to?
IUse globally known hash functionH.
IEach itemhkey; valueigets identierH(key) =k.
SpaceN= 16f0; ;15g
Five nodesa,b,c,d,e.
H(F atemeh) = 12
H(Cosmin) = 2
H(Seif) = 9
H(Sarunas) = 14
H(T allat) = 4
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 15 / 62

Construct Chord - Step 3
IWhere to?
IUse globally known hash functionH.
IEach itemhkey; valueigets identierH(key) =k.
SpaceN= 16f0; ;15g
Five nodesa,b,c,d,e.
H(F atemeh) = 12
H(Cosmin) = 2
H(Seif) = 9
H(Sarunas) = 14
H(T allat) = 4
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 15 / 62

Construct Chord - Step 3
IWhere to?
IUse globally known hash functionH.
IEach itemhkey; valueigets identierH(key) =k.
SpaceN= 16f0; ;15g
Five nodesa,b,c,d,e.
H(F atemeh) = 12
H(Cosmin) = 2
H(Seif) = 9
H(Sarunas) = 14
H(T allat) = 4
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 15 / 62

Construct Chord - Step 3
IWhere to?
IUse globally known hash functionH.
IEach itemhkey; valueigets identierH(key) =k.
SpaceN= 16f0; ;15g
Five nodesa,b,c,d,e.
H(F atemeh) = 12
H(Cosmin) = 2
H(Seif) = 9
H(Sarunas) = 14
H(T allat) = 4
IStore each item at its successor.
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 16 / 62

Construct Chord - Step 3
IWhere to?
IUse globally known hash functionH.
IEach itemhkey; valueigets identierH(key) =k.
SpaceN= 16f0; ;15g
Five nodesa,b,c,d,e.
H(F atemeh) = 12
H(Cosmin) = 2
H(Seif) = 9
H(Sarunas) = 14
H(T allat) = 4
IStore each item at its successor.
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 16 / 62

Construct Chord - Step 3
IWhere to?
IUse globally known hash functionH.
IEach itemhkey; valueigets identierH(key) =k.
SpaceN= 16f0; ;15g
Five nodesa,b,c,d,e.
H(F atemeh) = 12
H(Cosmin) = 2
H(Seif) = 9
H(Sarunas) = 14
H(T allat) = 4
IStore each item at its successor.
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 17 / 62

How to Lookup?
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 18 / 62

Lookup (1/2)
ITo k:
CalculateH(k).
Follow kis found.
IExample:
Lookup Seif at node 2.
H(Seif) = 9
Traverse nodes: 2, 5, 6, 11
Return Stockholm to initiator
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 19 / 62

Lookup (1/2)
ITo k:
CalculateH(k).
Follow kis found.
IExample:
Lookup Seif at node 2.
H(Seif) = 9
Traverse nodes: 2, 5, 6, 11
Return Stockholm to initiator
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 19 / 62

Lookup (1/2)
ITo k:
CalculateH(k).
Follow kis found.
IExample:
Lookup Seif at node 2.
H(Seif) = 9
Traverse nodes: 2, 5, 6, 11
Return Stockholm to initiator
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 19 / 62

Lookup (1/2)
ITo k:
CalculateH(k).
Follow kis found.
IExample:
Lookup Seif at node 2.
H(Seif) = 9
Traverse nodes: 2, 5, 6, 11
Return Stockholm to initiator
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 19 / 62

Lookup (2/2)
Algorithm 1Ask nodento nd the successor ofid
1:proceduren.ndSuccessor(id)
2:ifpred6= andid2(pred; n]then
3:returnn
4:else ifid2(n; succ]then
5:returnsucc
6:else// forward the query around the circle
7:returnsucc.ndSuccessor(id)
8:end if
9:end procedure
I(a; b]the segment of the ring moving clockwise from but not includ-
ingauntil and includingb.
In.foo(.)denotes an RPC offoo(.)to noden.
In.bardenotes and RPC to fetch the value of the variablebarin
noden.
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 20 / 62

Put and Get
Algorithm 2Storevaluewith keyidin the DHT
1:proceduren.put(id; value)
2:n= ndSuccessor(id)
3:s.store(id; value)
4:end procedure
Algorithm 3Retrieve the value of the keyidfrom the DHT
1:proceduren.get(id)
2:n= ndSuccessor(id)
3:returns.retrieve(id)
4:end procedure
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 21 / 62

Any Improvement?
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 22 / 62

Improvement
ISpeeding up lookups.
IIf only the successor pointers are used:
Worst case lookup time isN, forNnodes.
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 23 / 62

Speeding up Lookups (1/2)
IFinger/routing table:
Point tosucc(n+ 1)
Point tosucc(n+ 2)
Point tosucc(n+ 4)
...
Point tosucc(n+ 2
M1
)(N= 2
M
,N: the id space size)
IDistance always
destination.
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 24 / 62

Speeding up Lookups (1/2)
IFinger/routing table:
Point tosucc(n+ 1)
Point tosucc(n+ 2)
Point tosucc(n+ 4)
...
Point tosucc(n+ 2
M1
)(N= 2
M
,N: the id space size)
IDistance always
destination.
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 24 / 62

Speeding up Lookups (2/2)
IEvery nodenknowssucc(n+ 2
i1
)fori= 1; ; M.
ISize of routing tables is:
Routing table size:M, whereN= 2
M
.
Routing entries =log2(N).
Example:Log2(1000000)20 Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 25 / 62

Lookup Improvement (1/3)
Algorithm 4Ask nodento nd the successor ofid
1:proceduren.ndSuccessor(id)
2:ifpred6= andid2(pred; n]then
3:returnn
4:else ifid2(n; succ]then
5:returnsucc
6:else// forward the query around the circle
7:returnsucc.ndSuccessor(id)
8:end if
9:end procedure
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 26 / 62

Lookup Improvement (2/3)
Algorithm 5Ask nodento nd the successor ofid
1:proceduren.ndSuccessor(id)
2:ifpred6= andid2(pred; n]then
3:returnn
4:else ifid2(n; succ]then
5:returnsucc
6:else// forward the query around the circle
7:p closestPrecedingNode(id)
8:returnp.ndSuccessor(id)
9:end if
10:end procedure
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 27 / 62

Lookup Improvement (3/3)
Algorithm 6Search locally for the highest predecessor of id
1:procedureclosestPrecedingNode(id)
2:fori=mdownto1do
3:iff inget[i]2(n; id)then
4: returnf inger[i]
5:end if
6:end for
7:end procedure
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 28 / 62

Lookups (1/7)
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 29 / 62

Lookups (2/7)
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 30 / 62

Lookups (3/7)
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 31 / 62

Lookups (4/7)
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 32 / 62

Lookups (5/7)
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 33 / 62

Lookups (6/7)
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 34 / 62

Lookups (7/7)
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 35 / 62

How to Maintain the Ring?
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 36 / 62

Periodic Stabilization (1/2)
IIn Chord, in addition to the
predecessor
Predecessor nis the rst node met in anti-clockwise
direction starting atn.
IPeriodic stabilization
Pointingsuccto closest
successor.
Pointingpredto closest
predecessor.
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 37 / 62

Periodic Stabilization (1/2)
IIn Chord, in addition to the
predecessor
Predecessor nis the rst node met in anti-clockwise
direction starting atn.
IPeriodic stabilization
Pointingsuccto closest
successor.
Pointingpredto closest
predecessor.
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 37 / 62

Periodic Stabilization (2/2)
Algorithm 7Periodically atn
1:proceduren.stabilize()
2:v succ:pred
3:ifv6= andv2(n; succ]then
4:succ v
5:end if
6:sendnotify(n) tosucc
7:end procedure
Algorithm 8Upon receipt anotify(p)at nodem
1:on receivehnotifyjpifromndo
2:ifpred= orp2(pred; m]then
3: pred p
4:end if
5:end event
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 38 / 62

Handling Join
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 39 / 62

Handling Join
IWhennjoins:
Findn's successor withlookup(n).
Setsuccton's successor.
Stabilization xes the rest.Algorithm 9Join a Chord ring containing node m
1:proceduren.join(m)
2:pred
3:succ m.ndSuccessor(n)
4:end procedure
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 40 / 62

Join (1/5)
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 41 / 62

Join (2/5)
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 42 / 62

Join (3/5)
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 43 / 62

Join (4/5)
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 44 / 62

Join (5/5)
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 45 / 62

Fix Fingers (1/4)
IPeriodically, and store the index of the
next nger to x.
Algorithm 10When receivingnotify(p)atn
1:proceduren.xFingers()
2:next next+ 1
3:ifnext > mthen
4:next 1
5:end if
6:f inger[next] ndSuccessor(n2
next1
)
7:end procedure
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 46 / 62

Fix Fingers (2/4)
ICurrent situation:succ(N48) =N60
Isucc(212
5
) =succ(53) =N60
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 47 / 62

Fix Fingers (3/4)
Isucc(212
5
) =succ(53) =?
INew nodeN56joins and stabilizes successor pointer.
IFinger 6 of nodeN21is wrong now.
IN21eventually try to x nger 6 by looking up 53 which stops at
N48.
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 48 / 62

Fix Fingers (4/4)
Isucc(212
5
) =succ(53) =N56
IN48will eventually stabilize its successor.
IThis means the ring is correct now.
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 49 / 62

Handling Failure
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 50 / 62

Successor List (1/2)
IA node has a rcontaining the immediater
successors.
succ(n+ 1)
succ(succ(n+ 1) + 1)
succ(succ(succ(n+ 1) + 1) + 1)
IHow big shouldrbe?
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 51 / 62

Successor List (1/2)
IA node has a rcontaining the immediater
successors.
succ(n+ 1)
succ(succ(n+ 1) + 1)
succ(succ(succ(n+ 1) + 1) + 1)
IHow big shouldrbe?
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 51 / 62

Successor List (1/2)
IA node has a rcontaining the immediater
successors.
succ(n+ 1)
succ(succ(n+ 1) + 1)
succ(succ(succ(n+ 1) + 1) + 1)
IHow big shouldrbe?log2(N)
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 51 / 62

Successor List (2/2)
Algorithm 11Join a Chord ring containing nodem
1:proceduren.join(m)
2:pred
3:succ m.ndSuccessor(n)
4:updateSuccesorList(succ.successorList)
5:end procedure
Algorithm 12Periodically atn
1:proceduren.stabilize()
2:succ nd rst alive node in successor list
3:v succ:pred
4:ifv6= andv2(n; succ]then
5:succ v
6:end if
7:sendnotify(n) tosuccupdateSuccessorList(succ.successorList)
8:end procedure
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 52 / 62

Dealing with Failure
IPeriodic stabilization
IIf: replace with closest alive successor
IIf: set predecessor to nil
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 53 / 62

Failure (1/5)
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 54 / 62

Failure (2/5)
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 55 / 62

Failure (3/5)
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 56 / 62

Failure (4/5)
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 57 / 62

Failure (5/5)
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 58 / 62

Summary
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 59 / 62

Summary
IDHTs: distributedhkey; valuei
ILookup service
IPut and Get
IFinger list: improve the lookup
IPeriodically stabilization
ISuccessor list
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 60 / 62

References:
IIon Stoica et al., Chord: A scalable peer-to-peer lookup service for
internet applications, SIGCOMM, 2001.
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 61 / 62

Questions?
Amir H. Payberah (Tehran Polytechnic) DHTs 1393/7/12 62 / 62