“Distributed Algorithms”
by Nancy A. Lynch
SHARED MEMORY vs NETWORKS
Presented By: Sumit Sukhramani
Kent State University
Transformation from the Shared Memory Model
to the Network Model
Correctness conditions satisfied by the
Transformations.
Non fault Tolerant Strategies.
Fault Tolerant Strategies.
Correctness Conditions
The general problem is to design an asynchronous send/receive
network system B with processes P
i , 1<=i<=n that is an I
simulation of A.
The correctness conditions are:
αand άare indistinguishable to U.
For each I, a stop occurs in αonly if stop occurs in ά.
Strategies Assuming no Failures
Classification of simple strategies:
Single-copy scheme
*
Multi-copy schemes *
*Based on number of copies of the shared variables
Single-Copy Schemes
SimpleShVarSim Algorithm
Architecture of SimpleShVarSim
Q
1
Q
2
R
x,1
R
y,1
R
y,2
R
x,2
MajorityVotingObject algorithm
Lemma17.2 The MajorityVotingObject algorithm is a read/write
atomic object.
The write operations obtain tags 1,2,…, in the order of their
serialization points.
Each read or embedded read obtains the largest tag that has
been written by a write operation serialized before it, together
with the accompanying value.
Theorem 17.3: Suppose that A uses read/write shared variables.
Then the MajorityVoting algorithm based on A is a 0-simulation
of A.
Lack of Fault Tolerance in MajorityVotingObject
The standard transaction implementations are not fault tolerant
for an atomic object x.
Example: A process performing a read transaction might send
out messages to read a majority of the copies to become locked.
The same process might fail without releasing its locks which
would prevent any later write transaction from ever obtaining the
locks it requires.
Solution: Use timeouts to detect process failures.
Algorithm Tolerating Process Failures
// Assumption is that majority of the processes do not fail n > 2f //
// Assumption is that the network is reliable //
// Considering only the case of single-writer/multi-reader read/write
shared memory.
Implementation involves that each read/write shared variable x,
of a read/write atomic object guaranteeing f-failure termination.
It also involves assuming operations on specific ports.
The central concept is that the result of each write is stored at a
majority of the nodes in the network before the write completes.
ABDObject algorithm (Write)
Each of the processes maintains a copy of x together with a tag.
When writer process wants to perform a write(v) on x, it lets t be
the smallest tag that it has not yet assigned to any write. Then it
sets its local copy of x and local tag to v and t, respectively, and
sends (“write”, v, t) messages to all the other processes.
A process receiving this message updates its copy of x and the
tag (if the tag is greater than the current tag). Finally it sends an
acknowledgement to the writer.
When the writer knows that majority of processes have their tag
values equal to t, it returns ack.
ABDObject algorithm (Read)
When a process wants to read x, it sends read messages to all
the other processes and also reads its own value of x and its
own tag. A process receiving this message responds with the
latest value of x and tag.
When the process has learned the x and tag values of a majority
of the processes, it returns the value of x along with the largest
tag t.
The process also updates its own value of variable and the tag.
Finally the processes after updating their value of x and tag
send an ack.
Theorem 17.4
The ABDObject algorithm for n > 2f is a read/write atomic object
guaranteeing f-failure termination.
The algorithm is well formed and and has f-failure termination
because each operation requires only majority of processes to
decide. Serialization can be shown from the following properties:
If a write πwith tag = t completes before a read ø is invoked,
then ø obtains a tag that is at least as large as t.
If read πcompletes before read ø is invoked, then the tag
obtained by ø is at least as great as that obtained by π.
Impossibility result for n/2 failures
ABD algorithm does not tolerate f failures if n <= 2f. This is because
the failure of this many processes make the other processes
permanently unable to secure the majorities.
Theorem 17.6 Let n = m + p where m, p >=1, and suppose that n
<= 2f. Then there is no algorithm in the asynchronous broadcast
model that implements a read/write atomic object with m writers
and p readers, guaranteeing f-failure termination.
Proving by contradiction.
The theorem implies that for any fixed n and f where n >= 2 and f
>= n/2 there can be no general method for producing f
simulations if n process shared memory algorithms even if the
underlying shared variables are restricted to be single
writer/single reader registers.
Transformations from the Network to the
Shared Memory Model
In this transformation there is no special requirement on the
number of failures and this works even if n <= 2f. The
constructions are simpler.
The reason for this is that the asynchronous model is more
powerful than the asynchronous network model. The reason
behind this is the availability of reliable shared memory.
Send/Receive Systems (single-writer/single-reader
registers)
The general problem is to produce a shared memory system B with
n processes using single writer/single registers that simulate
A.
SimpleSRSim algorithm
B includes a single writer/single reader shared variable x(i,j)
writable by process i and readable by process j. It contains a
queue of messages initially empty. Process i only adds
messages to the queue, no removals happen.
From time to time process i checks all the incoming variables
x(j,i) in order to find if any new messages have been placed.
Finally process i handles the messages as P
idoes.
Broadcast Systems(single-writer/multi-reader
registers)
SimpleBcastSim algorithm
B includes a single writer/multi reader shared variable x(i)
writable by I and readable by all processes. It also contains a
queue of messages initially empty.
Process I adds the message m to the end of the queue in the
variable x(i). From time to time process i checks all the variables
x(j) including x(i) in order to find if any new messages have been
placed.
Finally process i handles the messages as P
idoes