Unit IV: Recovery & Consensus (Chapters 13 and 14)

Planned hours for Recovery & Consensus (each topic maps to CO5 with highest cognitive level K2, one hour each):
1. Checkpointing and rollback recovery: introduction
2. Checkpoint-based recovery
3. Log-based rollback recovery
4. Coordinated checkpointing algorithm
5. Algorithm for asynchronous checkpointing and recovery
6. Consensus and agreement algorithms: problem definition
7. Overview of results
8. Agreement in a failure-free system
9. Agreement in synchronous systems with failures

Introduction
Distributed systems are subject to failures, and the vast computing potential of these systems is often hampered by their susceptibility to failures.
Fault tolerance offers the abstraction of a reliable system that tolerates faults.
Fault recovery: once a fault occurs, we want to recover to a previous correct state.

Treat a distributed system application as a collection of processes that communicate over a network

Rollback recovery: restore the system to a consistent state after a failure. Fault tolerance is achieved by periodically saving the state of each process during failure-free execution.

Checkpointing is a technique that provides fault tolerance for computing systems. It consists of saving a snapshot of the application's state so that the application can restart from that point in case of failure. The saved states of a process are called its checkpoints.
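As a rough, minimal sketch of the idea (not taken from the slides; the file name and state fields are invented), the following Python snippet saves a process snapshot to a file standing in for stable storage and reloads it after a simulated failure:

```python
# Minimal sketch: saving and restoring a process snapshot to "stable
# storage", here modelled as a file on disk.
import pickle

CHECKPOINT_FILE = "process_state.ckpt"   # hypothetical stable-storage location

def take_checkpoint(state: dict) -> None:
    """Persist a snapshot of the process state."""
    with open(CHECKPOINT_FILE, "wb") as f:
        pickle.dump(state, f)

def restore_checkpoint() -> dict:
    """Roll back: reload the most recently saved snapshot."""
    with open(CHECKPOINT_FILE, "rb") as f:
        return pickle.load(f)

if __name__ == "__main__":
    state = {"step": 42, "partial_sum": 1234}
    take_checkpoint(state)              # during failure-free execution
    recovered = restore_checkpoint()    # after a (simulated) failure
    assert recovered == state
```

In a real distributed system the snapshot must also be reconciled with in-flight messages, which is exactly what the rest of this unit addresses.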

Why is rollback recovery of distributed systems complicated? Messages induce inter-process dependencies during failure-free operation.
Rollback propagation: the dependencies may force some of the processes that did not fail to roll back; this phenomenon is called the "domino effect".

Rollback scenario (figure with processes P1, P2 and message M1): P1 fails. The dependencies may force some of the processes that did not fail to roll back as well. This phenomenon of cascaded rollback is called the domino effect. Propagation may extend back to the initial state of the computation, losing all the work performed before the failure.

Domino effect (figures): processes Pi, Pj, and Pk exchange messages m1, m2, ...; when Pk fails, cascading rollbacks push the recovery line further and further back toward the start of the computation.

If each process takes its checkpoints independently, the system cannot avoid the domino effect; this scheme is called independent or uncoordinated checkpointing.
Techniques that avoid the domino effect:
Coordinated checkpointing rollback recovery: processes coordinate their checkpoints to form a system-wide consistent state.
Communication-induced checkpointing rollback recovery: forces each process to take checkpoints based on information piggybacked on application messages.

Log-based rollback recovery combines checkpointing with logging of non-deterministic events. It relies on the piecewise deterministic (PWD) assumption. In general, it enables a system to recover beyond the most recent set of consistent checkpoints. It is particularly attractive for applications that frequently interact with the outside world, which cannot roll back.

Taxonomy

Background and definitions
System model: a fixed number N of processes. The processes cooperate to execute a distributed application and interact with the outside world by receiving and sending input and output messages, respectively. The outside world is modeled as an Outside World Process (OWP).

A local checkpoint: all processes save their local states at certain instants of time; a local checkpoint is a snapshot of the state of a process at a given instant.
Assumptions: a process stores all of its local checkpoints on stable storage, and a process is able to roll back to any of its existing local checkpoints.

Consistent states
A global state of a distributed system is a collection of the individual states of all participating processes and the states of the communication channels.
A consistent global state is a global state that may occur during a failure-free execution of the distributed computation: if a process's state reflects a message receipt, then the state of the corresponding sender must reflect the sending of that message.

A global checkpoint is a set of local checkpoints, one from each process. A consistent global checkpoint is a global checkpoint such that no message sent by a process after taking its local checkpoint is received by another process before taking its checkpoint.
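The definition above can be turned into a simple check. The sketch below (illustrative only; the Message type and cut representation are assumptions, not from the slides) flags a global checkpoint as inconsistent when some message is sent after the sender's checkpoint but received before the receiver's checkpoint, i.e., when an orphan exists:

```python
from dataclasses import dataclass

@dataclass
class Message:
    sender: str
    send_event: int     # event index at the sender when the message was sent
    receiver: str
    recv_event: int     # event index at the receiver when it was delivered

def is_consistent(cut: dict[str, int], messages: list[Message]) -> bool:
    """cut[p] is the index of the last event included in p's local checkpoint.

    The cut is consistent iff no message is sent after its sender's
    checkpoint but received before its receiver's checkpoint (an orphan).
    """
    for m in messages:
        sent_after_checkpoint = m.send_event > cut[m.sender]
        received_before_checkpoint = m.recv_event <= cut[m.receiver]
        if sent_after_checkpoint and received_before_checkpoint:
            return False            # orphan message -> inconsistent cut
    return True

# A message sent at Pi's event 5 and delivered at Pj's event 2:
msgs = [Message("Pi", 5, "Pj", 2)]
print(is_consistent({"Pi": 3, "Pj": 4}, msgs))   # False: orphan (5 > 3, 2 <= 4)
print(is_consistent({"Pi": 6, "Pj": 4}, msgs))   # True: the send is inside the cut
```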

(Figure) The dotted line is the recovery line; the parallel lines on the process time-lines are checkpoints.

Types of messages: a process failure and subsequent recovery may leave messages that were perfectly received (and processed) before the failure in abnormal states, and a rollback of processes for recovery may have to undo the send and receive operations of several messages.

In-transit messages: messages that were sent but not yet received at the recovery line. They pose no problem for global state consistency, but they can be lost or delayed.

(Figure: messages m1–m5 among Pi, Pj, and Pk relative to the recovery line after Pi fails.)

Lost messages: messages whose send is done but whose receive is undone due to rollback. After recovery, such messages must be replayed. If the system is built over unreliable communication channels, the application is responsible for handling them; over reliable channels, the recovery algorithm is responsible.

Delayed messages: messages whose receive is not recorded because the receiving process was either down or the message arrived after the rollback of the receiving process.

Orphan messages: messages whose receive is recorded but whose send is not recorded. They do not arise if processes roll back to a consistent global state.
Duplicate messages: arise due to message logging and replaying during process recovery.

Issues in failure recovery: we must not only restore the system to a consistent state, but also appropriately handle messages that are left in an abnormal state due to the failure and recovery. Consider a computation comprising three processes Pi, Pj, and Pk, connected through a communication network. The processes communicate solely by exchanging messages over fault-free, FIFO communication channels.

Issues in failure recovery (figure): the checkpoints are {Ci,0, Ci,1} at Pi, {Cj,0, Cj,1, Cj,2} at Pj, and {Ck,0, Ck,1, Ck,2} at Pk; process Pi fails. The rollback of process Pi to checkpoint Ci,1 creates an orphan message m4, and orphan message m5 is created due to the rollback of process Pj to checkpoint Cj,2.

(Same figure) Message m6 is a lost message, since the send event for m6 is recorded in the restored state of Pj, but the receive event has been undone at process Pi.

(Same figure) Messages m7 and m8 are delayed orphan messages: after resuming execution from their checkpoints, the processes will generate both of these messages again. Solution: keep a log of all the sent messages.

Uncoordinated checkpointing: each process has autonomy in deciding when to take its checkpoints.
Advantages: no synchronization overhead; each process can take checkpoints when it is most convenient or efficient, giving lower runtime overhead during normal execution.

Disadvantages: the domino effect during recovery; possibly useless checkpoints that can never be part of a consistent global state; slow recovery from a failure, because processes need to iterate to find a consistent set of checkpoints; each process must maintain multiple checkpoints and periodically invoke a garbage-collection algorithm to discard the useless ones; not suitable for applications that interact with the outside world (output commit).

Direct dependency tracking technique (figure): Ci,x denotes the x-th checkpoint of process Pi, where i is the process ID and x is the checkpoint index; the checkpoint interval is the interval between two successive checkpoints. A message m sent by Pi in its x-th checkpoint interval carries the pair (i, x), so the receiver Pj can record the dependency of its current interval on Pi's checkpoint interval.
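A hedged sketch of how direct dependency tracking might be coded (process and attribute names are invented, not from the slides): each message piggybacks the sender's ID and current checkpoint index, and the receiver records the dependency so that a recovery line can later be computed.

```python
class Process:
    """Toy process that piggybacks (pid, checkpoint index) on every message."""
    def __init__(self, pid: str):
        self.pid = pid
        self.ckpt_index = 0            # current checkpoint interval index
        self.dependencies = set()      # {(sender_pid, sender_ckpt_index)}

    def take_checkpoint(self) -> None:
        self.ckpt_index += 1
        self.dependencies = set()      # a new checkpoint interval starts

    def send(self, payload):
        # Piggyback who is sending and in which checkpoint interval.
        return (self.pid, self.ckpt_index, payload)

    def receive(self, message) -> None:
        sender_pid, sender_ckpt, _payload = message
        self.dependencies.add((sender_pid, sender_ckpt))

pi, pj = Process("Pi"), Process("Pj")
pj.receive(pi.send("hello"))
print(pj.dependencies)   # {('Pi', 0)}: Pj's current interval depends on Pi's interval 0
```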

Coordinated checkpointing requires processes to orchestrate their checkpoints in order to form a consistent global state.
Advantages: no domino effect; each process needs to maintain only one checkpoint (the last); no computation of a recovery line is required.
Disadvantage: large latency in committing output.
Two approaches: blocking and non-blocking.

Two-phase coordinated checkpointing protocol (blocking approach): the computation is blocked while the checkpointing is in progress.

Non-blocking checkpoint coordination: processes need not stop their execution while taking checkpoints. The fundamental problem is to prevent a process from receiving application messages that could make the checkpoint inconsistent.

Idea: use the Chandy-Lamport snapshot algorithm, in which markers play the role of the checkpoint-request messages. The initiator takes a checkpoint and sends a marker (a checkpoint request) on all of its outgoing channels. Each process takes a checkpoint upon receiving the first marker and forwards the marker on all of its outgoing channels before sending any application message.
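The sketch below illustrates the marker rule under strong simplifications (direct method calls stand in for channels, and channel-state recording is omitted); the names and wiring are assumptions for the example, not the slides' algorithm text:

```python
class Node:
    """Toy node; direct method calls stand in for sending markers on channels."""
    def __init__(self, name: str, state: int):
        self.name = name
        self.state = state
        self.peers = []            # outgoing "channels" to other nodes
        self.checkpoint = None

    def initiate(self):
        self._checkpoint_and_broadcast_marker()

    def on_marker(self):
        # Checkpoint on the FIRST marker only, and forward markers on all
        # outgoing channels before sending any further application messages.
        if self.checkpoint is None:
            self._checkpoint_and_broadcast_marker()

    def _checkpoint_and_broadcast_marker(self):
        self.checkpoint = self.state          # record local state
        for peer in self.peers:               # (channel-state recording omitted)
            peer.on_marker()

a, b, c = Node("A", 10), Node("B", 20), Node("C", 30)
a.peers, b.peers, c.peers = [b, c], [a, c], [a, b]
a.initiate()
print(a.checkpoint, b.checkpoint, c.checkpoint)   # 10 20 30
```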

Communication-induced checkpointing: processes may be forced to take additional checkpoints, so process independence is constrained in order to guarantee the eventual progress of the recovery line. There are two kinds of checkpoints: autonomous (local) checkpoints and forced checkpoints. Communication-induced checkpointing piggybacks protocol-related information on each application message; the receiver uses this information to decide whether it has to take a forced checkpoint to advance the global recovery line.

The forced checkpoint must be taken before the application may process the contents of the message. In contrast with coordinated checkpointing, no special coordination messages are exchanged. There are two types of communication-induced checkpointing: model-based checkpointing and index-based checkpointing.

Log-based rollback recovery makes use of deterministic and non-deterministic events in a computation. A non-deterministic event is, for example, the receipt of a message from another process or an event internal to the process (e.g., m0, m3, m7 in the slides' figure). A message send event is not a non-deterministic event.

This assumes that all non-deterministic events can be identified and their corresponding determinants can be logged to stable storage.
How it works: each process logs to stable storage the determinants of all non-deterministic events. Additionally, checkpoints are taken to reduce the extent of rollback during recovery.

The no-orphans consistency condition. Let e be a non-deterministic event that occurs at process p. We define the following:
Depend(e): the set of processes affected by event e.
Log(e): the set of processes that have logged e's determinant in volatile memory.
Stable(e): a predicate that is true if e's determinant is logged on stable storage.
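In the standard formulation from the rollback-recovery literature, the always-no-orphans condition is usually written as the implication below (this compact form is not spelled out on the slide):

```latex
% Always-no-orphans: if e's determinant is not yet on stable storage,
% then every process whose state depends on e must hold e's determinant
% in its volatile log.
\forall e : \ \neg \mathit{Stable}(e) \ \Rightarrow \ \mathit{Depend}(e) \subseteq \mathit{Log}(e)
```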

Pessimistic logging assumes that a failure can occur after any non-deterministic event in the computation. The determinant of each event is logged to stable storage before the message is delivered to the application. Pessimistic protocols implement the following property, often referred to as synchronous logging: if an event has not been logged on stable storage, then no process can depend on it. This is a stronger version of the always-no-orphans condition.
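A minimal sketch of synchronous (pessimistic) logging, assuming a local file plays the role of stable storage and with an invented record format: the determinant is flushed and fsync'ed before the message is handed to the application.

```python
import json
import os

LOG_FILE = "determinants.log"   # hypothetical stand-in for stable storage

def log_determinant(sender: str, seq_no: int, payload: str) -> None:
    """Force the determinant of a receive event to stable storage."""
    record = {"sender": sender, "seq": seq_no, "payload": payload}
    with open(LOG_FILE, "a") as f:
        f.write(json.dumps(record) + "\n")
        f.flush()
        os.fsync(f.fileno())     # do not proceed until the record is on disk

def deliver(handler, sender: str, seq_no: int, payload: str) -> None:
    log_determinant(sender, seq_no, payload)   # synchronous logging first ...
    handler(payload)                           # ... only then deliver to the app

deliver(lambda p: print("application got:", p), "P1", 7, "m2")
```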

Suppose process P2 fails as shown in the figure, restarts from its checkpoint, and rolls forward using its determinant log to deliver again the same sequence of messages as in the pre-failure execution. Once recovery is complete, P2 will be consistent with the state of P1 that includes the receipt of message m2 from P2.

Optimistic logging: processes log determinants asynchronously to stable storage, optimistically assuming that logging will complete before a failure occurs. Determinants are kept in a volatile log and are periodically flushed to stable storage. Optimistic logging does not implement the always-no-orphans condition.

To perform rollbacks correctly, optimistic logging protocols track causal dependencies during failure-free execution. Pessimistic protocols need only keep the most recent checkpoint of each process, whereas optimistic protocols may need to keep multiple checkpoints for each process.

Causal logging combines the advantages of both pessimistic and optimistic logging. Like optimistic logging, it does not require synchronous access to stable storage except during output commit. Like pessimistic logging, it allows each process to commit output independently and never creates orphans, thus isolating processes from the effects of failures at other processes. To make sure that the always-no-orphans property holds, each process maintains information about all the events that have causally affected its state.

Process P0 at state X will have logged the determinants of the non-deterministic events that causally precede its state according to Lamport's happened-before relation.

Koo-Toueg coordinated checkpointing algorithm. Assumptions: processes communicate by exchanging messages through FIFO communication channels; end-to-end protocols cope with message loss due to rollback recovery and communication failure; communication failures do not partition the network. The algorithm uses two kinds of checkpoints: permanent and tentative. A permanent checkpoint is a local checkpoint that is part of a consistent global checkpoint; a tentative checkpoint becomes a permanent checkpoint when the algorithm terminates successfully.

Two phases of the algorithm. First phase: the initiating process Pi takes a tentative checkpoint and requests all other processes to take tentative checkpoints; each process informs Pi whether it succeeded. If Pi learns that all processes have taken tentative checkpoints, it decides that all tentative checkpoints should be made permanent; otherwise, it decides that they should be discarded.

Second phase: Pi informs all the processes of the decision it reached at the end of the first phase, and a process, on receiving the message from Pi, acts accordingly. The set of permanent checkpoints taken by this algorithm is consistent for two reasons: either all or none of the processes take permanent checkpoints, and no process sends a message after taking a permanent checkpoint.
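The two-phase structure can be sketched as below. This is only an illustration of the phase structure under strong simplifications (no blocking of message sends, no failure handling, direct method calls instead of messages); the names are invented, not the slides' pseudocode.

```python
class Participant:
    """Toy participant in a two-phase coordinated checkpoint."""
    def __init__(self, name: str, local_state: dict, will_succeed: bool = True):
        self.name = name
        self.local_state = local_state
        self.will_succeed = will_succeed     # simulate ability to checkpoint
        self.tentative = None
        self.permanent = None

    def take_tentative(self) -> bool:
        if self.will_succeed:
            self.tentative = dict(self.local_state)   # snapshot own state
        return self.will_succeed                      # reply to the initiator

    def decide(self, commit: bool) -> None:
        if commit:
            self.permanent = self.tentative           # make it permanent
        self.tentative = None                         # otherwise discard it

def two_phase_checkpoint(participants) -> bool:
    # Phase 1: ask every process for a tentative checkpoint, collect replies.
    replies = [p.take_tentative() for p in participants]
    ok = all(replies)
    # Phase 2: broadcast the all-or-nothing decision; processes act on it.
    for p in participants:
        p.decide(ok)
    return ok

procs = [Participant("Pi", {"x": 1}),
         Participant("Pj", {"y": 2}),
         Participant("Pk", {"z": 3}, will_succeed=False)]
print(two_phase_checkpoint(procs))   # False: all tentative checkpoints discarded
```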

An optimization (figure, consistent checkpoint): X decides to initiate the checkpointing algorithm after receiving message m.

The rollback recovery algorithm restores the system to a consistent state after a failure. It also proceeds in two phases. First phase: the initiating process sends a message to all other processes asking whether they are willing to restart from their previous checkpoints; all processes must agree either to do so or not. Second phase: the initiating process sends the final decision to all processes, and every process acts accordingly after receiving the final decision.

Juang-Venkatesan algorithm for asynchronous checkpointing and recovery. Assumptions: communication channels are reliable, deliver messages in FIFO order, and have infinite buffers; message transmission delay is arbitrary but finite. The underlying computation/application is event-driven: a process P at state s receives message m, processes the message, moves to state s', and sends messages out. So the triplet (s, m, msgs_sent) represents the state of P.

Two types of log storage are maintained. Volatile log: short access time, but lost if the processor crashes; its contents are moved to the stable log periodically. Stable log: longer access time, but it survives a crash. Asynchronous checkpointing: after executing an event, the triplet is recorded without any synchronization with other processes. A local checkpoint consists of a set of such records; they are first stored in the volatile log and then moved to the stable log.

Recovery algorithm notation:
RCVDi←j(CkPti): the number of messages received by pi from pj, from the beginning of the computation up to checkpoint CkPti.
SENTi→j(CkPti): the number of messages sent by pi to pj, from the beginning of the computation up to checkpoint CkPti.
Idea: from the set of checkpoints, find a consistent set, based on the numbers of messages sent and received. A message is an orphan if RCVDi←j(CkPti) > SENTj→i(CkPtj).
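The orphan test drives the iterative rollback. A small sketch (data layout invented for illustration): given the RCVD counters recorded in a process's checkpoints and the SENT counter reported in a ROLLBACK message, pick the latest checkpoint that is not ahead of the sender.

```python
def latest_consistent(rcvd_from_j: list[int], sent_by_j: int) -> int:
    """rcvd_from_j[x] = RCVD_{i<-j} at process i's checkpoint number x;
    sent_by_j = SENT_{j->i} reported in j's ROLLBACK message.

    Returns the index of i's latest checkpoint with RCVD <= SENT,
    i.e. a checkpoint containing no orphan messages from j."""
    for x in range(len(rcvd_from_j) - 1, -1, -1):
        if rcvd_from_j[x] <= sent_by_j:
            return x
    return 0   # roll all the way back to the initial state

# X's checkpoints ex0..ex3 recorded RCVD_{X<-Y} = 0, 1, 2, 3; Y's restored
# state reports SENT_{Y->X} = 2, so X must roll back to ex2.
print(latest_consistent([0, 1, 2, 3], sent_by_j=2))   # 2
```

This reproduces the step in the worked example that follows, where X rolls back from ex3 to ex2 after receiving ROLLBACK(Y, 2).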

Orphan message condition: RCVDx←y(CkPtx) > SENTy→x(CkPty) (in the figure, 2 > 1 when Y crashes). Note that if X and Z roll back to checkpoints ex1 and ez1, then they will be consistent with Y's state ey1.

Juang-Venkatesan algorithm example: Y crashes. If event ey2 is the latest checkpointed event at Y, then Y will restart from the state corresponding to ey2. Because of the broadcast nature of the ROLLBACK messages, the recovery algorithm is also initiated at processors X and Z.

Initially, X, Y, and Z set CkPtX ← ex3, CkPtY ← ey2, and CkPtZ ← ez2, and they send the following messages during the first iteration: Y sends ROLLBACK(Y, 2) to X and ROLLBACK(Y, 1) to Z; X sends ROLLBACK(X, 2) to Y and ROLLBACK(X, 0) to Z; and Z sends ROLLBACK(Z, 0) to X and ROLLBACK(Z, 1) to Y.

Since RCVDX←Y(CkPtX) = 3 > 2 (2 is the value received in the ROLLBACK(Y, 2) message from Y), X will set CkPtX to ex2, satisfying RCVDX←Y(ex2) = 2 ≤ 2.

Since RCVDZ←Y(CkPtZ) = 2 > 1, Z will set CkPtZ to ez1, satisfying RCVDZ←Y(ez1) = 1 ≤ 1.

In the second iteration, Y sends ROLLBACK(Y, 2) to X and ROLLBACK(Y, 1) to Z; Z sends ROLLBACK(Z, 1) to Y and ROLLBACK(Z, 0) to X; and X sends ROLLBACK(X, 0) to Z and ROLLBACK(X, 1) to Y. The resulting set of checkpoints {ex2, ey2, ez1} is consistent, and no further rollback occurs.

ALGORITHM

Consensus and agreement. Agreement among the processes in a distributed system is a fundamental requirement for a wide range of applications, for example the commit decision in database systems. Consider a distributed key-value store with multiple nodes where node 1 sees "Transfer 1000 from A to B" while node 2 sees "Transfer 3000 from A to B": the nodes must agree on a single operation to apply.

Classification of faults.
Based on the component that fails: program/process, processor/machine, link, storage.
Based on the behaviour of the faulty component:
Crash: the component just halts.
Fail-stop: a crash, with the additional condition that other processes can detect the failure.
Omission: a process or channel fails to do something that it is expected to do.
Byzantine: the component behaves arbitrarily, and others have imperfect information about it.
Timing: correct output, but provided outside a specified time interval.

Some assumptions underlying the study of agreement algorithms:
Failure models: among the n processes in the system, at most f processes can be faulty. A faulty process can behave in any manner allowed by the assumed failure model. The failure models considered include fail-stop, send omission, receive omission, and Byzantine failures.

Synchronous communication: processes run in lock-step; in each step a process receives a message sent to it earlier, performs computation, and sends a message to other processes. One step of synchronous computation is called a round.
Asynchronous communication: computation does not proceed in lock-step; a process can send, receive, and perform computation at any time.
Network connectivity: the system has full logical connectivity, i.e., each process can communicate with any other by direct message passing.

Sender identification: a process that receives a message always knows the identity of the sender process.
Channel reliability: the channels are reliable, and only the processes may fail.
Authenticated vs. non-authenticated messages: with authentication, via techniques such as digital signatures, it is easier to solve the agreement problem; with non-authenticated messages, a faulty process can forge a message and claim to have received it from another process, or tamper with the contents of a message it relays.
Agreement variable: the agreement variable may be boolean or multi-valued, and need not be an integer.

Byzantine Generals Problem Four camps of the attacking army, each commanded by a general, are camped around the fort of Byzantium

Byzantine generals problem: the generals can succeed only if they attack simultaneously, so they must reach a consensus on the plan, which could be ATTACK or RETREAT.

The only way they can communicate is by sending messengers among themselves. An asynchronous system is modeled by messengers taking an unbounded time to travel between two camps; a lost message is modeled by a messenger being captured by the enemy.

But there might be traitors among the generals, and traitors can act arbitrarily; all loyal generals must still reach a consensus.

The traitor will attempt to subvert the agreement-reaching mechanism, by giving misleading information to the other generals

The solution: a commanding general (a designated process, called the source process) sends an order to his n-1 lieutenant generals such that:
IC1: all loyal lieutenants obey the same order.
IC2: if the commanding general is loyal, then every loyal lieutenant obeys the order he sends.

In process terms, these conditions correspond to the requirements placed on agreement protocols:
Agreement: all non-faulty processes must agree on the same value.
Validity: if the source process is non-faulty, then the value agreed upon by all the non-faulty processes must be the same as the initial value of the source.
Termination: each non-faulty process must eventually decide on a value.

The consensus problem differs from the Byzantine agreement problem in that every process has an initial value and all the correct processes must agree on a single value.
Agreement: all non-faulty processes must agree on the same (single) value.
Validity: if all the non-faulty processes have the same initial value, then the value agreed upon by all the non-faulty processes must be that same value.
Termination: each non-faulty process must eventually decide on a value.

The interactive consistency problem differs from the Byzantine agreement problem in that each process has an initial value and all the correct processes must agree upon a set of values, with one value for each process.
Agreement: all non-faulty processes must agree on the same array of values A[v1, ..., vn].
Validity: if process i is non-faulty and its initial value is vi, then all non-faulty processes agree on vi as the i-th element of the array A. If process j is faulty, then the non-faulty processes can agree on any value for A[j].
Termination: each non-faulty process must eventually decide on the array A.


Overview of results on agreement: f denotes the number of failure-prone processes, and n is the total number of processes.

Consensus is not solvable in asynchronous systems even if only one process can fail, and only by crashing. This impossibility result applies to both the asynchronous message-passing and the shared-memory models.

Impossibility of achieving Byzantine agreement with n = 3 processes and f = 1 malicious process: with three processes, the Byzantine agreement problem cannot be solved if even one process is Byzantine. (More generally, Byzantine agreement requires n ≥ 3f + 1.)

Agreement in a failure-free system: consensus can be reached by collecting information from all processes, arriving at a "decision", and distributing this decision. The decision can be computed using functions such as majority, max, or min. In a synchronous system, this can be done in a constant number of rounds; in an asynchronous system, consensus can similarly be reached in a constant number of message hops.
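A toy sketch of this collect-decide-distribute pattern (process names and the choice of min are arbitrary, not from the slides):

```python
def failure_free_consensus(initial_values: dict[str, int]) -> dict[str, int]:
    """Collect all initial values, decide with an agreed function, distribute."""
    collected = list(initial_values.values())          # step 1: collect
    decision = min(collected)                          # step 2: decide (min here)
    return {pid: decision for pid in initial_values}   # step 3: distribute

print(failure_free_consensus({"P1": 7, "P2": 3, "P3": 9}))
# {'P1': 3, 'P2': 3, 'P3': 3}
```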

Agreement in (message-passing) synchronous systems with failures: up to f (< n) crash failures are possible. In f + 1 rounds, at least one round has no failures, which is the key step in justifying that the agreement, validity, and termination conditions are satisfied. Complexity: O((f + 1) n^2) messages; f + 1 is a lower bound on the number of rounds.

With n ≥ 4 generals and at most one traitor, the correct generals reach agreement in two rounds of messages: (1) in the first round, the commander sends a value to each of the lieutenants; (2) in the second round, each lieutenant sends the value it received to its peers. When both rounds are finished, each correct lieutenant applies a majority function to determine the value. If the commander is faulty, all the lieutenants faithfully relay the (possibly different) values they received; since every correct lieutenant then holds the same set of values, the majority function yields the same result at each of them. If one of the lieutenants is faulty, the correct lieutenants receive n - 2 copies of the correct value plus one incorrect one, and the majority function determines the correct value.

Consensus with up to f fail-stop processes in a system of n processes, n > f
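The formal algorithm is given as a figure in the slides; the sketch below simulates the same well-known idea under assumptions of its own (a global view of who crashes in which round, and min as the decision function): every process floods the values it has not yet relayed for f + 1 rounds and then decides on the smallest value it has seen.

```python
def crash_tolerant_consensus(initial: dict[str, int], f: int,
                             crashed_in_round=None):
    """f + 1 synchronous rounds of flooding; decide min of the values seen.

    crashed_in_round maps a process to the round in which it crashes; for
    simplicity a crashing process sends nothing in that round."""
    crashed_in_round = crashed_in_round or {}
    known = {p: {v} for p, v in initial.items()}        # V_i: values p has seen
    already_sent = {p: set() for p in initial}

    def alive(p, rnd):
        return crashed_in_round.get(p, f + 2) > rnd

    for rnd in range(1, f + 2):                         # rounds 1 .. f + 1
        broadcasts = {p: known[p] - already_sent[p]
                      for p in initial if alive(p, rnd)}
        for p in broadcasts:
            already_sent[p] |= broadcasts[p]
        for p in initial:
            if alive(p, rnd):
                for q, vals in broadcasts.items():
                    if q != p:
                        known[p] |= vals
    return {p: min(known[p]) for p in initial if p not in crashed_in_round}

# Three processes, f = 1; P3 crashes in round 1 (its broadcast is lost).
print(crash_tolerant_consensus({"P1": 5, "P2": 2, "P3": 1}, f=1,
                               crashed_in_round={"P3": 1}))
# {'P1': 2, 'P2': 2} -> the surviving processes agree on the same value
```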