- 微博 QQ QQ空间 贴吧
23-Checkpointing and Recovery
1 .CSC2/458 Parallel and Distributed Systems Checkpointing and Recovery Sreepathi Pai April 17, 2018 URCS
2 .Outline Checkpointing and Recovery Independent Checkpointing Coordinated Checkpointing Message Logging
3 .Outline Checkpointing and Recovery Independent Checkpointing Coordinated Checkpointing Message Logging
4 .Errors happen • Errors happen • How do we recover from them (say, for message loss)? • (before information theory): ? • (after information theory): ?
5 .Checkpointing and Recovery To checkpoint is to save the state of a computation so that you can “rollback” to it • Examples: • Save games • Virtual machine snapshots Recovery is then “simply” restoring the checkpoint
6 .Distributed Checkpointing: The Challenge • Processes only know: • which messages they have received • which messages they have sent • what their local state is • Checkpointing ideally should not require everybody to “pause” • Must run concurrently with computation
7 .The Recovery Line Initial state Recovery line Checkpoint P1 Failure P2 Time Message sent Inconsistent collection from P2 to P1 of checkpoints
8 .Outline Checkpointing and Recovery Independent Checkpointing Coordinated Checkpointing Message Logging
9 .Algorithm • A process records its local state independently • messages sent/received included • A recovery for a process entails going back to its most recent checkpoint • Unfortunately, this can’t be done independently
10 .Rollbacks Initial state Checkpoint P1 m* m Failure P2 Time Assume P2 fails. How far we do need to rollback to achieve a consistent worldview?
11 .Detecting dependencies • For a process Pi , let INTi (m) be the interval between the m − 1 and m checkpoints. • All messages sent in INTi (m) contain (i, m) • When process Pj receives this message, it may be in INTj (n) • records dependency INTi (m) → INTj (n) • saves dependency with checkpoint
12 .Rolling back: Consistency • If Pi rolls back to checkpoint m − 1, no messages from INTi (m) were ever sent • All checkpoints dependent on INTi (m) are invalid • Rollbacks need to continue until consistency is reached
13 .Outline Checkpointing and Recovery Independent Checkpointing Coordinated Checkpointing Message Logging
14 .Algorithm • Coordinator broadcasts CHECKPOINT-REQUEST message to all processes • When this request is received, • Process checkpoints local state • Acknowledges to coordinator that it has taken checkpoint and waits • When coordinator receives acknowledgements from all processes, it sends CHECKPOINT-DONE • Processes resume computation • What about messages?
15 .Message handling • All incoming messages received after CHECKPOINT-REQUEST are not considered part of the checkpoint • All outgoing messages are held back until CHECKPOINT-DONE is received • This results in a “globally consistent state” • How?
16 .Outline Checkpointing and Recovery Independent Checkpointing Coordinated Checkpointing Message Logging
17 .Basic idea • Computations are deterministic and rely only on messages transmitted • Save messages from a checkpoint and replay them during recovery
18 .Piecewise deterministic execution • A piecewise deterministic computation interval: • starts with a non-deterministic event (e.g. receipt of a message) • continues in a completely deterministic fashion • ends just before another non-deterministic event This implies that only non-deterministic events need to be logged.
19 .Who should save the messages? Q crashes and recovers P m1 m1 m2 is never replayed, so neither will m3 Q m3 m2 m3 m2 R Unlogged message Time Logged message
20 .Orphan processes • Let DEP(m) represent processes that depend on message m • Let COPY (m) represent processes that contain a copy of m • but may not have logged it • Note, m contains all details necessary to retransmit it A process Q is orphaned if and only if: • Q depends on m (i.e. Q ∈ DEP(m)) • All processes in COPY (m) have failed • So m cannot be played back
21 .Pessimistically avoiding orphan processes • Orphan processes can be avoided by ensuring that • A non-deterministic message is sent only to one process • That process cannot send another message without logging m
22 .Further reading Chandy and Lamport, “Distributed Snapshots: Determining Global States of Distributed Systems”, ACM TOCS 1985
23 .Acknowledgments All figures from Van Steen and Tanenbaum, Distributed Systems, 3rd Edition, Chapter 8.