19 Distributed Consensus #2

Distributed Consensus #2 The Byzantine Generals Problem Practical Byzantine Fault Tolerance

1. Today’s Papers • The Byzantine Generals Problem, Leslie Lamport, Robert Shostak, and EECS 262a Marshall Pease. Appears in ACM Transactions on Programming Languages Advanced Topics in Computer Systems and Systems (TOPLAS), Vol. 4, No. 3, July 1982, pp 382-401 • Practical Byzantine Fault Tolerance, M. Castro and B. Liskov. Appears In Lecture 19 Proceedings of the USENIX Symposium on Operating Systems Design and Implementation (OSDI), 1999. Byzantine Agreement October 30th, 2018 • Thoughts? John Kubiatowicz Electrical Engineering and Computer Sciences University of California, Berkeley http://www.eecs.berkeley.edu/~kubitron/cs262 10/30/2018 cs262a-F18 Lecture-19 2 Motivation Problem Definition • Coping with failures in computer systems • Generals = Computer Components • Failed component sends conflicting information to • The abstract problem… different parts of system. – Each division of Byzantine army is directed by its own general. – There are n Generals, some of which are traitors. • Agreement in the presence of faults. – All armies are camped outside enemy castle, observing enemy. • P2P Networks? – Communicate with each other by messengers. – Requirements: – Good nodes have to “agree to do the same thing”. » G1: All loyal generals decide upon the same plan of action – Faulty nodes generate corrupted and misleading » G2: A small number of traitors cannot cause the loyal generals to messages. adopt a bad plan – Non-malicious: Software bugs, hardware failures, – Note: We do not have to identify the traitors. power failures – Malicious reasons: Machine compromised. 10/30/2018 cs262a-F18 Lecture-19 3 10/30/2018 cs262a-F18 Lecture-19 4

2. Recall: The Path of an OceanStore Update Recall: Archival Dissemination of Fragments 10/30/2018 cs262a-F18 Lecture-19 5 10/30/2018 cs262a-F18 Lecture-19 6 Differing Degrees of Responsibility Naïve solution • Inner-ring provides quality of service • ith general sends v(i) to all other generals – Handles of live data and write access control • To deal with two requirements: – Focus utility resources on this vital service – All generals combine their information v(1), v(2), .., v(n) – Compromised servers must be detected quickly in the same way – Byzantine Agreement important here! – Majority (v(1), v(2), …, v(n)), ignore minority traitors • Caching service can be provided by anyone – Data encrypted and self-verifying • Naïve solution does not work: – Pay for service “Caching Kiosks”? – Traitors may send different values to different generals. • Archival Storage and Repair – Loyal generals might get conflicting values from traitors – Read-only data: easier to authenticate and repair • Requirement: Any two loyal generals must use the same – Tradeoff redundancy for responsiveness value of v(i) to decide on same plan of action. • Could be provided by different companies! 10/30/2018 cs262a-F18 Lecture-19 7 10/30/2018 cs262a-F18 Lecture-19 8

3. Reduction of General Problem 3-General Impossibly Example • Insight: We can restrict ourselves to the problem of one general sending its order to others. • 3 generals, 1 traitor among them. • Byzantine Generals Problem (BGP): • Two messages: Attack or Retreat – A commanding general (commander) must send an order to his • Shaded – Traitor n-1 lieutenants. • L1 sees (A,R). Who is the traitor? C or L2? • Interactive Consistency Conditions: • Fig 1: L1 has to attack to satisfy IC2. – IC1: All loyal lieutenants obey the same order. • Fig 2: L1 attacks, L2 retreats. IC1 violated. – IC2: If the commanding general is loyal, then every loyal lieutenant obeys the order he sends. • Note: If General is loyal, IC2  IC1. • Original problem: each general sends his value v(i) by using the above solution, with other generals acting as lieutenants. 10/30/2018 cs262a-F18 Lecture-19 9 10/30/2018 cs262a-F18 Lecture-19 10 General Impossibility Solution I – Oral Messages • If there are 3m+1 generals, solution allows up to m traitors. • In general, no solutions with fewer than 3m+1 • Oral messages – the sending of content is entirely under generals can cope with m traitors. the control of sender. • Proof by contradiction. • Assumptions on oral messages: – A1 – Each message that is sent is delivered correctly. – Assume there is a solution for 3m Albanians with m traitors. – A2 – The receiver of a message knows who sent it. – A3 – The absence of a message can be detected. – Reduce to 3-General problem. • Assures: – Traitors cannot interfere with communication as third party. – Traitors cannot send fake messages - Solution to 3m – Traitors cannot interfere by being silent. problem => Solution • Default order to “retreat” for silent traitor. to 3-General problem!! 10/30/2018 cs262a-F18 Lecture-19 11 10/30/2018 cs262a-F18 Lecture-19 12

4. Oral Messages (Cont) Restate Algorithm • Algorithm OM(0) – Commander send his value to every lieutenant. • OM(M): – Each lieutenant (L) use the value received from commander, or – Commander sends out command. RETREAT if no value is received. – Each lieutenant acts as commander in OM(m-1). • Algorithm OM(m), m>0 Sends out command to other lieutenants. – Commander sends his value to every Lieutenant (vi) – Use majority to compute value based on commands – Each Lieutenant acts as commander for OM(m-1) and sends vi to received by other lieutenants in OM(m-1) the other n-2 lieutenants (or RETREAT) – For each i, and each j ≠ i, let vj be the value lieutenant i receives • Revisit Interactive Consistency goals: from lieutenant j in step (2) using OM(m-1). Lieutenant i uses the – IC1: All loyal lieutenants obey the same command. value majority (v1, …, vn-1). – Why j ≠ i? “Trust myself more than what others said I said.” – IC2: If the commanding general is loyal, then every loyal lieutenant obeys the command he sends. 10/30/2018 cs262a-F18 Lecture-19 13 10/30/2018 cs262a-F18 Lecture-19 14 Example (n=4, m=1) Example (n=4, m=1) • Algorithm OM(1): L3 is a traitor. • Algorithm OM(1): Commander is a traitor. • L1 and L2 both receive v,v,x. (IC1 is met.) • All lieutenants receive x,y,z. (IC1 is met). • IC2 is met because L1 and L2 obeys C • IC2 is irrelevant since commander is a traitor. 10/30/2018 cs262a-F18 Lecture-19 15 10/30/2018 cs262a-F18 Lecture-19 16

5. Expensive Communication Solution II: Signed messages • OM(m) invokes n-1 OM(m-1) • Previous algorithm allows a traitor to lie about the • OM(m-1) invokes n-2 OM(m-2) commander’s orders (command). We prevent that with signatures to simplify the problem. • OM(m-2) invokes n-3 OM(m-3) • By simplifying the problem, we can cope with any • … number of traitors as long as their maximum number • OM(m-k) will be called (n-1)…(n-k) times (m) is known. • O(nm) – Expensive! • Additional Assumption A4: – A loyal general’s signature cannot be forged. – Anyone can verify authenticity of general’s signature. • Use a function choice(…) to obtain a single order – choice(V) = v if v if the only elem. in V – choice(V) = RETREAT if V is empty 10/30/2018 cs262a-F18 Lecture-19 17 10/30/2018 cs262a-F18 Lecture-19 18 Signed Messages (Cont) Algorithm’s Intuition • Each lieutenant maintains a set V of properly signed orders received so far. • All loyal lieutenants compute the same set of V • The commander sends a signed order to lieutenants eventually, thus choice(V) is the same (IC1) • A lieutenant receives an order from someone (either from • If the commander is loyal, the algorithm works commander or other lieutenants), because all loyal lieutenants will have the properly signed orders by round 1 (IC2) – Verifies authenticity and puts it in V. – If there are less than m distinct signatures on the order • What if the commander is not loyal? » Augments orders with signature » Relays messages to lieutenants who have not seen the order. • When lieutenant receives no new messages, and use V = “attack, retreat” => Commander is a traitor. choice(V) as the desired action. • If you want to protect against more traitors, increase m 10/30/2018 cs262a-F18 Lecture-19 19 10/30/2018 cs262a-F18 Lecture-19 20

6. Missing Communication Paths Is this a good paper? • What if not all generals can reach all other generals • What were the authors’ goals? directly? • What about the evaluation/metrics? • P-regular graph – Each node has p regular neighbors. • Did they convince you that this was a good • 3m-regular graph has minimum of 3m+1 nodes system/approach? • Paper shows algorithm for variant of oral message • Were there any red-flags? algorithm – OM(m,p). Essentially same algorithm except • What mistakes did they make? that each lieutenant forwards orders to neighbors. • Does the system/approach meet the “Test of Time” • Proofs that OM(m,3m) solves BGP for at most m traitors. challenge? • I.e. if the communication graph is 3m-regular, and there • How would you review this paper today? are at most m traitors, the problem can still be solved. 10/30/2018 cs262a-F18 Lecture-19 21 10/30/2018 cs262a-F18 Lecture-19 22 Bad Assumption: Benign Faults • Traditional replication assumes: – replicas fail by stopping or omitting steps • Invalid with malicious attacks: – compromised replica may behave arbitrarily – single fault may compromise service – decreased resiliency to malicious attacks client BREAK server attacker replaces replicas replica’s code 10/30/2018 cs262a-F18 Lecture-19 23 10/30/2018 cs262a-F18 Lecture-19 24

7. BFT Tolerates Byzantine Faults Byzantine-Faulty Clients • Bad assumption: client faults are benign • Byzantine fault tolerance: – clients easier to compromise than replicas – no assumptions about faulty behavior • BFT tolerates Byzantine-faulty clients: • Tolerates successful attacks – access control – service available when hacker controls replicas – narrow interfaces attacker replaces – enforce invariants client client’s code server server attacker replaces replicas replicas replica’s code • Support for complex service operations is important 10/30/2018 cs262a-F18 Lecture-19 25 10/30/2018 cs262a-F18 Lecture-19 26 Bad Assumption: Synchrony Asynchrony • Synchrony  known bounds on: • No bounds on delays – delays between steps • Problem: replication is impossible – message delays • Invalid with denial-of-service attacks: – bad replies due to increased delays Solution in BFT: • Assumed by most Byzantine fault tolerance • provide safety without synchrony – guarantees no bad replies • assume eventual time bounds for liveness – may not reply with active denial-of-service attack – will reply when denial-of-service attack ends 10/30/2018 cs262a-F18 Lecture-19 27 10/30/2018 cs262a-F18 Lecture-19 28

8. clients Algorithm Properties Algorithm • Arbitrary replicated service State machine replication: – complex operations – deterministic replicas start in same state – mutable shared state – replicas execute same requests in same order replicas – correct replicas produce identical replies • Properties (safety and liveness): – system behaves as correct centralized service – clients eventually receive replies to requests client replicas • Assumptions: – 3f+1 replicas to tolerate f Byzantine faults (optimal) – strong cryptography – only for liveness: eventual time bounds • Hard: ensure requests execute in same order 10/30/2018 cs262a-F18 Lecture-19 29 10/30/2018 cs262a-F18 Lecture-19 30 Ordering Requests Rough Overview of Algorithm Primary-Backup: • A client sends a request for a service to the • View designates the primary replica primary client replicas primary backups view • Primary picks ordering • Backups ensure primary behaves correctly – certify correct ordering – trigger view changes to replace faulty primary client replicas primary backups 10/30/2018 cs262a-F18 Lecture-19 31 10/30/2018 cs262a-F18 Lecture-19 32

9. Rough Overview of Algorithm Rough Overview of Algorithm • A client sends a request for a service to the • A client sends a request for a service to the primary primary • The primary mulicasts the request to the backups • The primary mulicasts the request to the backups • Replicas execute request and sent a reply to the client client replicas client replicas primary backups primary backups 10/30/2018 cs262a-F18 Lecture-19 33 10/30/2018 cs262a-F18 Lecture-19 34 Rough Overview of Algorithm Quorums and Certificates • A client sends a request for a service to the primary quorums have at least 2f+1 replicas • The primary mulicasts the request to the backups • Replicas execute request and sent a reply to the quorum A quorum B client • The client waits for f+1 replies from different 3f+1 replicas replicas with the same result; this is the result of quorums intersect in at least one correct replica the operation f+1 matching • Certificate = set with messages from a quorum replies client primary backups replicas • Algorithm steps are justified by certificates 10/30/2018 cs262a-F18 Lecture-19 view 35 10/30/2018 cs262a-F18 Lecture-19 36

10. Algorithm Components Normal Case Operation • Normal case operation • Three phase algorithm: – pre-prepare picks order of requests • View changes – prepare ensures order within views • Garbage collection – commit ensures order across views • Recovery • Replicas remember messages in log • Messages are authenticated – •(k) denotes a message sent by k All have to be designed to work together 10/30/2018 cs262a-F18 Lecture-19 37 10/30/2018 cs262a-F18 Lecture-19 38 Pre-prepare Phase Prepare Phase digest of m multicast <PREPARE,v,n,D(m),1> assign sequence number n to request m in view v (1) m request : m pre-prepare prepare multicast <PRE-PREPARE,v,n,m>(0) replica 0 primary = replica 0 replica 1 replica 1 replica 2 replica 2 replica 3 fail fail replica 3 accepted PRE-PREPARE,v,n,m 0 backups accept pre-prepare if: all collect pre-prepare and 2f matching • in view v prepares • never accepted pre-prepare for v,n with different request P-certificate(m,v,n) 10/30/2018 cs262a-F18 Lecture-19 39 10/30/2018 cs262a-F18 Lecture-19 40

11. Order Within View Commit Phase multicast <COMMIT,v,n,D(m),2>(2) replies m No P-certificates with the same view and pre-prepare prepare commit sequence number and different requests replica 0 replica 1 If it were false: replica 2 replicas fail replica 3 replica has quorum for quorum for P-certificate(m,v,n) P-certificate(m,v,n) P-certificate(m’,v,n) all collect 2f+1 matching commits Request m executed after: C-certificate(m,v,n) one correct replica in common  m = m’ • having C-certificate(m,v,n) 10/30/2018 cs262a-F18 Lecture-19 41 • executing requests with 10/30/2018 sequence number less than n cs262a-F18 Lecture-19 42 View Changes View Change Protocol • Provide liveness when primary fails: – timeouts trigger view changes send P-certificates: <VIEW-CHANGE,v+1,P,2>(2) – select new primary ( view number mod 3f+1) fail replica 0 = primary 2f VC messages v • But also need to: replica 1= primary v+1 – preserve safety replica 2 – ensure replicas are in the same view long enough – prevent denial-of-service attacks replica 3 primary collects VC-messages in X: <NEW-VIEW,v+1,X,O> (1) pre-prepares messages for v+1 view in O with the same sequence number backups multicast prepare messages for pre-prepares in O 10/30/2018 cs262a-F18 Lecture-19 43 10/30/2018 cs262a-F18 Lecture-19 44

12. View Change Safety BFS: A Byzantine-Fault-Tolerant NFS replica 0 Goal: No C-certificates with the same snfsd replication sequence number and different requests library replication andrew library kernel VM • Intuition: if replica has C-certificate(m,v,n) then benchmark relay quorum for snfsd any quorum Q kernel NFS client C-certificate(m,v,n) replication library kernel VM replica n correct replica in Q has P-certificate(m,v,n) No synchronous writes – stability through replication 10/30/2018 cs262a-F18 Lecture-19 45 10/30/2018 cs262a-F18 Lecture-19 46 Andrew Benchmark BFS is Practical 70 70 Elapsed time (seconds) Elapsed time (seconds) 60 60 50 Configuration 50 Configuration 40 • 1 client, 4 replicas 40 • 1 client, 4 replicas 30 • Alpha 21064, 133 MHz 30 • Alpha 21064, 133 MHz 20 • Ethernet 10 Mbit/s 20 • Ethernet 10 Mbit/s • Andrew benchmark 10 10 0 0 BFS BFS-nr BFS NFS • BFS-nr is exactly like BFS but without replication • NFS is the Digital Unix NFS V2 implementation • 30 times worse with digital signatures 10/30/2018 cs262a-F18 Lecture-19 47 10/30/2018 cs262a-F18 Lecture-19 48

13. Is this a good paper? Closing Thoughts • What were the authors’ goals? • Attested Append Only Memory (A2M) • What about the evaluation/metrics? – Shows how secure hardware can reduce requirements for Byzantine Agreement to 2f+1 from 3f+1 • Did they convince you that this was a good • BlockChain mining system/approach? – Another solution to Byzantine Agreement? • Were there any red-flags? – Yes! • What mistakes did they make? • Does the system/approach meet the “Test of Time” challenge? • How would you review this paper today? 10/30/2018 cs262a-F18 Lecture-19 49 10/30/2018 cs262a-F18 Lecture-19 50