Lock Granularity and Consistency

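This chapter introduces lock granularity and consistency levels. It defines lock granularity and notes that a database engine usually has to acquire locks at multiple levels of granularity to fully protect a resource. It also introduces the two-phase commit protocol, which splits a distributed transaction into two phases, a prepare phase and a commit phase, both of which are initiated by the transaction manager.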

1.Lock Granularity and Consistency Levels (Lecture 7 , cs262a) Ali Ghodsi and Ion Stoica, UC Berkeley February 7, 2018

3.The ACID properties of Transactions
  Atomicity: all actions in the transaction happen, or none happen
  Consistency: if each transaction is consistent, and the database starts consistent, it ends up consistent, e.g.,
    Balance cannot be negative
    Cannot reschedule meeting on February 30
  Isolation: execution of one transaction is isolated from others
  Durability: if a transaction commits, its effects persist

4.Example: Transaction 101
  Transfer $100 from Alice's account to Bob's account

  BEGIN; --BEGIN TRANSACTION
  UPDATE accounts SET balance = balance - 100.00 WHERE name = 'Alice';
  UPDATE branches SET balance = balance - 100.00
    WHERE name = (SELECT branch_name FROM accounts WHERE name = 'Alice');
  UPDATE accounts SET balance = balance + 100.00 WHERE name = 'Bob';
  UPDATE branches SET balance = balance + 100.00
    WHERE name = (SELECT branch_name FROM accounts WHERE name = 'Bob');
  COMMIT; --COMMIT WORK
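A minimal sketch of the same transfer run as one atomic unit, using Python's built-in sqlite3 module. The schema, the starting balances, the branch names, and the helpers `make_db`, `transfer`, and `balance` are assumptions made for illustration; the slide only gives the UPDATE statements. The non-negative balance check stands in for the consistency rule mentioned with the ACID properties.

```python
import sqlite3

def make_db():
    # In-memory database with a hypothetical schema matching the slide's table names.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE branches (name TEXT PRIMARY KEY, balance REAL);
        CREATE TABLE accounts (name TEXT PRIMARY KEY, branch_name TEXT, balance REAL);
        INSERT INTO branches VALUES ('North', 1000.0), ('South', 1000.0);
        INSERT INTO accounts VALUES ('Alice', 'North', 500.0), ('Bob', 'South', 500.0);
    """)
    conn.commit()
    return conn

def transfer(conn, src, dst, amount):
    # "with conn" commits if the block finishes and rolls back if it raises,
    # so either all four updates take effect or none of them do.
    with conn:
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                     (amount, src))
        conn.execute("UPDATE branches SET balance = balance - ? WHERE name = "
                     "(SELECT branch_name FROM accounts WHERE name = ?)",
                     (amount, src))
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                     (amount, dst))
        conn.execute("UPDATE branches SET balance = balance + ? WHERE name = "
                     "(SELECT branch_name FROM accounts WHERE name = ?)",
                     (amount, dst))
        row = conn.execute("SELECT balance FROM accounts WHERE name = ?",
                           (src,)).fetchone()
        if row is None or row[0] < 0:
            # Raising here triggers the rollback of every update above.
            raise ValueError("transfer would leave a negative balance")

def balance(conn, table, name):
    # table is one of the two fixed table names above, never user input.
    query = f"SELECT balance FROM {table} WHERE name = ?"
    return conn.execute(query, (name,)).fetchone()[0]
```

Calling `transfer(conn, "Alice", "Bob", 100.0)` moves the money in both tables; a transfer larger than Alice's balance raises and leaves every row as it was.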

5.Why is it Hard?
  Failures: might leave state inconsistent or cause updates to be lost
    Remember last lecture?
  Concurrency: might leave state inconsistent or cause updates to be lost
    This lecture and the next one!

6.Concurrency
  When operations of concurrent threads are interleaved, the effect on shared state can be unexpected
  Well-known issue in operating systems and thread programming
    Critical sections in OSes
    Java's synchronized keyword

7.Transaction Scheduling
  Why not run only one transaction at a time?
    Answer: low system utilization
    Two transactions cannot run simultaneously even if they access different data
  Goal of transaction scheduling:
    Maximize system utilization, i.e., concurrency
      Interleave operations from different transactions
    Preserve transaction semantics
      Logically, all operations in a transaction are executed atomically
      Intermediate state of a transaction is not visible to other transactions

8.Anomalies with Interleaved Execution
  May violate transaction semantics, e.g., some data read by the transaction changes before committing
  Inconsistent database state, e.g., some updates are lost
  Anomalies always involve a "write"; why?

9.P0 – Overwriting uncommitted data
  Write-write conflict
    T2 writes a value modified by T1 before T1 commits, e.g., T2 overwrites T1's W(A) before T1 commits
  Violates transaction serializability
  If transactions were serial, you'd get either:
    T1's updates of A and B
    T2's updates of A and B
  T1: W(A), W(B)
  T2: W(A), W(B)

10.P1 – Reading uncommitted data (dirty read)
  Write-read conflict
    T2 reads a value modified by T1 before T1 commits, e.g., T2 reads A after T1 writes it but before T1 commits
  T1: R(A), W(A)
  T2: R(A), …

11.P3 – Non-repeatable reads
  Read-write conflict
    T2 reads a value, after which T1 modifies it, e.g., T2 reads A, after which T1 modifies it
  Example: Mary and John want to buy a TV set on Amazon but there is only one left in stock
    (T1) John logs in first, but waits…
    (T2) Mary logs in second and buys the TV set right away
    (T1) John decides to buy, but it is too late…
  T1: R(A), W(A)
  T2: R(A), R(A), W(A)
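The TV example can be replayed as a fixed interleaving to make the anomaly concrete. This is a minimal sketch: the `run_schedule` helper, the tuple encoding of steps, and the `tv_stock` key are hypothetical names chosen for illustration, and the interleaving follows the John and Mary narrative rather than any particular database engine.

```python
def run_schedule(steps, db):
    """Apply steps in order against the shared dict db.

    Each step is (txn, "R", key) or (txn, "W", key, new_value).
    Returns a dict mapping each transaction to the list of values it read.
    """
    reads = {}
    for step in steps:
        txn, op, key = step[:3]
        if op == "R":
            reads.setdefault(txn, []).append(db[key])
        elif op == "W":
            db[key] = step[3]
        else:
            raise ValueError(f"unknown operation {op!r}")
    return reads

db = {"tv_stock": 1}
schedule = [
    ("T1", "R", "tv_stock"),     # John looks: one TV left
    ("T2", "R", "tv_stock"),     # Mary looks: one TV left
    ("T2", "W", "tv_stock", 0),  # Mary buys it
    ("T1", "R", "tv_stock"),     # John looks again inside the same transaction
]
reads = run_schedule(schedule, db)
print(reads["T1"])  # [1, 0]
```

John's transaction read the same item twice and saw two different values, which could not happen if the two transactions had run one after the other.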

12.Goals of Transaction Scheduling
  Maximize system utilization, i.e., concurrency
    Interleave operations from different transactions
  Preserve transaction semantics
    Semantically equivalent to a serial schedule, i.e., one transaction runs at a time
  T1: R, W, R, W
  T2: R, W, R, R, W
  Serial schedule (T1, then T2): R, W, R, W, R, W, R, R, W
  Serial schedule (T2, then T1): R, W, R, R, W, R, W, R, W

13.Two Key Questions
  Is a given schedule equivalent to a serial execution of transactions?
  How do you come up with a schedule equivalent to a serial schedule?
  Schedule: R, R, W, W, R, R, R, W, W
  Serial schedule (T1, then T2): R, W, R, W, R, W, R, R, W
  Serial schedule (T2, then T1): R, W, R, R, W, R, W, R, W

14.Transaction Scheduling
  Serial schedule: a schedule that does not interleave the operations of different transactions
    Transactions run serially (one at a time)
  Equivalent schedules: for any storage/database state, the effect (on storage/database) and output of executing the first schedule are identical to the effect and output of executing the second schedule
  Serializable schedule: a schedule that is equivalent to some serial execution of the transactions
    Intuitively: with a serializable schedule you only see things that could happen in situations where you were running transactions one at a time

15.Conflict Serializable Schedules
  Two operations conflict if they
    Belong to different transactions
    Are on the same data
    At least one of them is a write
  Two schedules are conflict equivalent iff:
    They involve the same operations of the same transactions
    Every pair of conflicting operations is ordered the same way
  Schedule S is conflict serializable if S is conflict equivalent to some serial schedule
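The two definitions above translate almost directly into code. The sketch below is illustrative only: the `Op` tuple and the function names are hypothetical, and it assumes each (transaction, kind, item) operation appears at most once per schedule, so that operations can be matched across schedules by value.

```python
from typing import NamedTuple

class Op(NamedTuple):
    txn: str   # transaction name, e.g. "T1"
    kind: str  # "R" for read, "W" for write
    item: str  # data item, e.g. "A"

def conflicts(a, b):
    # Different transactions, same data item, at least one write.
    return a.txn != b.txn and a.item == b.item and "W" in (a.kind, b.kind)

def conflict_equivalent(s1, s2):
    # Same operations of the same transactions...
    if sorted(s1) != sorted(s2):
        return False
    # ...and every conflicting pair appears in the same relative order.
    position_in_s2 = {op: i for i, op in enumerate(s2)}
    for i, a in enumerate(s1):
        for b in s1[i + 1:]:
            if conflicts(a, b) and position_in_s2[a] > position_in_s2[b]:
                return False
    return True

def ops(txn, text):
    # Helper for writing schedules compactly: ops("T1", "RA WA") -> two Ops.
    return [Op(txn, token[0], token[1:]) for token in text.split()]

interleaved = (ops("T1", "RA WA") + ops("T2", "RA WA")
               + ops("T1", "RB WB") + ops("T2", "RB WB"))
serial_t1_t2 = ops("T1", "RA WA RB WB") + ops("T2", "RA WA RB WB")
serial_t2_t1 = ops("T2", "RA WA RB WB") + ops("T1", "RA WA RB WB")

print(conflict_equivalent(interleaved, serial_t1_t2))  # True
print(conflict_equivalent(interleaved, serial_t2_t1))  # False
```

The interleaved schedule is conflict equivalent to running T1 and then T2, so it is conflict serializable.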

16.Conflict Equivalence – Intuition
  If you can transform an interleaved schedule into a serial schedule by swapping consecutive non-conflicting operations of different transactions, then the original schedule is conflict serializable, e.g.,
    T1: R(A), W(A), R(B), W(B)
    T2: R(A), W(A), R(B), W(B)
  (Diagram: three successive swap steps of this schedule)

17.Conflict Equivalence – Intuition
  If you can transform an interleaved schedule into a serial schedule by swapping consecutive non-conflicting operations of different transactions, then the original schedule is conflict serializable, e.g.,
    T1: R(A), W(A), R(B), W(B)
    T2: R(A), W(A), R(B), W(B)
  (Diagram: further swap steps of this schedule, ending with all of T1's operations ahead of T2's)

18.Conflict Equivalence – Intuition
  If you can transform an interleaved schedule into a serial schedule by swapping consecutive non-conflicting operations of different transactions, then the original schedule is conflict serializable
  Is this schedule serializable?
    T1: R(A), W(A)
    T2: R(A), W(A)

19.Dependency Graph
  Dependency graph:
    Transactions are represented as nodes
    Edge from Ti to Tj:
      an operation of Ti conflicts with an operation of Tj
      Ti's operation appears earlier than Tj's in the schedule
  Theorem: a schedule is conflict serializable if and only if its dependency graph is acyclic
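The theorem gives a direct test for conflict serializability: build the graph, then look for a cycle. The following is a sketch under assumptions: schedules are encoded as lists of (transaction, kind, item) tuples, and the function names are hypothetical. The two sample schedules are one reading of the "no cycle" and "cycle" examples in this lecture.

```python
from collections import defaultdict

def dependency_graph(schedule):
    """Return {txn: set of txns it must precede} for a list of (txn, kind, item)."""
    graph = defaultdict(set)
    for i, (ti, kind_i, item_i) in enumerate(schedule):
        graph[ti]  # touch the key so every transaction becomes a node
        for tj, kind_j, item_j in schedule[i + 1:]:
            if ti != tj and item_i == item_j and "W" in (kind_i, kind_j):
                graph[ti].add(tj)  # Ti's conflicting operation comes first
    return dict(graph)

def has_cycle(graph):
    # Depth-first search with three colors; reaching a gray node means a back edge.
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {node: WHITE for node in graph}

    def visit(node):
        color[node] = GRAY
        for succ in graph[node]:
            if color[succ] == GRAY:
                return True
            if color[succ] == WHITE and visit(succ):
                return True
        color[node] = BLACK
        return False

    return any(color[node] == WHITE and visit(node) for node in graph)

def is_conflict_serializable(schedule):
    return not has_cycle(dependency_graph(schedule))

# T1 touches A and B before T2 does: edges only go T1 -> T2.
no_cycle = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A"),
            ("T1", "R", "B"), ("T1", "W", "B"), ("T2", "R", "B"), ("T2", "W", "B")]
# T1 goes first on A but T2 goes first on B: T1 -> T2 and T2 -> T1.
cycle = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A"),
         ("T2", "R", "B"), ("T2", "W", "B"), ("T1", "R", "B"), ("T1", "W", "B")]

print(is_conflict_serializable(no_cycle))  # True
print(is_conflict_serializable(cycle))     # False
```

The pairwise scan is quadratic in the schedule length, which is fine for illustration; it is not meant to reflect how a real scheduler tracks dependencies.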

20.Example
  Conflict serializable schedule:
    T1: R(A), W(A),             R(B), W(B)
    T2:             R(A), W(A),             R(B), W(B)
  Dependency graph: T1 → T2 (on A), T1 → T2 (on B)
  No cycle!

21.Example
  Schedule that is not conflict serializable:
    T1: R(A), W(A),                         R(B), W(B)
    T2:             R(A), W(A), R(B), W(B)
  Dependency graph: T1 → T2 (on A), T2 → T1 (on B)
  Cycle: the output of T1 depends on T2, and vice versa

22.Notes on Conflict Serializability
  Conflict serializability doesn't allow all schedules that you would consider correct
    This is because it is strictly syntactic: it doesn't consider the meanings of the operations or the data
  Many times, conflict serializability is what gets used, because it can be done efficiently
    See isolation degrees/levels next
  Two-phase locking (2PL) is how we implement it

23.Serializability ≠ Conflict Serializability
  The following schedule is not conflict serializable:
    T1: R(A),       W(A)
    T2:       W(A)
    T3:                   W(A)
  Dependency graph (all edges on A): T1 → T2, T2 → T1, T1 → T3, T2 → T3
  However, the schedule is serializable, since its output is equivalent to that of the following serial schedule:
    T1: R(A), W(A)
    T2:             W(A)
    T3:                   W(A)
  Note: deciding whether a schedule is serializable (not conflict serializable) is NP-complete
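A brute-force check makes the gap between the two notions concrete. This sketch uses view equivalence as the meaning of "equivalent" (every read sees the same writer, and each item has the same final writer); that choice, the function names, and the exact interleaving T1:R(A), T2:W(A), T1:W(A), T3:W(A) are assumptions, since the column alignment of the original slide is not preserved. Trying every serial order takes factorial time, in line with the NP-completeness note.

```python
from itertools import permutations

def view_signature(schedule):
    """Summarize what a schedule of (txn, kind, item) tuples lets anyone observe."""
    last_writer = {}
    reads_from = []
    for txn, kind, item in schedule:
        if kind == "R":
            # None means the read sees the initial value of the item.
            reads_from.append((txn, item, last_writer.get(item)))
        else:
            last_writer[item] = txn
    return sorted(reads_from, key=repr), last_writer

def equivalent_serial_orders(schedule):
    """Return every serial order of the transactions that is view equivalent."""
    txns = list(dict.fromkeys(txn for txn, _, _ in schedule))
    target = view_signature(schedule)
    matches = []
    for order in permutations(txns):
        serial = [op for txn in order for op in schedule if op[0] == txn]
        if view_signature(serial) == target:
            matches.append(order)
    return matches

blind_writes = [("T1", "R", "A"), ("T2", "W", "A"), ("T1", "W", "A"), ("T3", "W", "A")]
print(equivalent_serial_orders(blind_writes))  # [('T1', 'T2', 'T3')]
```

T3's final write hides the order of the earlier writes, so the schedule behaves like T1, T2, T3 even though T1 and T2 conflict in both directions.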

24.Locks
  "Locks" control access to data
  Two types of locks:
    shared (S) lock: multiple concurrent transactions allowed to operate on data
    exclusive (X) lock: only one transaction can operate on data at a time
  Lock Compatibility Matrix

    Held \ Request   S       X
    S                Yes     Block
    X                Block   Block
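The compatibility matrix is small enough to encode as a lookup table. A minimal sketch; `COMPATIBLE` and `can_grant` are hypothetical names, and the function only answers whether a request could be granted right now, without modeling queues or waiting.

```python
# (held mode, requested mode) -> True if the request can be granted immediately.
COMPATIBLE = {
    ("S", "S"): True,
    ("S", "X"): False,
    ("X", "S"): False,
    ("X", "X"): False,
}

def can_grant(held_modes, requested):
    """held_modes: modes other transactions currently hold on the same item."""
    return all(COMPATIBLE[(held, requested)] for held in held_modes)

print(can_grant([], "X"))          # True: nothing is held
print(can_grant(["S", "S"], "S"))  # True: readers share
print(can_grant(["S"], "X"))       # False: a writer must wait for readers
print(can_grant(["X"], "S"))       # False: readers must wait for the writer
```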

25.Two-Phase Locking (2PL)
  1) Each transaction must obtain:
    S (shared) or X (exclusive) lock on data before reading,
    X (exclusive) lock on data before writing
  2) A transaction cannot request additional locks once it releases any lock
  Thus, each transaction has a "growing phase" followed by a "shrinking phase"
  (Figure labels: Growing Phase, Lock Point, Shrinking Phase)
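The two rules can be enforced per transaction with a single flag that flips on the first unlock. The class below is a sketch, not a lock manager: `TwoPhaseTxn` and its methods are hypothetical, and it checks only one transaction's own discipline, not blocking between transactions.

```python
class TwoPhaseTxn:
    """Tracks one transaction's locks and rejects anything that breaks 2PL."""

    def __init__(self, name):
        self.name = name
        self.held = {}          # item -> "S" or "X"
        self.shrinking = False  # becomes True at the first unlock

    def lock(self, item, mode):
        if mode not in ("S", "X"):
            raise ValueError(f"unknown lock mode {mode!r}")
        if self.shrinking:
            # Rule 2: no new locks after any lock has been released.
            raise RuntimeError(f"{self.name}: cannot lock {item} after an unlock")
        if mode == "X" or item not in self.held:
            self.held[item] = mode  # never downgrade an X lock to S

    def unlock(self, item):
        del self.held[item]
        self.shrinking = True

    def read(self, db, item):
        # Rule 1: reading needs an S or X lock.
        if item not in self.held:
            raise RuntimeError(f"{self.name}: read of {item} without a lock")
        return db[item]

    def write(self, db, item, value):
        # Rule 1: writing needs an X lock.
        if self.held.get(item) != "X":
            raise RuntimeError(f"{self.name}: write of {item} without an X lock")
        db[item] = value
```

A transaction that locks A, works on it, locks B, and only then starts unlocking passes; one that unlocks A and later asks for B raises at the second `lock` call.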

26.Two-Phase Locking (2PL)
  2PL guarantees conflict serializability
  Doesn't allow dependency cycles. Why?
  Answer: a dependency cycle leads to deadlock
    Assume there is a cycle between Ti and Tj
    Edge from Ti to Tj: Ti acquires lock first and Tj needs to wait
    Edge from Tj to Ti: Tj acquires lock first and Ti needs to wait
    Thus, both Ti and Tj wait for each other
    Since with 2PL neither Ti nor Tj releases locks before acquiring all the locks it needs → deadlock
  Schedule of conflicting transactions is conflict equivalent to a serial schedule ordered by "lock point"

27.Example
  T1 transfers $50 from account A to account B
    T1: Read(A), A := A-50, Write(A), Read(B), B := B+50, Write(B)
  T2 outputs the total of accounts A and B
    T2: Read(A), Read(B), PRINT(A+B)
  Initially, A = $1000 and B = $2000
  What are the possible output values?
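One way to answer the question is to enumerate every interleaving of the two transactions when no locking is used at all. This is a sketch under assumptions: each transaction is written as a generator that pauses after every read, write, or print of shared state, and `t1`, `t2`, and `possible_outputs` are hypothetical names.

```python
from itertools import combinations

def t1(db):
    a = db["A"]; yield        # Read(A)
    db["A"] = a - 50; yield   # A := A-50, Write(A)
    b = db["B"]; yield        # Read(B)
    db["B"] = b + 50; yield   # B := B+50, Write(B)

def t2(db, out):
    a = db["A"]; yield        # Read(A)
    b = db["B"]; yield        # Read(B)
    out.append(a + b); yield  # PRINT(A+B)

def possible_outputs():
    steps_t1, steps_t2 = 4, 3
    total = steps_t1 + steps_t2
    results = set()
    # Choose which time slots belong to T1; the remaining slots go to T2.
    for t1_slots in combinations(range(total), steps_t1):
        db = {"A": 1000, "B": 2000}
        out = []
        g1, g2 = t1(db), t2(db, out)
        for slot in range(total):
            next(g1 if slot in t1_slots else g2)
        results.add(out[0])
    return results

print(sorted(possible_outputs()))  # [2950, 3000, 3050]
```

Both serial orders print 3000. The values 2950 and 3050 appear only when T2 reads one account before T1's write and the other after it, which are exactly the interleavings that locking is meant to rule out.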

28.Is this a 2PL Schedule?

      T1                       T2
   1  Lock_X(A) <granted>
   2  Read(A)                  Lock_S(A)
   3  A := A-50
   4  Write(A)
   5  Unlock(A)                <granted>
   6                           Read(A)
   7                           Unlock(A)
   8                           Lock_S(B) <granted>
   9  Lock_X(B)
  10                           Read(B)
  11  <granted>                Unlock(B)
  12                           PRINT(A+B)
  13  Read(B)
  14  B := B+50
  15  Write(B)
  16  Unlock(B)

  No, and it is not serializable

29.Is this a 2PL Schedule?

      T1                       T2
   1  Lock_X(A) <granted>
   2  Read(A)                  Lock_S(A)
   3  A := A-50
   4  Write(A)
   5  Lock_X(B) <granted>
   6  Unlock(A)                <granted>
   7                           Read(A)
   8                           Lock_S(B)
   9  Read(B)
  10  B := B+50
  11  Write(B)
  12  Unlock(B)                <granted>
  13                           Unlock(A)
  14                           Read(B)
  15                           Unlock(B)
  16                           PRINT(A+B)

  Yes, it is serializable
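Whether a trace follows 2PL can be checked mechanically: no transaction may request a lock after it has released one. The sketch below is illustrative. `is_two_phase` is a hypothetical helper, and the two traces list only the lock and unlock events of slides 28 and 29 in step order, with each event assigned to T1 or T2 according to one reading of the flattened two-column layout.

```python
def is_two_phase(trace):
    """trace: list of (txn, action, item), action in {"lock_s", "lock_x", "unlock"}."""
    released = set()  # transactions that have entered their shrinking phase
    for txn, action, item in trace:
        if action == "unlock":
            released.add(txn)
        elif action in ("lock_s", "lock_x"):
            if txn in released:
                return False  # a lock request after an unlock breaks 2PL
        else:
            raise ValueError(f"unknown action {action!r}")
    return True

slide_28 = [
    ("T1", "lock_x", "A"), ("T2", "lock_s", "A"), ("T1", "unlock", "A"),
    ("T2", "unlock", "A"), ("T2", "lock_s", "B"), ("T1", "lock_x", "B"),
    ("T2", "unlock", "B"), ("T1", "unlock", "B"),
]
slide_29 = [
    ("T1", "lock_x", "A"), ("T2", "lock_s", "A"), ("T1", "lock_x", "B"),
    ("T1", "unlock", "A"), ("T2", "lock_s", "B"), ("T1", "unlock", "B"),
    ("T2", "unlock", "A"), ("T2", "unlock", "B"),
]

print(is_two_phase(slide_28))  # False: both transactions lock B after unlocking A
print(is_two_phase(slide_29))  # True: every lock request precedes the first unlock
```

The check looks at request order only; it does not verify that grants respect the compatibility matrix.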