一致性与扩展性

在大数据时代,分布式是不可避免的,然而根据CAP理论,如何在可用性保证基础上,获得一致性呢?本文从源头解释水平扩展必备的前提下,一致性问题解决的基本方法,从单机的数据存储到分布式的两阶段提交,用很生动的案例来对这些问题的来源进行阐述,比如锁机制,故障恢复等,并提出现有的解决办法。
展开查看详情

1.Consistency and Scalability Noah Mendelsohn Tufts University Email: noah@cs.tufts.edu Web: http://www.cs.tufts.edu/~noah COMP 150-IDS: Internet Scale Distributed Systems (Spring 2017)

2.2 What you should get from today’s session You will explore challenges relating to maintaining data consistency in a computing system You will learn about techniques used to make storage systems more reliable You will learn about transactions and their implementation using logs You will learn about the CAP theorem and why scaling and consistency tend not to come together

3.A note about scope The challenges & principles we cover today reappear at every level of system design CPU Instruction set and memory Parallel programming languages Single machine databases Distributed applications and databases Today we will focus mainly on larger scale systems 3

4.4 Why Worry About Consistency?

5.Duplicate information in computing systems Why complicated things? Mirrored disks for reliability Parallel processing  higher throughput Geographic distribution reduces network delay (one each in Europe, Asia, US) Higher availability  if network crashes, each “partition” may still have a copy Inter-dependent data Bank account records have total for each account Bank record keeps total for all accounts Memory Hierarchies CPU Caches, file system caches, Web proxies, etc. 5 If we allow updates, then maintaining consistency is tricky

6.6 Simple Examples: Parallel Disk Systems

7.Mirrored disks 7 Logical disk Mirrored Implementation X X X Everything written twice Better performance on reads (slower on writes)

8.Duplicate data and crash recovery 8 Logical disk Mirrored Implementation X X X After a crash, data survives Crash!

9.Mirrored disks 9 Logical disk Mirrored Implementation X X X Replacement drive can be reconstructed in the background

10.Unix Kernel REVIEW: How is the disk used in Unix / Linux? Sector Application Access by cylinder/track/sector Filesystem Files/Dirs security, etc Buffered block r/w: hides timing Sector In-memory Block Cache Block Device Driver Direct read/write of filesystem “blocks” (hides sector size and device geometry) Raw Device Driver

11.Unix Kernel We can use mirrored disks with Unix Application Filesystem Files/Dirs security, etc Buffered block r/w: hides timing Sector In-memory Block Cache Block Device Driver MIRRORED Device Driver Mirrored Implementation Abstraction: The mirrored disk provides the same service as a single disk…just faster and more reliable!

12.Atomicity and update synchronziation 12 Logical disk Mirrored Implementation X X X Mirrored writes DO NOT happen at quite the same time Question: when is the update committed?

13.Logical disk RAID – Reliable Arrays of Inexpensive Disks 13 X X X X RAID Implementation

14.RAID – Reliable Arrays of Inexpensive Disks 14 RAID Implementation Y X X Y X X XOR(X,Y) Logical disk

15.RAID – Reliable Arrays of Inexpensive Disks 15 RAID Implementation Y X X Y X XOR(X,Y,Z) Z Z Much less space overhead than mirroring…but typically slower Logical disk

16.RAID – Reliable Arrays of Inexpensive Disks 16 RAID Implementation Y X X Y X XOR(X,Y,Z) Z Z Crash! If any disk is lost…you can reconstruct from information on the others! Logical disk

17.17 Why Consistency is Hard

18.Synchronization problem 18 NA =Access Noah’s Bank account Bal = NA.Balance ; NewBalance = Bal + $1000 NA.Balance.Write NewBalance Some code to add money to my account NA =Access Noah’s Bank account Bal = NA.Balance ; NewBalance = Bal + $1000 NA.Balance.Write NewBalance Some code to add money to my account Let’s run code for two deposits in parallel Can you see the problem? There’s a risk that both copies will pick up X before either updates. If that happens, I only get $1000 not $2000! 

19.Solution - locking 19 Lock Noah’s Bank Account NA =Access Noah’s Bank account Bal = NA.Balance ; NewBalance = Bal + $1000 NA.Balance.Write NewBalance Unlock Noah’s Bank Account Some code to add money to my account Now the two copies can’t run at once on the same account …but if each locks a different bank account they can. Only one transaction or thread can hold the lock at a time

20.Consistency and Crash R ecovery 20 NA =Access Noah’s Bank account YA =Access Your Bank account NBal = NA.Balance ; Ybal = YA.Balance ; Nbal += $1000; Ybal -= $1000; NA.Balance.Write Nbal YA.Balance.Write Ybal Some code to transfer money Can you see the problem? If the system crashes just after writing my balance, the bank loses $1000 (it’s still in your account too) This gets lost during crash

21.21 Transactions

22.Transactions: automated consistency & crash recovery! 22 BEGIN_TRANSACTION NA =Access Noah’s Bank account YA =Access Your Bank account NBal = NA.Balance ; Ybal = YA.Balance ; Nbal += $1000; Ybal -= $1000; NA.Balance . Write Nbal YA.Balance.Write Ybal END_TRANSACTION Some code to transfer money The system guarantees that either everything in the transaction happens, or nothing…and it guarantees more!

23.ACID Properties of a Transaction Atomicity Everything happens or nothing Consistency If the database has rules they are obeyed at transaction end (e.g. balance must be < $1,000,000) Isolation Any two parallel transactions act as if serial Most transaction systems do the locking automatically! Durability Once committed, never lost 23 That seems almost magic…how can we achieve all this?

24.How to implement transactions - logging The key idea: a shared log records information needed to undo any change made by any transaction When a transaction commits : All data is written to the main data store A commit record is written to the log. This is the atomic point at which the transaction “happens” After a crash, the log is “replayed” For any transactions that did not commit, the undo operations are performed After the crash, only commited operations have happened! When combined with transaction driven locking, we can automatically support ACID properties with almost no application code complexity This is all built into SQL databases like Oracle, Postgres , DB2, and SQL Server 24 Logging and transaction processing are two of the most important and beautiful data processing technologies

25.Logging in Action 25 BEGIN_TRANSACTION NA =Access Noah’s Bank account YA =Access Your Bank account NBal = NA.Balance ; Ybal = YA.Balance ; Nbal += $1000; Ybal -= $1000; NA.Balance.Write Nbal YA.Balance.Write Ybal END_TRANSACTION Some code to transfer money Noah.Bal = $100 Your.Bal = $1300

26.Logging in Action 26 BEGIN_TRANSACTION NA =Access Noah’s Bank account YA =Access Your Bank account NBal = NA.Balance ; Ybal = YA.Balance ; Nbal += $1000; Ybal -= $1000; NA.Balance.Write Nbal YA.Balance.Write Ybal END_TRANSACTION Some code to transfer money Noah.Bal = $100 Your.Bal = $1300 Begin Tr ans 1 Log

27.Logging in Action 27 BEGIN_TRANSACTION NA =Access Noah’s Bank account YA =Access Your Bank account NBal = NA.Balance ; Ybal = YA.Balance ; Nbal += $1000; Ybal -= $1000; NA.Balance.Write Nbal YA.Balance.Write Ybal END_TRANSACTION Some code to transfer money Noah.Bal = $1100 Your.Bal = $1300 Begin Tr ans 1 Log Old Noah Bal = $100

28.Logging in Action 28 BEGIN_TRANSACTION NA =Access Noah’s Bank account YA =Access Your Bank account NBal = NA.Balance ; Ybal = YA.Balance ; Nbal += $1000; Ybal -= $1000; NA.Write Nbal YA.Balance.Write Ybal END_TRANSACTION Some code to transfer money Noah.Bal = $1100 Your.Bal = $1300 Begin Tr ans 1 Log Old Noah Bal = $100 Old Your Bal = $1300

29.Logging in Action 29 BEGIN_TRANSACTION NA =Access Noah’s Bank account YA =Access Your Bank account NBal = NA.Balance ; Ybal = YA.Balance ; Nbal += $1000; Ybal -= $1000; NA.Write Nbal YA.Write Ybal END_TRANSACTION Some code to transfer money Noah.Bal = $1100 Your.Bal = $300 Begin Tr ans 1 Log Old Noah Bal = $100 Old Your Bal = $1300 Commit Tr 1