1. Definition of distributed transactions
2. Transaction implementation in Percolator
3. Transaction implementation in TiDB and usage caveats

First, the section on the definition of distributed transactions covers:
1. ACID
2. The four common isolation levels

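Next, the walkthrough of transaction implementation in Percolator focuses on:
1. The advantages and disadvantages of building on snapshot isolation
2. How two-phase commit implements distributed transactions across rows and tables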

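Finally, we describe in detail how distributed transactions are implemented in TiDB, including:
1. How TiDB converts relational data into key-value storage
2. Implementation details and exception handling of two-phase commit in TiDB
3. Caveats when using transactions in TiDB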

Published by TiDB on 2019/04/03


1. Welcome! Join the Infra Meetup No.91 discussion group and chat with everyone. We will share the materials from this Meetup.

2. Head First Distributed Transaction in TiDB Presented by wuxuelian

3. Agenda
● ACID
● Isolation Level
● Percolator
● Transaction in TiDB

4. Part I - ACID

5. ACID (diagram: Bob transfers $7 to Joe, starting from Bob: $10, Joe: $2. Success: Bob: $3, Joe: $9 or Bob: $10, Joe: $2. Failed: Bob: $10, Joe: $9 or Bob: $3, Joe: $2.)

6. ACID
● Atomicity
  ○ Each transaction is treated as a single "unit", which either succeeds completely or fails completely.
● Consistency
  ○ Any data written to the database must be valid according to all defined rules.
● Isolation
  ○ Isolation ensures that concurrent execution of transactions leaves the database in the same state that would have been obtained if the transactions were executed sequentially.
● Durability
  ○ Once a transaction has been committed, it will remain committed even in the case of a system failure.
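Atomicity can be illustrated with the Bob and Joe transfer from the previous slide. The following is a minimal single-node sketch using Python's built-in `sqlite3` module, not TiDB; the `account` table layout and the `transfer` helper are assumptions made for this example.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one unit; undo everything on failure."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM account WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("Bob", 10), ("Joe", 2)])
conn.commit()

ok = transfer(conn, "Bob", "Joe", 7)       # Bob: $3, Joe: $9
after_success = balances(conn)
failed = transfer(conn, "Bob", "Joe", 7)   # Bob only has $3, so nothing changes
after_failure = balances(conn)
```

The second transfer debits Bob before the check fails, yet the rollback leaves both balances untouched, so no half-applied state is ever visible.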

7. Part II - Isolation Levels

8. Read uncommitted
Session A: begin;
Session A: select account from account where id = 1; // will get 1000
Session B: begin;
Session B: update account set account = account + 500 where id = 1; // not committed here
Session A: select account from account where id = 1; // will get 1500 (dirty read)
Session B: rollback;

9. Read committed
Session A: begin;
Session A: select account from account where id = 1; // get 1000
Session B: begin;
Session B: update account set account = account + 500 where id = 1;
Session B: commit;
Session A: select account from account where id = 1; // get 1500 (non-repeatable read)
Session A: commit;

10. Repeatable read
Session A: begin;
Session A: select account from account where id = 1; // get 1000
Session B: begin;
Session B: update account set account = account + 500 where id = 1;
Session B: commit;
Session A: select account from account where id = 1; // get 1000
Session A: commit;
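The difference between the read committed and repeatable read examples comes down to which timestamp a statement reads at. The sketch below is a toy multi-version model, not how any particular database implements these levels; `VersionedStore` and `Reader` are hypothetical names introduced for illustration.

```python
class VersionedStore:
    """Keeps every committed version of a key as (commit_ts, value)."""

    def __init__(self):
        self.clock = 0
        self.versions = {}

    def commit(self, key, value):
        self.clock += 1
        self.versions.setdefault(key, []).append((self.clock, value))

    def read_at(self, key, ts):
        visible = [v for commit_ts, v in self.versions.get(key, []) if commit_ts <= ts]
        return visible[-1] if visible else None


class Reader:
    def __init__(self, store, level):
        self.store = store
        self.level = level
        self.begin_ts = store.clock  # timestamp captured at "begin"

    def select(self, key):
        if self.level == "repeatable read":
            ts = self.begin_ts        # every statement reuses the first snapshot
        else:                         # "read committed"
            ts = self.store.clock     # every statement sees the latest commit
        return self.store.read_at(key, ts)


store = VersionedStore()
store.commit("id=1", 1000)

rc = Reader(store, "read committed")
rr = Reader(store, "repeatable read")
first = (rc.select("id=1"), rr.select("id=1"))

store.commit("id=1", 1500)            # Session B commits account + 500
second = (rc.select("id=1"), rr.select("id=1"))
```

After Session B commits, the read committed reader observes 1500 while the repeatable read reader still observes 1000, matching the two slides.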

11. Repeatable read
Session A: begin;
Session A: select id from account; // get id(1), id(2)
Session B: begin;
Session B: insert into account values(3,"Dada",5000);
Session B: commit;
Session A: select id from account; // get id(1), id(2)
Session A: insert into account values(3,"Dada",5000); // ERROR 1062 (23000): Duplicate entry '3' for key 'PRIMARY' (phantom read)

12. Serializable Session: A Session: B

13. Summary

14. Part III - Percolator

15. Snapshot Isolation (diagram: timeline with timestamps 1, 2, 3)
● Read: read from a stable snapshot at some timestamp
● Write: protects against write-write conflicts
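Both properties on this slide can be shown in a few lines. The following is a simplified in-memory sketch of snapshot isolation with a first-committer-wins check; the `Store`, `Txn`, and `Conflict` names are hypothetical, and the counter stands in for a real timestamp oracle.

```python
class Conflict(Exception):
    """Raised when another transaction committed the same key first."""


class Store:
    def __init__(self):
        self.ts = 0          # stand-in for a timestamp oracle
        self.versions = {}   # key -> list of (commit_ts, value), oldest first

    def next_ts(self):
        self.ts += 1
        return self.ts


class Txn:
    def __init__(self, store):
        self.store = store
        self.start_ts = store.next_ts()
        self.writes = {}     # buffered until commit

    def get(self, key):
        if key in self.writes:
            return self.writes[key]
        visible = [v for commit_ts, v in self.store.versions.get(key, [])
                   if commit_ts <= self.start_ts]
        return visible[-1] if visible else None

    def put(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Write-write conflict: someone committed this key after we started.
        for key in self.writes:
            versions = self.store.versions.get(key, [])
            if versions and versions[-1][0] > self.start_ts:
                raise Conflict(key)
        commit_ts = self.store.next_ts()
        for key, value in self.writes.items():
            self.store.versions.setdefault(key, []).append((commit_ts, value))
        return commit_ts


store = Store()
setup = Txn(store)
setup.put("Bob", 10)
setup.commit()

t1, t2, reader = Txn(store), Txn(store), Txn(store)
t1.put("Bob", t1.get("Bob") - 7)
t2.put("Bob", t2.get("Bob") - 1)
t1.commit()

stable_read = reader.get("Bob")   # still the snapshot taken before t1 committed
try:
    t2.commit()
    t2_result = "committed"
except Conflict:
    t2_result = "conflict"
final = Txn(store).get("Bob")
```

The reader keeps seeing $10 even after `t1` commits, and `t2` is rejected because it would overwrite a version it never saw.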

16. 2 Phase Commit
Bob has $10, Joe has $2, and Bob will give Joe $7.
key | data | lock | write
Bob | 5: $10 | - | 6: data @5
Joe | 5: $2 | - | 6: data @5

17. Phase#1: Prewrite
key | data | lock | write
Bob | 5: $10; 7: $3 | 7: I'm primary | 6: data @5
Joe | 5: $2; 7: $9 | 7: primary @ Bob | 6: data @5

18. Phase#2: Primary Commit (Sync)
Bob has $3, Joe has $9 now.
key | data | lock | write
Bob | 5: $10; 7: $3 | 7: I'm primary (removed by the commit) | 6: data @5; 8: data @7
Joe | 5: $2; 7: $9 | 7: primary @ Bob | 6: data @5

19. Phase#2: Secondary Commit (Async)
Bob has $3, Joe has $9 now.
key | data | lock | write
Bob | 5: $10; 7: $3 | 7: I'm primary (removed by the commit) | 6: data @5; 8: data @7
Joe | 5: $2; 7: $9 | 7: primary @ Bob (removed by the commit) | 6: data @5; 8: data @7
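The prewrite and commit phases above can be replayed with a small in-memory model of the data, lock, and write columns. This is a sketch under simplifying assumptions, not Percolator's actual code: the `Row`, `prewrite`, `commit`, and `read` names are hypothetical, everything runs in one process, and a reader that meets a lock simply fails instead of checking the primary and rolling the transaction forward or back.

```python
from collections import defaultdict

class Row:
    def __init__(self):
        self.data = {}    # start_ts -> value
        self.lock = {}    # start_ts -> lock description
        self.write = {}   # commit_ts -> start_ts of the committed data

table = defaultdict(Row)

def prewrite(mutations, primary, start_ts):
    # Check every key first so a failure leaves no partial locks behind.
    for key in mutations:
        row = table[key]
        if row.lock:
            raise RuntimeError(f"{key} is locked")
        if any(commit_ts >= start_ts for commit_ts in row.write):
            raise RuntimeError(f"write conflict on {key}")
    for key, value in mutations.items():
        row = table[key]
        row.data[start_ts] = value
        row.lock[start_ts] = "I'm primary" if key == primary else f"primary @ {primary}"

def commit(key, start_ts, commit_ts):
    row = table[key]
    if start_ts not in row.lock:
        raise RuntimeError(f"lock not found on {key}")
    row.write[commit_ts] = start_ts   # make the data visible
    del row.lock[start_ts]            # release the lock

def read(key, ts):
    row = table[key]
    if any(lock_ts <= ts for lock_ts in row.lock):
        raise RuntimeError(f"{key} is locked")
    committed = [commit_ts for commit_ts in row.write if commit_ts <= ts]
    if not committed:
        return None
    return row.data[row.write[max(committed)]]

# Initial state: data written at ts 5, committed at ts 6.
for key, value in {"Bob": 10, "Joe": 2}.items():
    table[key].data[5] = value
    table[key].write[6] = 5

prewrite({"Bob": 3, "Joe": 9}, primary="Bob", start_ts=7)   # Phase 1
commit("Bob", start_ts=7, commit_ts=8)                        # Phase 2, primary
bob_after_primary = read("Bob", 9)
try:
    joe_after_primary = read("Joe", 9)
except RuntimeError:
    joe_after_primary = "locked"
commit("Joe", start_ts=7, commit_ts=8)                        # Phase 2, secondary
joe_after_secondary = read("Joe", 9)
old_snapshot = (read("Bob", 6), read("Joe", 6))
```

Once the primary is committed the transaction is decided, which is why the secondary commit can run asynchronously; a snapshot at timestamp 6 keeps returning the old balances throughout.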

20. Summary
● Advantage
  ○ Simple
  ○ Implements cross-row transactions on top of single-row transactions (BigTable)
  ○ Decentralized lock management
● Disadvantage
  ○ Centralized timestamp oracle
  ○ More RPCs

21. Part IV - Transaction in TiDB

22. Architecture (diagram: several TiDB servers send metadata / timestamp requests to the Placement Driver (PD); replicas of Region 1, Region 2, and Region 3 form Raft groups spread across tikv1, tikv2, tikv3, and tikv4; PD drives the control flow: balance / failover.)

23. How to convert from SQL to Key-Value
SQL Model:
id (primary) | name (unique) | age (non-unique) | score
1 | Bob | 12 | 99
Key-Value Model:
index_type | key | value
primary_index | 1 | (Bob, 12, 99)
name (unique) | Bob | 1
age (non-unique) | (12, 1) | null
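The mapping in the table above can be sketched as a small function. This is an illustration of the idea only: the tuple keys and the `row_to_kv` helper are assumptions for this example and do not reflect TiDB's actual byte-level key encoding.

```python
def row_to_kv(row, primary, unique=(), non_unique=()):
    """Map one relational row to the key-value pairs shown on the slide."""
    pk = row[primary]
    kv = {
        # Primary index: primary key -> remaining columns
        ("primary_index", pk): tuple(v for col, v in row.items() if col != primary),
    }
    for col in unique:
        # Unique index: the indexed value alone identifies the row
        kv[(col, row[col])] = pk
    for col in non_unique:
        # Non-unique index: append the primary key so the key stays distinct
        kv[(col, row[col], pk)] = None
    return kv

row = {"id": 1, "name": "Bob", "age": 12, "score": 99}
kv = row_to_kv(row, primary="id", unique=["name"], non_unique=["age"])
```

Appending the primary key to a non-unique index key is what lets two rows with the same `age` coexist, which is why that entry needs no value.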

24. Column Families in RocksDB
Column Family | Key | Value
Data | key, start_ts | value
Lock | key | start_ts, primary_key, ttl
Write | key, commit_ts | start_ts [, short_value]
● Start_ts: the timestamp at which the transaction begins.
● Commit_ts: the timestamp obtained after prewrite and used in commit.
● Primary_key: the key used to store the status of the transaction.
● Short_value: a value that is short (length < 64 bytes).
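A read over these three column families might look like the following sketch. Plain dictionaries stand in for the RocksDB column families, and `put_committed` and `get` are hypothetical helpers; the 64-byte threshold is the one stated on the slide.

```python
SHORT_VALUE_LIMIT = 64   # bytes, from the slide

data_cf = {}    # (key, start_ts)  -> value
lock_cf = {}    # key              -> (start_ts, primary_key, ttl)
write_cf = {}   # (key, commit_ts) -> (start_ts, short_value or None)

def put_committed(key, value, start_ts, commit_ts):
    """Store an already committed version, inlining short values."""
    if len(value) < SHORT_VALUE_LIMIT:
        write_cf[(key, commit_ts)] = (start_ts, value)
    else:
        data_cf[(key, start_ts)] = value
        write_cf[(key, commit_ts)] = (start_ts, None)

def get(key, read_ts):
    # 1. A lock from a transaction that started before us blocks the read.
    lock = lock_cf.get(key)
    if lock is not None and lock[0] <= read_ts:
        raise RuntimeError("KeyIsLocked")
    # 2. Find the newest commit record visible at read_ts.
    commits = [cts for (k, cts) in write_cf if k == key and cts <= read_ts]
    if not commits:
        return None
    start_ts, short_value = write_cf[(key, max(commits))]
    # 3. Short values live in the write record; long ones need a data lookup.
    if short_value is not None:
        return short_value
    return data_cf[(key, start_ts)]

put_committed("Joe", b"$2", start_ts=5, commit_ts=6)
put_committed("Bob", b"x" * 100, start_ts=5, commit_ts=6)
joe = get("Joe", 7)
bob = get("Bob", 7)
```

Inlining the short value saves the second lookup in the Data column family, which is the point of the optional `short_value` field.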

25. 2PC in TiDB

26. Prewrite Errors:
● WriteConflict (a newer version exists)
● KeyIsLocked

27. Commit Errors: ● Lock Not Found
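The three error conditions on the two slides above can be reproduced with a toy model. The exception names follow the slides, but the `prewrite` and `commit` functions and the dictionaries behind them are assumptions for illustration, not TiKV's API.

```python
class WriteConflict(Exception):
    pass

class KeyIsLocked(Exception):
    pass

class LockNotFound(Exception):
    pass

locks = {}    # key -> start_ts of the transaction holding the lock
writes = {}   # key -> list of commit_ts

def prewrite(key, start_ts):
    # A version committed after our start_ts means our snapshot is stale.
    if any(commit_ts > start_ts for commit_ts in writes.get(key, [])):
        raise WriteConflict(key)
    holder = locks.get(key)
    if holder is not None and holder != start_ts:
        raise KeyIsLocked(key)
    locks[key] = start_ts

def commit(key, start_ts, commit_ts):
    # The lock may be gone, for example after it was rolled back.
    if locks.get(key) != start_ts:
        raise LockNotFound(key)
    del locks[key]
    writes.setdefault(key, []).append(commit_ts)

def outcome(fn, *args):
    """Run fn and report 'ok' or the name of the raised error."""
    try:
        fn(*args)
        return "ok"
    except (WriteConflict, KeyIsLocked, LockNotFound) as err:
        return type(err).__name__

results = [
    outcome(prewrite, "Bob", 7),     # first transaction locks Bob
    outcome(prewrite, "Bob", 8),     # second transaction finds the lock
    outcome(commit, "Bob", 7, 9),    # first transaction commits at ts 9
    outcome(prewrite, "Bob", 8),     # second transaction started before ts 9
    outcome(commit, "Bob", 7, 9),    # committing again: the lock is gone
]
```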

28. Caveats of Using Optimistic Locking
session 1:
begin;
select balance from T where id = 1;
// use the result of select
if balance > 100 { update T set balance = balance + 100 where id = 2; }
commit; // auto retry
session 2:
begin;
update T set balance = balance - 100 where id = 1;
update T set balance = balance - 100 where id = 2;
commit;
To avoid the automatic retry: set @@global.tidb_disable_txn_auto_retry = 1
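The risk in this example is that an automatic retry replays only the buffered write, not the application logic that decided to issue it. The sketch below models that with a toy optimistic session; the `Table` and `Session` classes and the starting balances (150 and 0) are hypothetical values chosen for illustration.

```python
class Conflict(Exception):
    pass

class Table:
    def __init__(self, rows):
        self.rows = dict(rows)                 # id -> balance
        self.version = {i: 0 for i in rows}    # bumped on every committed write

class Session:
    def __init__(self, table, auto_retry):
        self.table = table
        self.auto_retry = auto_retry
        self.snapshot = dict(table.rows)       # reads come from the begin-time snapshot
        self.seen = dict(table.version)
        self.updates = []                      # buffered (id, delta) statements

    def select(self, row_id):
        return self.snapshot[row_id]

    def update(self, row_id, delta):
        self.updates.append((row_id, delta))

    def commit(self):
        conflict = any(self.table.version[i] != self.seen[i] for i, _ in self.updates)
        if conflict and not self.auto_retry:
            raise Conflict("write conflict, transaction aborted")
        # With auto retry, the buffered updates are replayed on the latest
        # data; the application's `if balance > 100` check is NOT re-run.
        for row_id, delta in self.updates:
            self.table.rows[row_id] += delta
            self.table.version[row_id] += 1

def scenario(auto_retry):
    t = Table({1: 150, 2: 0})
    s1, s2 = Session(t, auto_retry), Session(t, auto_retry)
    balance = s1.select(1)          # session 1 reads 150
    s2.update(1, -100)
    s2.update(2, -100)
    if balance > 100:               # decision based on the stale read
        s1.update(2, +100)
    s2.commit()                     # row 1 is now 50
    try:
        s1.commit()
        return t.rows, "committed"
    except Conflict:
        return t.rows, "conflict"

with_retry = scenario(auto_retry=True)
without_retry = scenario(auto_retry=False)
```

With auto retry, session 1 still adds 100 to row 2 even though row 1 has dropped to 50 and the condition no longer holds. With retry disabled, the commit fails and the application can re-run the whole transaction against fresh data.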

29. Caveats for Large Transactions
Due to the distributed, 2-phase commit requirement of TiDB, large transactions that modify data can be particularly problematic:
● Long duration
● More conflicts
● And so on ...
TiDB intentionally sets some limits on transaction sizes to reduce this impact:
● Each Key-Value entry is no more than 6MB
● The total number of Key-Value entries is no more than 300,000
● The total size of Key-Value entries is no more than 100MB
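One common way to stay under limits like these is to split a bulk write into several smaller transactions on the application side. The following is a minimal sketch of such a splitter; the constants come from the slide, while `split_into_batches` and the choice to measure an entry as key length plus value length are assumptions for this example. Note that splitting gives up atomicity across batches.

```python
MAX_ENTRY_BYTES = 6 * 1024 * 1024      # per Key-Value entry
MAX_ENTRIES = 300_000                  # entries per transaction
MAX_TOTAL_BYTES = 100 * 1024 * 1024    # total size per transaction

def split_into_batches(entries, max_entries=MAX_ENTRIES,
                       max_total=MAX_TOTAL_BYTES, max_entry=MAX_ENTRY_BYTES):
    """Yield lists of (key, value) pairs that each fit within the limits."""
    batch, batch_bytes = [], 0
    for key, value in entries:
        size = len(key) + len(value)
        if size > max_entry:
            raise ValueError(f"entry for key {key!r} exceeds {max_entry} bytes")
        # Start a new batch when adding this entry would break a limit.
        if batch and (len(batch) + 1 > max_entries or batch_bytes + size > max_total):
            yield batch
            batch, batch_bytes = [], 0
        batch.append((key, value))
        batch_bytes += size
    if batch:
        yield batch

# Small limits make the behaviour easy to see: each entry is 12 bytes.
entries = [(b"k%d" % i, b"v" * 10) for i in range(5)]
by_count = [len(b) for b in split_into_batches(entries, max_entries=2)]
by_size = [len(b) for b in split_into_batches(entries, max_total=12)]
```

Each yielded batch would then be written in its own transaction, so a failure partway through leaves earlier batches committed.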

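TiDB is a converged database product positioned for hybrid transactional/analytical processing (HTAP). It offers one-click horizontal scaling, strongly consistent multi-replica data safety, distributed transactions, real-time OLAP, and other key features.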
