PingCAP-Infra-Meetup-91-Distributed-Transaction+in+TiDB
1 . Head First Distributed Transaction in TiDB Presented by wuxuelian
2 .Agenda ● ACID ● ISOLATION LEVEL ● Percolator ● Transaction in TiDB
3 .Part I - ACID
4 .ACID (diagram) Bob gives Joe $7.
Start: Bob: $10, Joe: $2
Success: Bob: $3, Joe: $9
Failed: Bob: $10, Joe: $2
Partial outcomes that must not occur: (Bob: $10, Joe: $9) and (Bob: $3, Joe: $2)
5 .ACID
● Atomicity
  ○ Each transaction is treated as a single "unit", which either succeeds completely or fails completely.
● Consistency
  ○ Any data written to the database must be valid according to all defined rules.
● Isolation
  ○ Isolation ensures that concurrent execution of transactions leaves the database in the same state that would have been obtained if the transactions were executed sequentially.
● Durability
  ○ Once a transaction has been committed, it will remain committed even in the case of a system failure.
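The atomicity bullet can be tried out with Python's built-in sqlite3 module. This is only a minimal sketch of the Bob/Joe transfer, not TiDB code: the `account` table layout, the `CHECK (balance >= 0)` rule, and the helper names `open_bank`, `transfer`, and `balances` are assumptions made for the illustration.

```python
import sqlite3


def open_bank():
    """Create an in-memory database holding Bob: $10 and Joe: $2 (illustrative schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account (name TEXT PRIMARY KEY, "
        "balance INTEGER CHECK (balance >= 0))"
    )
    conn.executemany("INSERT INTO account VALUES (?, ?)", [("Bob", 10), ("Joe", 2)])
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one unit; return True if it committed."""
    try:
        with conn:  # commits on success, rolls back if any statement raises
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The debit broke the CHECK rule, so the earlier credit was rolled back too.
        return False


def balances(conn):
    """Return the current balances as a dict, e.g. {'Bob': 10, 'Joe': 2}."""
    return dict(conn.execute("SELECT name, balance FROM account"))
```

A first $7 transfer succeeds and leaves Bob: $3, Joe: $9. A second $7 transfer fails on the debit, and the credit that already ran inside the same transaction is undone, so no partial state is ever visible.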
6 .Part II - Isolation Levels
7 .Read uncommitted
Session A: begin;
Session A: select account from account where id = 1; // will get 1000
Session B: begin;
Session B: update account set account = account + 500 where id = 1; // not committed yet
Session A: select account from account where id = 1; // will get 1500 (Dirty read)
Session B: rollback;
8 .Read committed
Session A: begin;
Session A: select account from account where id = 1; // get 1000
Session B: begin;
Session B: update account set account = account + 500 where id = 1;
Session B: commit;
Session A: select account from account where id = 1; // get 1500 (Non-repeatable reads)
Session A: commit;
9 .Repeatable read
Session A: begin;
Session A: select account from account where id = 1; // get 1000
Session B: begin;
Session B: update account set account = account + 500 where id = 1;
Session B: commit;
Session A: select account from account where id = 1; // get 1000
Session A: commit;
10 .Repeatable read
Session A: begin;
Session A: select id from account; // get id(1), id(2)
Session B: begin;
Session B: insert into account values(3,"Dada",5000);
Session B: commit;
Session A: select id from account; // get id(1), id(2)
Session A: insert into account values(3,"Dada",5000); // ERROR 1062 (23000): Duplicate entry '3' for key 'PRIMARY' (Phantom reads)
11 .Serializable Session: A Session: B
12 .Summary
13 .Part III - Percolator
14 .Snapshot Isolation (diagram: timeline with timestamps 1, 2, 3)
● Read: read from a stable snapshot at some timestamp
● Write: protects against write-write conflicts.
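The two bullets can be sketched as a tiny multi-version store. This is a toy model under stated assumptions, not how TiDB or Percolator is implemented: an in-process counter stands in for the timestamp source, and the names `SnapshotStore`, `Txn`, and `WriteConflict` are invented for the example. Reads only see versions committed at or before the transaction's start timestamp, and commit fails if another transaction committed the same key after that timestamp.

```python
class WriteConflict(Exception):
    """Raised when another transaction committed the same key after our snapshot."""


class SnapshotStore:
    def __init__(self):
        self.versions = {}  # key -> list of (commit_ts, value), oldest first
        self.clock = 0      # stand-in for a timestamp service

    def next_ts(self):
        self.clock += 1
        return self.clock

    def begin(self):
        return Txn(self, self.next_ts())

    def read(self, key, ts):
        """Return the newest value committed at or before `ts`, or None."""
        for commit_ts, value in reversed(self.versions.get(key, [])):
            if commit_ts <= ts:
                return value
        return None


class Txn:
    def __init__(self, store, start_ts):
        self.store = store
        self.start_ts = start_ts
        self.writes = {}  # buffered writes, invisible to others until commit

    def get(self, key):
        if key in self.writes:
            return self.writes[key]
        return self.store.read(key, self.start_ts)

    def put(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Write-write conflict check: someone committed this key after we started.
        for key in self.writes:
            versions = self.store.versions.get(key, [])
            if versions and versions[-1][0] > self.start_ts:
                raise WriteConflict(key)
        commit_ts = self.store.next_ts()
        for key, value in self.writes.items():
            self.store.versions.setdefault(key, []).append((commit_ts, value))
        return commit_ts
```

With this model, a session that began before a concurrent update keeps reading the old value (the repeatable-read behaviour on the earlier slides), and its own later write to the same key is rejected at commit.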
15 .2 Phase Commit
Bob has $10, Joe has $2. Bob will give Joe $7.
key | data   | lock | write
Bob | 5: $10 |      | 6: data @5
Joe | 5: $2  |      | 6: data @5
16 .Phase#1: Prewrite
key | data   | lock             | write
Bob | 5: $10 |                  | 6: data @5
    | 7: $3  | 7: I'm primary   |
Joe | 5: $2  |                  | 6: data @5
    | 7: $9  | 7: primary @ Bob |
17 .Phase#2: Primary Commit (Sync)
Bob has $3, Joe has $9 now.
key | data   | lock             | write
Bob | 5: $10 |                  | 6: data @5
    | 7: $3  | 7: I'm primary   | 8: data @7
Joe | 5: $2  |                  | 6: data @5
    | 7: $9  | 7: primary @ Bob |
18 .Phase#2: Secondary Commit (Async)
Bob has $3, Joe has $9 now.
key | data   | lock             | write
Bob | 5: $10 |                  | 6: data @5
    | 7: $3  | 7: I'm primary   | 8: data @7
Joe | 5: $2  |                  | 6: data @5
    | 7: $9  | 7: primary @ Bob | 8: data @7
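The prewrite and commit phases above can be walked through with a small in-memory model of the data, lock, and write columns. This is a sketch under assumptions, not Percolator or TiKV code: the `Row` class, the function names, and the exception names are illustrative, timestamps are passed in by hand instead of coming from a timestamp oracle, and there is no crash recovery or lock cleanup. Following the Percolator design, the sketch erases a key's lock when its write record is added.

```python
class WriteConflict(Exception):
    """A write record at or after our start_ts already exists for the key."""


class KeyIsLocked(Exception):
    """Another in-flight transaction holds a lock on the key."""


class LockNotFound(Exception):
    """Commit was attempted but our prewrite lock is no longer there."""


class Row:
    def __init__(self):
        self.data = {}    # start_ts -> value
        self.lock = None  # (start_ts, primary_key) while a transaction is in flight
        self.write = {}   # commit_ts -> start_ts of the committed data version


def seed():
    """Initial state from the slides: Bob 5: $10, Joe 5: $2, both committed at 6."""
    table = {}
    for key, value in (("Bob", 10), ("Joe", 2)):
        row = table.setdefault(key, Row())
        row.data[5] = value
        row.write[6] = 5
    return table


def prewrite(table, key, value, start_ts, primary):
    row = table.setdefault(key, Row())
    if any(commit_ts >= start_ts for commit_ts in row.write):
        raise WriteConflict(key)
    if row.lock is not None:
        raise KeyIsLocked(key)
    row.data[start_ts] = value
    row.lock = (start_ts, primary)


def commit_key(table, key, start_ts, commit_ts):
    row = table[key]
    if row.lock is None or row.lock[0] != start_ts:
        raise LockNotFound(key)
    row.write[commit_ts] = start_ts  # makes the data version visible
    row.lock = None                  # the lock is erased at commit


def read(table, key, ts):
    """Snapshot read: newest version whose commit_ts is at or before `ts`."""
    row = table[key]
    if row.lock is not None and row.lock[0] <= ts:
        raise KeyIsLocked(key)
    visible = [commit_ts for commit_ts in row.write if commit_ts <= ts]
    if not visible:
        return None
    return row.data[row.write[max(visible)]]


def run_2pc(table, writes, start_ts, commit_ts):
    keys = list(writes)
    primary = keys[0]
    for key in keys:                                 # Phase 1: prewrite every key
        prewrite(table, key, writes[key], start_ts, primary)
    commit_key(table, primary, start_ts, commit_ts)  # Phase 2: primary is the commit point
    for key in keys[1:]:                             # secondaries (async in a real system)
        commit_key(table, key, start_ts, commit_ts)
```

Calling `run_2pc(seed(), {"Bob": 3, "Joe": 9}, start_ts=7, commit_ts=8)` reproduces the slide sequence: a read at timestamp 8 or later sees $3 and $9, while a read at timestamp 7 still sees $10 and $2.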
19 .Summary
● Advantage
  ○ Simple
  ○ Implements cross-row transactions on top of single-row transactions (BigTable)
  ○ Decentralized lock management
● Disadvantage
  ○ Centralized timestamp oracle
  ○ More RPCs
20 .Part IV - Transaction in TiDB
21 .Architecture (diagram) Several TiDB servers send metadata / timestamp requests to the Placement Driver (PD). Data is stored on TiKV nodes (tikv1, tikv2, tikv3, tikv4) as Regions 1, 2, 3, with the copies of each Region on different nodes forming a Raft group. PD control flow: Balance / Failover. PingCAP.com
22 .How to convert from SQL to Key-Value
SQL Model:
id (primary) | name (unique) | age (non-unique) | score
1            | Bob           | 12               | 99
Key-Value Model:
index_type       | key     | value
primary_index    | 1       | (Bob, 12, 99)
name (unique)    | Bob     | 1
age (non-unique) | (12, 1) | null
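The mapping in the table can be written as a short function. This is a simplified sketch that mirrors the slide only: keys are plain Python tuples, whereas the real encoding produces byte strings, and the function name `row_to_kv` is made up for the example.

```python
def row_to_kv(row):
    """Turn one SQL row into the key-value pairs shown in the table.

    `row` is a dict with the columns id, name, age, score.
    """
    return [
        # Primary index: row id -> the remaining column values.
        (("primary_index", row["id"]), (row["name"], row["age"], row["score"])),
        # Unique index: the indexed value alone is the key and points at the row id.
        (("name", row["name"]), row["id"]),
        # Non-unique index: the row id is part of the key, so the value is empty.
        (("age", (row["age"], row["id"])), None),
    ]
```

Putting the row id into the non-unique index key is what lets two people of the same age coexist: `(12, 1)` and `(12, 2)` are different keys, while a second `Bob` would collide on the unique `name` key.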
23 .Column Families in RocksDB
Column Family | Key            | Value
Data          | key, start_ts  | value
Lock          | key            | start_ts, primary_key, ttl
Write         | key, commit_ts | start_ts [, short_value]
● start_ts: the timestamp taken when the transaction begins.
● commit_ts: the timestamp obtained after prewrite and used in commit.
● primary_key: the key used to store the status of the transaction.
● short_value: a value that is short (length < 64 bytes).
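As a rough illustration of the table, the function below lists which entries a committed write leaves behind in the Data and Write column families. It is a sketch based only on the layout above: the function name, the dict-of-tuples representation, and the reading of the optional `short_value` field as "store the value inline and skip the Data entry" are assumptions, and the 64-byte threshold is the one quoted on the slide.

```python
SHORT_VALUE_MAX_LEN = 64  # threshold quoted on the slide


def committed_entries(key, value, start_ts, commit_ts):
    """Return {column_family: (cf_key, cf_value)} for one committed write."""
    entries = {}
    if len(value) < SHORT_VALUE_MAX_LEN:
        # Short value: carried inside the Write record itself.
        entries["write"] = ((key, commit_ts), (start_ts, value))
    else:
        # Long value: stored under (key, start_ts); the Write record points at it.
        entries["data"] = ((key, start_ts), value)
        entries["write"] = ((key, commit_ts), (start_ts, None))
    return entries
```

A reader at snapshot timestamp `ts` would look for the newest Write entry with `commit_ts <= ts`, then either use the inline value or follow `start_ts` into the Data column family.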
24 .2PC in TiDB
25 . Prewrite Errors: ● WriteConflict (a newer version exists) ● KeyIsLocked
26 . Commit Errors: ● Lock Not Found
27 .Cautions when using optimistic locking
Session 1: begin;
Session 2: begin;
Session 1: select balance from T where id = 1;
Session 2: update T set balance = balance - 100 where id = 1;
Session 2: update T set balance = balance - 100 where id = 2;
Session 1: // use the result of select
Session 1: if balance > 100 { update T set balance = balance + 100 where id = 2; }
Session 2: commit;
Session 1: commit; // auto retry
To disable automatic retry: set @@global.tidb_disable_txn_auto_retry = 1
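With automatic retry disabled, the application has to handle a failed commit itself. One common pattern is to rerun the whole transaction, including the select and the `if balance > 100` decision, so the decision is made on fresh data. The helper below is a generic sketch of that pattern: `ConflictError` is a hypothetical stand-in for whatever error the database driver reports on a commit conflict, and `work` is any callable that performs begin, reads, writes, and commit.

```python
class ConflictError(Exception):
    """Hypothetical stand-in for the driver error raised on a commit conflict."""


def run_with_retry(work, max_attempts=3):
    """Call `work()` until it commits or `max_attempts` is reached.

    `work` must run the complete transaction each time (begin, reads,
    application logic, writes, commit) so that every attempt re-reads
    the data it bases its decisions on.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return work()
        except ConflictError:
            if attempt == max_attempts:
                raise  # give up and let the caller decide what to do
```

Usage would look like `run_with_retry(lambda: transfer_if_rich(conn))`, where `transfer_if_rich` is the application's own function wrapping the statements of session 1.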
28 .Cautions for large transactions
Because TiDB is distributed and commits with a two-phase protocol, large transactions that modify data can be particularly problematic:
● Long duration
● More conflicts
● And so on ...
TiDB intentionally sets some limits on transaction sizes to reduce this impact:
● Each Key-Value entry is no more than 6MB
● The total number of Key-Value entries is no more than 300,000
● The total size of Key-Value entries is no more than 100MB
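One way an application can stay under such limits is to split a bulk change into several smaller transactions. The generator below is a minimal sketch of the splitting step only. The name `batches` and the idea of committing once per batch are assumptions for illustration, and the approach gives up atomicity across batches, so it only fits changes that can safely be applied piece by piece.

```python
def batches(items, batch_size):
    """Yield consecutive slices of `items`, each with at most `batch_size` elements."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]
```

Each yielded slice would then be written inside its own `START TRANSACTION; ... COMMIT;` block, with the batch size chosen well below the entry-count limit.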
29 .Cautions for small transactions
# original version with auto_commit
UPDATE my_table SET a='new_value' WHERE id = 1;
UPDATE my_table SET a='newer_value' WHERE id = 2;
UPDATE my_table SET a='newest_value' WHERE id = 3;
# improved version
START TRANSACTION;
UPDATE my_table SET a='new_value' WHERE id = 1;
UPDATE my_table SET a='newer_value' WHERE id = 2;
UPDATE my_table SET a='newest_value' WHERE id = 3;
COMMIT;