- 快召唤伙伴们来围观吧
- 微博 QQ QQ空间 贴吧
- 文档嵌入链接
- 复制
- 微信扫一扫分享
- 已成功复制到剪贴板
PingCAP-Infra-Meetup-108-Deep Dive into TiDB DDL
展开查看详情
1 .Deep Dive into TiDB DDL Li Xia | PingCAP 2019/6 lixia@pingcap.com @紫沐夏_go https://github.com/zimulala
2 .Agenda ● Background ● Implementation details ○ Basic concepts ○ Process details ● DDL operations ○ Add index ○ General DDL operations ● Practicality ○ Accessibility ○ FAQ ● Q&A
3 .Part I - Background
4 .Background ● Each schema element has two states which we consider to be non-intermediate: absent and public. ● Schema S1 < S2, e.g. ○ S1: S2: ○ On TiDB server with S2: execute insert statement like “insert t values(2, '1');” ○ On TiDB server with S1: execute delete statement like “delete from t where id=2;” ○ Then the data index inconsistency will occur
5 .Intermediate state ● A definition of database consistency ○ No column values exist without a containing row and table. ○ All rows have all public required column values. ○ No index entries exist without a corresponding index in the schema. ○ All public indexes are complete. ○ All index entries point to valid rows. ○ All public constraints are honored. ○ No unknown values. ● A schema change from schema S1 to schema S2 is consistency preserving iff ○ any operation opS1 preserves the consistency of d with respect to schema S2 ○ any operation opS2 preserves the consistency of d with respect to schema S1. ● Intermediate states: delete-only, write-only
6 .Lease ● At the same time, there are at most two different versions of the schema for all nodes of the system. ● Every normal node in a lease will automatically load schema information. ● If this node cannot be loaded normally during the lease, SQL cannot be processed normally.
7 .Part II - Implementation details
8 .TiDB Architecture PD PD TSO/Data Location PD PD Cluster Metadata TiDB MySQL Clients TiKV TiKV TiDB DistSQL API ... TiDB TiKV TiKV MySQL Clients TiDB TiKV TiKV TiDB ... ... TiDB Cluster TiKV Cluster (Storage)
9 .Basic concepts ● Component ○ TiDB,update and load schema information ○ TiKV,store schema information and DDL jobs ○ PD,notify schema version change ● TiDB Role ○ Owner ○ Nonowners ● TiDB Module ○ Handle DDL jobs: worker ■ States ■ Jobs ○ Load schema ■ Lease
10 .Process introduction ● TiDB SQL Layer ● Process I: Parser AST Validator Plan Logical a. The MySQL Client sends a DDL request. SQL Builder Plan b. When a TiDB server receives a request, it will reach the TiDB SQL layer for execution. Physical MySQL Plan Clients Rows Executor TiDB Server Data TiKV TiKV TiKV TiKV TiKV TiKV TiKV Cluster
11 .Process introduction ● Perform DDL jobs ● Process II: a. Encapsulates the statement into a DDL job, and stores it into the KV layer. b. DDL owner handle DDL jobs. c. Check if the DDL job is finished d. Returns the response to the MySQL Client.
12 .Detailed steps ● State change process ○ Each DDL job requires multiple state changes. e.g. ■ create table: none -> public ■ add index: none -> delete only -> write only -> write reorg -> public ○ Update schema: ■ Update schema and a state ■ Update the latest schema version to PD ■ Wait for all TiDB servers to update to the latest schema information. ○ Loading schema: ■ Load the latest schema regularly ■ Load the latest schema when a TiDB receives a notification of the latest schema version update
13 .Detailed steps ● Parallel flow ○ Currently, only the add index operation in TiDB takes a long time to execute, so an add index operation may block other DDL operations. ○ That is, between different tables, the general operation and the add index operation can be performed in parallel. ○ Details: ■ The type of job ■ Dependency on jobs
14 .Detailed steps ● Campaign owner ○ path: /tidb/ddl/fg/owner ● Get corresponding jobs ● State change process ● Load schema
15 .Part I - DDL operations
16 .Add index ● Add index operation ○ State change process ○ The operation performed by the write reorg state is actually divided into two parts: ■ Read data from the corresponding column. ■ Backfill the data into the index. ○ The default priority is low
17 .Add index ● Backfill the data into the index tidb_ddl_reorg_worker_cnt Scope: GLOBAL Default:16
18 .tidb_ddl_reorg_batch-size = 128 Bentch Test add index Report 200W rows
19 .Add index tidb_ddl_reorg_batch_size Scope: GLOBAL Default:1024
20 .tidb_ddl_reorg_worker_cnt = 1 Bentch Test add index Report 200W rows
21 .DDL operations ● Add column operation ○ Only needs to change 5 states throughout the process. ● Modify column operation ○ Does not support lossy changes. ● Drop index/table/database operation ○ Only needs to change several states corresponding to throughout the process. ○ After a normal schema change, ■ We store the information about the deleted data into the gc_delete_range table ■ Process them until the GC stage.
22 .Part IV - Practicality
23 .Practicality ● Compatibility: https://github.com/pingcap/docs/blob/master/dev/reference/mysql-compatibility.md#ddl ○ Add Index ○ Add Column ○ Drop Column ○ Change/Modify Column ○ Table Partition
24 .Practicality ● Variables ○ Configure variables ■ run-ddl = true # Run ddl worker on this tidb-server. ■ lease = "45s" # Schema lease duration, very dangerous to change only if you know what you do. ○ Server variables ■ tidb_ddl_reorg_worker_cnt (default: 16) ■ tidb_ddl_reorg_batch_size (default: 1024) ○ HTTP variables ■ ddl_slow_threshold (default: 300 ms)
25 .Assessibility ● SQL:https://github.com/pingcap/docs/blob/master/dev/reference/sql/statements/admin.md#admin or https://github.com/pingcap/docs/blob/master/dev/faq/tidb.md#how-to-view-the-progress-of-the-current-ddl-job ○ ADMIN SHOW DDL tidb> admin show ddl\G; *************************** 1. row *************************** SCHEMA_VER: 140 OWNER: 1a1c4174-0fcd-4ba0-add9-12d08c4077dc OWNER_ADDRESS: :4000 RUNNING_JOBS: ID:121, Type:add index, State:running, SchemaState:write reorganization, SchemaID:1, TableID:118, RowCount:77312, ArgLen:0, start time: 2018-12-05 16:26:10.652 +0800 CST, Err:<nil>, ErrCount:0, SnapshotVersion:404749908941733890 SELF_ID: 1a1c4174-0fcd-4ba0-add9-12d08c4077dc ○ ADMIN SHOW DDL JOBS ○ ADMIN SHOW DDL JOB QUERIES job_id [, job_id] …
26 .Assessibility ● SQL ○ ADMIN CANCEL DDL JOBS 'job_id' [, 'job_id'] … ■ Some types are not supported ○ ADMIN RESTORE TABLE 'table_name' ○ ADMIN RESTORE TABLE BY JOB 'job_id'
27 .Assessibility ● API:https://github.com/pingcap/tidb/blob/master/docs/tidb_http_api.md curl http://127.0.0.1:10080/info/all ○ TiDB info { ■ curl http://{TiDBIP}:10080/info/all "all_servers_info": { "30fdb2f3-d33c-4af0-9d15-75b128f8ae5b": { ■ curl http://{TiDBIP}:10080/info "ddl_id": "30fdb2f3-d33c-4af0-9d15-75b128f8ae5b", ○ Schema info "git_hash": "39ab60557822cba1d48e0ead6ebc8d619e10fe24", ■ curl http://{TiDBIP}:10080/schema "ip": "", ■ curl http://{TiDBIP}:10080/schema/{db} "lease": "45s", "listening_port": 4000, ■ curl http://{TiDBIP}:10080/schema/{db}/{table} "status_port": 10080, ■ curl http://{TiDBIP}:10080/schema?table_id={tableID} "version": "5.7.10-TiDB-v2.1.0-rc.3-324-g39ab60557-dirty" ○ DDL History } }, ■ curl http://{TiDBIP}:10080/ddl/history "is_all_server_version_consistent": true, # If it's false,return the details ■ curl http://{TiDBIP}:10080/ddl/history?limit={number} "owner_id": "30fdb2f3-d33c-4af0-9d15-75b128f8ae5b", "servers_num": 1 }
28 .FAQ ● Time spent executing DDL under normal conditions ○ Add index operation ■ depends on the number of data rows in the table and the system load ■ empty table:3s ○ General DDL operation ■ most cases are less than 1s
29 .FAQ ● In some cases, running DDL statements is slow ○ If you run multiple DDL statements together ○ After you start the cluster successfully, the first DDL operation ○ Stopping TiDB due to kill -9 command or power failure ○ If there is a communication problem between TiDB and PD