PingCAP-Infra-Meetup-108-Deep Dive into TiDB DDL

李霞老师本次分享的主要内容包括: •F1 online asynchronous DDL 的原理。 •TiDB DDL 的基本框架,详细讲述其实现。 •Add index 的具体流程,并简介了其他几种 DDL 语句实现特性。 •DDL 的一些辅助功能。 最后结合之前介绍的原理和实现,讲解了常见的几个问题以及解决方式。

1.Deep Dive into TiDB DDL Li Xia | PingCAP 2019/6 @紫沐夏_go

2.Agenda ● Background ● Implementation details ○ Basic concepts ○ Process details ● DDL operations ○ Add index ○ General DDL operations ● Practicality ○ Accessibility ○ FAQ ● Q&A

3.Part I - Background

4.Background ● Each schema element has two states which we consider to be non-intermediate: absent and public. ● Schema S1 < S2, e.g. ○ S1: S2: ○ On TiDB server with S2: execute insert statement like “insert t values(2, '1');” ○ On TiDB server with S1: execute delete statement like “delete from t where id=2;” ○ Then the data index inconsistency will occur

5.Intermediate state ● A definition of database consistency ○ No column values exist without a containing row and table. ○ All rows have all public required column values. ○ No index entries exist without a corresponding index in the schema. ○ All public indexes are complete. ○ All index entries point to valid rows. ○ All public constraints are honored. ○ No unknown values. ● A schema change from schema S1 to schema S2 is consistency preserving iff ○ any operation opS1 preserves the consistency of d with respect to schema S2 ○ any operation opS2 preserves the consistency of d with respect to schema S1. ● Intermediate states: delete-only, write-only

6.Lease ● At the same time, there are at most two different versions of the schema for all nodes of the system. ● Every normal node in a lease will automatically load schema information. ● If this node cannot be loaded normally during the lease, SQL cannot be processed normally.

7.Part II - Implementation details

8.TiDB Architecture PD PD TSO/Data Location PD PD Cluster Metadata TiDB MySQL Clients TiKV TiKV TiDB DistSQL API ... TiDB TiKV TiKV MySQL Clients TiDB TiKV TiKV TiDB ... ... TiDB Cluster TiKV Cluster (Storage)

9.Basic concepts ● Component ○ TiDB,update and load schema information ○ TiKV,store schema information and DDL jobs ○ PD,notify schema version change ● TiDB Role ○ Owner ○ Nonowners ● TiDB Module ○ Handle DDL jobs: worker ■ States ■ Jobs ○ Load schema ■ Lease

10.Process introduction ● TiDB SQL Layer ● Process I: Parser AST Validator Plan Logical a. The MySQL Client sends a DDL request. SQL Builder Plan b. When a TiDB server receives a request, it will reach the TiDB SQL layer for execution. Physical MySQL Plan Clients Rows Executor TiDB Server Data TiKV TiKV TiKV TiKV TiKV TiKV TiKV Cluster

11.Process introduction ● Perform DDL jobs ● Process II: a. Encapsulates the statement into a DDL job, and stores it into the KV layer. b. DDL owner handle DDL jobs. c. Check if the DDL job is finished d. Returns the response to the MySQL Client.

12.Detailed steps ● State change process ○ Each DDL job requires multiple state changes. e.g. ■ create table: none -> public ■ add index: none -> delete only -> write only -> write reorg -> public ○ Update schema: ■ Update schema and a state ■ Update the latest schema version to PD ■ Wait for all TiDB servers to update to the latest schema information. ○ Loading schema: ■ Load the latest schema regularly ■ Load the latest schema when a TiDB receives a notification of the latest schema version update

13.Detailed steps ● Parallel flow ○ Currently, only the add index operation in TiDB takes a long time to execute, so an add index operation may block other DDL operations. ○ That is, between different tables, the general operation and the add index operation can be performed in parallel. ○ Details: ■ The type of job ■ Dependency on jobs

14.Detailed steps ● Campaign owner ○ path: /tidb/ddl/fg/owner ● Get corresponding jobs ● State change process ● Load schema

15.Part I - DDL operations

16.Add index ● Add index operation ○ State change process ○ The operation performed by the write reorg state is actually divided into two parts: ■ Read data from the corresponding column. ■ Backfill the data into the index. ○ The default priority is low

17.Add index ● Backfill the data into the index tidb_ddl_reorg_worker_cnt Scope: GLOBAL Default:16

18.tidb_ddl_reorg_batch-size = 128 Bentch Test add index Report 200W rows

19.Add index tidb_ddl_reorg_batch_size Scope: GLOBAL Default:1024

20.tidb_ddl_reorg_worker_cnt = 1 Bentch Test add index Report 200W rows

21.DDL operations ● Add column operation ○ Only needs to change 5 states throughout the process. ● Modify column operation ○ Does not support lossy changes. ● Drop index/table/database operation ○ Only needs to change several states corresponding to throughout the process. ○ After a normal schema change, ■ We store the information about the deleted data into the gc_delete_range table ■ Process them until the GC stage.

22.Part IV - Practicality

23.Practicality ● Compatibility: ○ Add Index ○ Add Column ○ Drop Column ○ Change/Modify Column ○ Table Partition

24.Practicality ● Variables ○ Configure variables ■ run-ddl = true # Run ddl worker on this tidb-server. ■ lease = "45s" # Schema lease duration, very dangerous to change only if you know what you do. ○ Server variables ■ tidb_ddl_reorg_worker_cnt (default: 16) ■ tidb_ddl_reorg_batch_size (default: 1024) ○ HTTP variables ■ ddl_slow_threshold (default: 300 ms)

25.Assessibility ● SQL: or ○ ADMIN SHOW DDL tidb> admin show ddl\G; *************************** 1. row *************************** SCHEMA_VER: 140 OWNER: 1a1c4174-0fcd-4ba0-add9-12d08c4077dc OWNER_ADDRESS: :4000 RUNNING_JOBS: ID:121, Type:add index, State:running, SchemaState:write reorganization, SchemaID:1, TableID:118, RowCount:77312, ArgLen:0, start time: 2018-12-05 16:26:10.652 +0800 CST, Err:<nil>, ErrCount:0, SnapshotVersion:404749908941733890 SELF_ID: 1a1c4174-0fcd-4ba0-add9-12d08c4077dc ○ ADMIN SHOW DDL JOBS ○ ADMIN SHOW DDL JOB QUERIES job_id [, job_id] …

26.Assessibility ● SQL ○ ADMIN CANCEL DDL JOBS 'job_id' [, 'job_id'] … ■ Some types are not supported ○ ADMIN RESTORE TABLE 'table_name' ○ ADMIN RESTORE TABLE BY JOB 'job_id'

27.Assessibility ● API: curl ○ TiDB info { ■ curl http://{TiDBIP}:10080/info/all "all_servers_info": { "30fdb2f3-d33c-4af0-9d15-75b128f8ae5b": { ■ curl http://{TiDBIP}:10080/info "ddl_id": "30fdb2f3-d33c-4af0-9d15-75b128f8ae5b", ○ Schema info "git_hash": "39ab60557822cba1d48e0ead6ebc8d619e10fe24", ■ curl http://{TiDBIP}:10080/schema "ip": "", ■ curl http://{TiDBIP}:10080/schema/{db} "lease": "45s", "listening_port": 4000, ■ curl http://{TiDBIP}:10080/schema/{db}/{table} "status_port": 10080, ■ curl http://{TiDBIP}:10080/schema?table_id={tableID} "version": "5.7.10-TiDB-v2.1.0-rc.3-324-g39ab60557-dirty" ○ DDL History } }, ■ curl http://{TiDBIP}:10080/ddl/history "is_all_server_version_consistent": true, # If it's false,return the details ■ curl http://{TiDBIP}:10080/ddl/history?limit={number} "owner_id": "30fdb2f3-d33c-4af0-9d15-75b128f8ae5b", "servers_num": 1 }

28.FAQ ● Time spent executing DDL under normal conditions ○ Add index operation ■ depends on the number of data rows in the table and the system load ■ empty table:3s ○ General DDL operation ■ most cases are less than 1s

29.FAQ ● In some cases, running DDL statements is slow ○ If you run multiple DDL statements together ○ After you start the cluster successfully, the first DDL operation ○ Stopping TiDB due to kill -9 command or power failure ○ If there is a communication problem between TiDB and PD

TiDB 是一款定位于在线事务处理/在线分析处理( HTAP: Hybrid Transactional/Analytical Processing)的融合型数据库产品,实现了一键水平伸缩,强一致性的多副本数据安全,分布式事务,实时 OLAP 等重要特性。