- 快召唤伙伴们来围观吧
- 微博 QQ QQ空间 贴吧
- 文档嵌入链接
- 复制
- 微信扫一扫分享
- 已成功复制到剪贴板
MySQL Indexing - Best Practices
MySQL索引-最佳实践
展开查看详情
1 . Percona MySQL Indexing; University Best Practices Ульяновск, россия 10 November 2016 | Peter Zaitsev | Percona
2 . You’ve Made a Great Choice ! Understanding indexing is crucial both for Developers and DBAs Poor index choices are responsible for large portion of production problems Indexing is not a rocket science 2
3 .MySQL Indexing: Agenda Understanding Indexing Setting up best indexes for your applications Working around common MySQL limitations 3
4 .Indexing in the Nutshell • Speed up access in the database What are • Help to enforce constraints (UNIQUE, FOREIGN KEY) indexes • Queries can be ran without any for ? indexes • But it can take a really long time 4
5 . Types of Indexes you might heard about BTREE Indexes • Majority of indexes you deal in MySQL is this type RTREE Indexes • MyISAM only, for GIS HASH Indexes • MEMORY, NDB FULLTEXT Indexes • MyISAM, Innodb starting 5.6 5
6 . Family of BTREE like Indexes A lot of different • Share same properties in what operations they can speed up implementations • Memory vs Disk is life changer B+ Trees are typically • Data stored in leaf nodes used for Disk storage TokuDB Fractal Trees • But physically a lot different are logically similar 6
7 .B+Tree Example Branch/Root Node Less than 3 Data Pointers Leaf Node 7
8 . Indexes in MyISAM vs Innodb In MyISAM data pointers point to physical offset in the In Innodb data file • All indexes are essentially • PRIMARY KEY (Explicit equivalent or Implicit) - stores data in the leaf pages of the index, not pointer • Secondary Indexes – store primary key as data pointer 8
9 . What Operations can BTREE Index do ? Find all rows with KEY=5 (point lookup) Find all rows with KEY>5 (open range) Find all rows with 5<KEY<10 (closed range) NOT find all rows with last digit of the KEY is Zero • This can’t be defined as a “range” operation 9
10 .String Indexes There is no • Sort order is defined for strings (collation) difference… • “AAAA” < “AAAB” really Prefix LIKE is • LIKE “ABC%” means • “ABC[LOWEST]”<KEY<“ABC[HIGHEST]” a special type • LIKE “%ABC” can’t be optimized by use of the of Range index 10
11 .Multiple Column Indexes Sort Order is defined, comparing • KEY(col1,col2,col3) leading column, • (1,2,3) < (1,3,1) then second etc It is still one • not a separate BTREE BTREE Index index for each level 11
12 . Overhead of The Indexing Indexes are costly; Do not add more than you need • In most cases extending index is better than adding new one Writes - Updating indexes is often major cost of database writes Reads - Wasted space on disk and in memory; additional overhead during query optimization 12
13 .Indexing Innodb Tables • Pick PRIMARY KEY what suites you best Data is clustered by • For comments – (POST_ID,COMMENT_ID) can be good PRIMARY KEY storing all comments for single post close Primary Key together • Alternatively “pack” to single BIGINT PRIMARY KEY is • KEY (A) is really KEY (A,ID) internally implicitly appended • Useful for sorting, Covering Index. to all indexes 13
14 .How MySQL Uses Indexes Data Lookups Sorting Avoiding reading “data” Special Optimizations 14
15 .Using Indexes for Data Lookups SELECT * FROM EMPLOYEES WHERE • The classical use of index on (LAST_NAME) LAST_NAME=“Smith” • SELECT * FROM EMPLOYEES WHERE Can use Multiple column LAST_NAME=“Smith” AND indexes DEPT=“Accounting” • Will use index on (DEPT,LAST_NAME) 15
16 . It Gets Tricky With Multiple Columns Index (A,B,C) • - order of columns matters Will use Index for lookup • • A>5 A=5 AND B>6 (all listed keyparts) • • A=5 AND B=6 AND C=7 A=5 AND B IN (2,3) AND C>5 Will NOT use Index • B>5 – Leading column is not referenced • B=6 AND C=7 - Leading column is not referenced Will use Part of the index • A>5 AND B=2 - range on first column; only use this key part • A=5 AND B>6 AND C=2 - range on second column, use 2 parts 16
17 .The First Rule of MySQL Optimizer MySQL will stop using key parts in multi part index as soon as it met the real range (<,>, BETWEEN), it however is able to continue using key parts further to the right if IN(…) range is used 17
18 .Using Index for Sorting SELECT * FROM PLAYERS ORDER • Will use index on SCORE column BY SCORE DESC • Without index MySQL will do “filesort” (external sort) which is very expensive LIMIT 10 Often Combined • SELECT * FROM PLAYERS WHERE COUNTRY=“US” ORDER BY SCORE with using Index for DESC LIMIT 10 lookup • Best served by Index on (COUNTRY,SCORE) 18
19 .Multi Column indexes for efficient sorting It becomes even more restricted! KEY(A,B) Will use Index for Sorting • ORDER BY A - sorting by leading column • A=5 ORDER BY B - EQ filtering by 1st and sorting by 2nd • ORDER BY A DESC, B DESC - Sorting by 2 columns in same order • A>5 ORDER BY A - Range on the column, sorting on the same Will NOT use Index for Sorting • ORDER BY B - Sorting by second column in the index • A>5 ORDER BY B – Range on first column, sorting by second • A IN(1,2) ORDER BY B - In-Range on first column • ORDER BY A ASC, B DESC - Sorting in the different order 19
20 .MySQL Using Index for Sorting Rules You can’t sort in different order by 2 columns You can only have Equality comparison (=) for columns which are not part of ORDER BY • Not even IN() works in this case 20
21 .Avoiding Reading The data “Covering Index” • Applies to index use for specific query, not type of index. Reading Index ONLY and not accessing the “data” • Index is typically smaller than data SELECT STATUS FROM ORDERS WHERE • KEY(CUSTOMER_ID,STATUS) CUSTOMER_ID=123 Access is a lot more sequential • Access through data pointers is often quite “random” 21
22 .Min/Max Optimizations Index help MIN()/MAX() aggregate functions • But only these SELECT MAX(ID) FROM TBL; SELECT MAX(SALARY) FROM EMPLOYEE GROUP BY DEPT_ID • Will benefit from (DEPT_ID,SALARY) index • “Using index for group-by” 22
23 . Indexes and Joins • SELECT * FROM POSTS,COMMENTS WHERE AUTHOR=“Peter” AND MySQL Performs Joins COMMENTS.POST_ID=POSTS.ID as “Nested Loops” • Scan table POSTS finding all posts which have Peter as an Author • For every such post go to COMMENTS table to fetch all comments Index is only needed on table which is being • The index on POSTS.ID is not needed for this query performance looked up Very important to have • Re-Design JOIN queries which can’t be well indexed all JOINs Indexed 23
24 . Using Multiple Indexes for the table MySQL Can use More • “Index Merge” than one index SELECT * FROM • Can often use Indexes on (A) and (B) separately TBL WHERE A=5 • Index on (A,B) is much better AND B=6 SELECT * FROM • 2 separate indexes is as good as it gets TBL WHERE A=5 OR • Index (A,B) can’t be used for this query B=6 24
25 .Prefix Indexes You can • ALTER TABLE TITLE ADD build Index KEY(TITLE(20)); • Needed to index BLOB/TEXT columns on the • Can be significantly smaller leftmost • Can’t be used as covering index prefix of the • Choosing prefix length becomes the question column 25
26 . What is new with MySQL 5.6 ? • Most of them will make your queries better Many Optimizer automatically improvements • join_buffer_size variable has whole new meaning • Values if 32MB+ can make sense Focus on Index • Most important one: ICP (Index Condition Design Practices for Pushdown) this presentation 26
27 .Understanding ICP Push where clause • Think NAME LIKE “%ill%” (will “Conditions” for Storage engine to filter not convert to range) “Much more flexible • Plus filtering done on the engine level – covering Index” efficient • All or none. All is resolved through the Before MySQL 5.5 index or “row” is read if within range 27
28 .ICP Examples SELECT A … WHERE B=2 AND C LIKE “%ill%’ • MySQL 5.5 and below - KEY (B) – Traditional. Using index for range only - KEY (B,C,A) - Covering. All involved columns included • MySQL 5.6 - KEY(B,C) • Range access by B; Filter clause on C only read full row if match More cases • SELECT * …WHERE A=5 and C=6 ; KEY (A,B,C) - Will scan all index entries with A=5 not all rows 28
29 . How MySQL Picks which Index to Use ? Performs dynamic • The constants in query texts matter a picking for every query execution lot Estimates number of rows it needs to access • by doing “dive” in the table for given index Uses “Cardinality” • This is what ANALYZE TABLE statistics if impossible updates 29