16/07 - SASI, Cassandra on the full text search ride by submmit1

1 SASI introduction 2 SASI cluster-wide 3 SASI local read/write path 4 Query planner 5 Some benchmarks 6 Take away
展开查看详情

1.SASI, Cassandra on the full text search ride DuyHai DOAN – Apache Cassandra evangelist

2. 1 SASI introduction 2 SASI cluster-wide 3 SASI local read/write path 4 Query planner 5 Some benchmarks 6 Take away © DataStax, All Rights Reserved. 2

3.SASI introduction

4.What is SASI ? •  SSTable-Attached Secondary Index à new 2nd index impl that follows SSTable life-cycle •  Objective: provide more performant & capable 2nd index © DataStax, All Rights Reserved. 4

5.Who created it ? Open-source contribution by an engineers team © DataStax, All Rights Reserved. 5

6.Why is it better than native 2nd index ? •  follow SSTable life-cycle (flush, compaction, rebuild …) à more optimized •  new data-strutures •  range query (<, ≤, >, ≥) possible •  full text search options © DataStax, All Rights Reserved. 6

7. Demo © DataStax, All Rights Reserved. 7

8.SASI cluster-wide

9. Distributed index On cluster level, SASI works exactly like native 2nd index B C UK user87 user176 … user987 A D UK user1 user102 … user493 US user54 user483 … user938 UK user17 user409 … user787 H E G F © DataStax, All Rights Reserved. 9

10.Distributed search algorithm B C A D 1st round Concurrency factor = 1 H E coordinator G F © DataStax, All Rights Reserved. 10

11.Distributed search algorithm B C A D Not enough results ? H E coordinator G F © DataStax, All Rights Reserved. 11

12.Distributed search algorithm B C 2nd round Concurrency factor = 2 A D H E coordinator G F © DataStax, All Rights Reserved. 12

13.Distributed search algorithm B C A D Still not enough results ? H E coordinator G F © DataStax, All Rights Reserved. 13

14.Distributed search algorithm B C A D 3rd round Concurrency factor = 4 H E coordinator G F © DataStax, All Rights Reserved. 14

15.Concurrency factor formula •  more details at: http://www.planetcassandra.org/blog/cassandra-native-secondary-index- deep-dive/ © DataStax, All Rights Reserved. 15

16.Concurrency factor formula But right now … So initial concurrency factor always = max(1, negative number) = 1 for 1st query round with SASI ... © DataStax, All Rights Reserved. 16

17.Caveat 1: non restrictive filters B C Hit all nodes A D eventually L H E coordinator G F © DataStax, All Rights Reserved. 17

18.Caveat 1 solution : always use LIMIT B C SELECT * FROM … A D WHERE ... LIMIT 1000 H E coordinator G F © DataStax, All Rights Reserved. 18

19.Caveat 2: 1-to-1 index (user_email) B C A D Not found WHERE user_email = ‘xxx' H E coordinator G F © DataStax, All Rights Reserved. 19

20.Caveat 2: 1-to-1 index (user_email) B C A D WHERE user_email = ‘xxx' H E Still no result coordinator G F © DataStax, All Rights Reserved. 20

21.Caveat 2: 1-to-1 index (user_email) B C A D WHERE user_email = ‘xxx' At best 1 user found At worst 0 user found H E coordinator G F © DataStax, All Rights Reserved. 21

22.Caveat 2 solution: materialized views For 1-to-1 index/relationship, use materialized views instead CREATE MATERIALIZED VIEW user_by_email AS SELECT * FROM users WHERE user_id IS NOT NULL and user_email IS NOT NULL PRIMARY KEY (user_email, user_id) But range queries ( <, >, ≤, ≥) not possible … © DataStax, All Rights Reserved. 22

23.Caveat 3: fetch all rows for analytics use-case B C A D Client H E coordinator G F © DataStax, All Rights Reserved. 23

24.Caveat 3 solution: use co-located Spark B C Local index query A D Local index filtering in Cassandra Aggregation in Spark H E G F © DataStax, All Rights Reserved. 24

25.SASI local read/write path

26.Local write path Index files are built •  on memtable flush •  on compaction flush To avoid OOM, index files are split into chunk of •  1Gb for memtable flush •  max_compaction_flush_memory_in_mb for compaction flush © DataStax, All Rights Reserved. 26

27.Local write path data structures Index mode, data type Data structure Usage PREFIX, text Guava ConcurrentRadixTree name LIKE 'John%' name LIKE ’%John%' CONTAINS, text Guava ConcurrentSuffixTree name LIKE ’%ny’ age = 20 PREFIX, other JDK ConcurrentSkipListSet age >= 20 AND age <= 30 age = 20 SPARSE, other JDK ConcurrentSkipListSet age >= 20 AND age <= 30 suitable for 1-to-N index with N ≤ 5 © DataStax, All Rights Reserved. 27

28.OnDiskIndex files SStable1 user_id4 FR user_id1 US user_id5 FR OnDiskIndex1 FR US SStable2 B+Tree-like user_id3 UK user_id2 DE data structures OnDiskIndex2 UK DE © DataStax, All Rights Reserved. 28

29.Local read path •  first, optimize query using Query Planer (see later) •  then load chunks (4k) of index files from disk into memory •  perform binary search to find the indexed value(s) •  retrieve the corresponding partition keys and push them into the Partition Key Cache à Yes, currently SASI only keep partition key(s) so on wide partition it’s not very optimized ... © DataStax, All Rights Reserved. 29