- 快召唤伙伴们来围观吧
- 微博 QQ QQ空间 贴吧
- 文档嵌入链接
- 复制
- 微信扫一扫分享
- 已成功复制到剪贴板
16/07 - SASI, Cassandra on the full text search ride by submmit1
展开查看详情
1 .SASI, Cassandra on the full text search ride DuyHai DOAN – Apache Cassandra evangelist
2 . 1 SASI introduction 2 SASI cluster-wide 3 SASI local read/write path 4 Query planner 5 Some benchmarks 6 Take away © DataStax, All Rights Reserved. 2
3 .SASI introduction
4 .What is SASI ? • SSTable-Attached Secondary Index à new 2nd index impl that follows SSTable life-cycle • Objective: provide more performant & capable 2nd index © DataStax, All Rights Reserved. 4
5 .Who created it ? Open-source contribution by an engineers team © DataStax, All Rights Reserved. 5
6 .Why is it better than native 2nd index ? • follow SSTable life-cycle (flush, compaction, rebuild …) à more optimized • new data-strutures • range query (<, ≤, >, ≥) possible • full text search options © DataStax, All Rights Reserved. 6
7 . Demo © DataStax, All Rights Reserved. 7
8 .SASI cluster-wide
9 . Distributed index On cluster level, SASI works exactly like native 2nd index B C UK user87 user176 … user987 A D UK user1 user102 … user493 US user54 user483 … user938 UK user17 user409 … user787 H E G F © DataStax, All Rights Reserved. 9
10 .Distributed search algorithm B C A D 1st round Concurrency factor = 1 H E coordinator G F © DataStax, All Rights Reserved. 10
11 .Distributed search algorithm B C A D Not enough results ? H E coordinator G F © DataStax, All Rights Reserved. 11
12 .Distributed search algorithm B C 2nd round Concurrency factor = 2 A D H E coordinator G F © DataStax, All Rights Reserved. 12
13 .Distributed search algorithm B C A D Still not enough results ? H E coordinator G F © DataStax, All Rights Reserved. 13
14 .Distributed search algorithm B C A D 3rd round Concurrency factor = 4 H E coordinator G F © DataStax, All Rights Reserved. 14
15 .Concurrency factor formula • more details at: http://www.planetcassandra.org/blog/cassandra-native-secondary-index- deep-dive/ © DataStax, All Rights Reserved. 15
16 .Concurrency factor formula But right now … So initial concurrency factor always = max(1, negative number) = 1 for 1st query round with SASI ... © DataStax, All Rights Reserved. 16
17 .Caveat 1: non restrictive filters B C Hit all nodes A D eventually L H E coordinator G F © DataStax, All Rights Reserved. 17
18 .Caveat 1 solution : always use LIMIT B C SELECT * FROM … A D WHERE ... LIMIT 1000 H E coordinator G F © DataStax, All Rights Reserved. 18
19 .Caveat 2: 1-to-1 index (user_email) B C A D Not found WHERE user_email = ‘xxx' H E coordinator G F © DataStax, All Rights Reserved. 19
20 .Caveat 2: 1-to-1 index (user_email) B C A D WHERE user_email = ‘xxx' H E Still no result coordinator G F © DataStax, All Rights Reserved. 20
21 .Caveat 2: 1-to-1 index (user_email) B C A D WHERE user_email = ‘xxx' At best 1 user found At worst 0 user found H E coordinator G F © DataStax, All Rights Reserved. 21
22 .Caveat 2 solution: materialized views For 1-to-1 index/relationship, use materialized views instead CREATE MATERIALIZED VIEW user_by_email AS SELECT * FROM users WHERE user_id IS NOT NULL and user_email IS NOT NULL PRIMARY KEY (user_email, user_id) But range queries ( <, >, ≤, ≥) not possible … © DataStax, All Rights Reserved. 22
23 .Caveat 3: fetch all rows for analytics use-case B C A D Client H E coordinator G F © DataStax, All Rights Reserved. 23
24 .Caveat 3 solution: use co-located Spark B C Local index query A D Local index filtering in Cassandra Aggregation in Spark H E G F © DataStax, All Rights Reserved. 24
25 .SASI local read/write path
26 .Local write path Index files are built • on memtable flush • on compaction flush To avoid OOM, index files are split into chunk of • 1Gb for memtable flush • max_compaction_flush_memory_in_mb for compaction flush © DataStax, All Rights Reserved. 26
27 .Local write path data structures Index mode, data type Data structure Usage PREFIX, text Guava ConcurrentRadixTree name LIKE 'John%' name LIKE ’%John%' CONTAINS, text Guava ConcurrentSuffixTree name LIKE ’%ny’ age = 20 PREFIX, other JDK ConcurrentSkipListSet age >= 20 AND age <= 30 age = 20 SPARSE, other JDK ConcurrentSkipListSet age >= 20 AND age <= 30 suitable for 1-to-N index with N ≤ 5 © DataStax, All Rights Reserved. 27
28 .OnDiskIndex files SStable1 user_id4 FR user_id1 US user_id5 FR OnDiskIndex1 FR US SStable2 B+Tree-like user_id3 UK user_id2 DE data structures OnDiskIndex2 UK DE © DataStax, All Rights Reserved. 28
29 .Local read path • first, optimize query using Query Planer (see later) • then load chunks (4k) of index files from disk into memory • perform binary search to find the indexed value(s) • retrieve the corresponding partition keys and push them into the Partition Key Cache à Yes, currently SASI only keep partition key(s) so on wide partition it’s not very optimized ... © DataStax, All Rights Reserved. 29