WiredTiger B-Tree and WiredTiger In-Memory

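Join Sveta Smirnova, Principal Technical Services Engineer, as she presents "WiredTiger B-Tree vs WiredTiger In-Memory".
Many people know that the WiredTiger engine supports B-tree (the default) and can enable LSM, but few talk about the in-memory version of this engine. In this session we will discuss which features you gain, and which you lose, when you move to in-memory. We will also discuss some use cases where the in-memory option makes sense. This is a feature overview rather than a look at internals, so it should help everyone learn more about this version of WiredTiger.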


1. WiredTiger In-Memory vs WiredTiger B-Tree ∙ October 5, 2016, Mövenpick Hotel, Amsterdam ∙ Sveta Smirnova

2. Table of Contents ∙ What is Percona Memory Engine for MongoDB? ∙ Typical use cases ∙ Advanced Memory Engine

3. What is Percona Memory Engine for MongoDB?

4. Extremely fast In-Memory storage ∙ Up to 1000 times faster for OLTP workloads ∙ 10 times faster for read-only workloads ∙ Stable throughput ∙ No checkpointing ∙ No jitter

5. Based on WiredTiger ∙ Document-level locking ∙ B-Tree ∙ Practically WiredTiger, but without disk access

6. WiredTiger without storage ∙ Doesn’t store data on disk ∙ Except a small amount of statistics ∙ You can control when to log statistics with the option --inMemoryStatisticsLogDelaySecs ∙ Still must specify --dbpath ∙ Data does not persist between restarts
    sveta@Thinkie:~/mongo_tests$ ls -lh single/
    total 40K
    drwxrwxr-x 2 sveta sveta 4,0K Eyl 29 15:00 diagnostic.data
    -rw-r--r-- 1 sveta sveta    6 Eyl 29 14:58 mongod.lock
    -rw-rw-r-- 1 sveta sveta   93 Eyl 29 14:55 storage.bson

7. How to enable Memory Engine? ∙ --storageEngine=inMemory ∙ Can be the only engine on a MongoDB server ∙ MongoDB restriction, applicable to all engines ∙ Heterogeneous replication and sharding setups are supported
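
The options named so far can be put together as a small sketch. The helper `buildMongodArgs` and the example path are hypothetical and only illustrate how the flags from the slides combine; the flags themselves are the ones shown above.

```javascript
// Hypothetical helper: assembles the mongod arguments named on the slides.
function buildMongodArgs({ dbpath, statsLogDelaySecs }) {
  if (!dbpath) {
    // The slides note that --dbpath is still required for the in-memory engine.
    throw new Error("dbpath is required even for the in-memory engine");
  }
  const args = ["--storageEngine=inMemory", `--dbpath=${dbpath}`];
  if (statsLogDelaySecs !== undefined) {
    args.push(`--inMemoryStatisticsLogDelaySecs=${statsLogDelaySecs}`);
  }
  return args;
}

// Example values are placeholders, not recommendations.
const mongodArgs = buildMongodArgs({ dbpath: "/tmp/single", statsLogDelaySecs: 60 });
console.log(["mongod", ...mongodArgs].join(" "));
```

The printed line is what one would type on the command line; nothing here starts a server.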

8. How to control memory usage ∙ Engine can use up to --inMemorySizeGB ∙ If data exceeds this amount ∙ WT_CACHE_FULL error is returned for all kinds of operations that cause user data size to grow: INSERT, CREATE, UPDATE ∙ Reads are not affected
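
An application may want to tell a full cache apart from other write failures. The following is a minimal sketch, assuming the error message carries the text "WT_CACHE_FULL"; the helpers `isCacheFullError` and `tryWrite` are hypothetical, and the failing write is simulated rather than sent to a server.

```javascript
// Assumption: the server's error message contains "WT_CACHE_FULL"
// (possibly with spaces instead of underscores); adjust the pattern if yours differs.
function isCacheFullError(err) {
  const message = String((err && (err.errmsg || err.message)) || "");
  return /WT[_ ]CACHE[_ ]FULL/.test(message);
}

// Hypothetical wrapper: writeFn is any function that performs a write
// and throws on failure.
function tryWrite(writeFn) {
  try {
    writeFn();
    return { ok: true };
  } catch (err) {
    if (isCacheFullError(err)) {
      // The engine is out of room: the caller could delete old data or back off.
      return { ok: false, reason: "cache-full" };
    }
    throw err; // Unrelated errors are not swallowed.
  }
}

// Simulated failure standing in for an insert against a full engine.
const result = tryWrite(() => {
  throw new Error("WT_CACHE_FULL: simulated error for this sketch");
});
console.log(result);
```

Since reads are not affected, only write paths would need this kind of handling.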

9. Open Source ∙ 100% Open Source ∙ Code available on GitHub ∙ Free for all Percona users and customers

10. Typical use cases for Percona Memory Engine

11. Application cache ∙ Session management ∙ Store active sessions in memory ∙ Users receive answers almost immediately ∙ Reduce application response time dramatically ∙ Various temporary collections ∙ Everything you used to store in memcached

12. Transient Runtime State ∙ Application runtime data which does not require on-disk storage ∙ Intermediary results of calculations ∙ User-specific options ∙ Your idea

13. Sophisticated data manipulation ∙ Thousand-line aggregations ∙ Temporary collections to store intermediary data ∙ Complicated queries

14. Real-Time Analytics ∙ Large aggregations might be slow ∙ Especially if they use many collections ∙ Often this is unavoidable ∙ To calculate the number of distinct values you need to read the whole index ∙ A fast dedicated server is a great solution

15. Multi-tier object sharing ∙ Data sharing between multi-tier or multi-language applications ∙ Articles ∙ Pictures ∙ Contact information ∙ Other content ∙ English labels ∙ Russian labels

16. Application Testing ∙ Are you tired of waiting for the data needed by an application test to load? ∙ Does any change in test data cause a delay? ∙ With Memory engine you can reduce turnaround time for automated application tests ∙ And still use the same syntax

17. Advanced Percona Memory Engine

18. Best of both worlds ∙ Are you amazed by the speed of the Memory engine? ∙ But still need data to persist between restarts? ∙ You can combine both Memory and WiredTiger in a Replica Set or Sharded Cluster

19. Hidden WiredTiger, storing changes in Replica Set ∙ Set up 2 or more Memory replicas which can be Primary ∙ Let WiredTiger persist data on disk ∙ In the rare case that all Memory replicas crash at the same time you will lose a few transactions ∙ The number of lost transactions depends on the latency between the In-Memory Primary replica and the WiredTiger replica
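
The dependency on latency can be made concrete with a back-of-the-envelope estimate. This is a rough sketch under a simplifying assumption that is not from the slides: writes arrive at a steady rate, and everything written during the replication lag is lost if all Memory replicas crash together. The function name and the example numbers are hypothetical.

```javascript
// Rough estimate only: assumes a steady write rate and a constant replication lag.
function estimateLostWrites(writesPerSecond, replicationLagMs) {
  if (writesPerSecond < 0 || replicationLagMs < 0) {
    throw new RangeError("rate and lag must be non-negative");
  }
  // Writes acknowledged by the Primary but not yet applied on the WiredTiger member.
  return Math.ceil((writesPerSecond * replicationLagMs) / 1000);
}

// Placeholder numbers: 2000 writes per second with 50 ms of lag.
console.log(estimateLostWrites(2000, 50));
```

Keeping the WiredTiger member close to the Primary on the network shrinks this window.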

20. Hidden WiredTiger, storing changes in Replica Set ∙ Diagram: Memory, Memory, WiredTiger

21. Hidden WiredTiger to store on disk: example setup
    rs.initiate(
      {
        "_id" : "rs",
        "members" : [
          {"_id" : 0, "host" : "inMemory1", "priority" : 1},
          {"_id" : 1, "host" : "inMemory2", "priority" : 1},
          {"_id" : 2, "host" : "WiredTiger", "priority" : 0, "hidden" : true}
        ]
      }
    )
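
For setups with a different number of Memory replicas, the same document can be generated. The following is a sketch only; `buildHiddenDiskConfig` is a hypothetical helper, and the host names are the ones from the slide.

```javascript
// Builds a config document shaped like the one on the slide.
function buildHiddenDiskConfig(setName, memoryHosts, diskHost) {
  // Memory members share the same priority, so any of them can become Primary.
  const members = memoryHosts.map((host, i) => ({ _id: i, host, priority: 1 }));
  // Hidden members must have priority 0, so this node can never become Primary.
  members.push({ _id: memoryHosts.length, host: diskHost, priority: 0, hidden: true });
  return { _id: setName, members };
}

const config = buildHiddenDiskConfig("rs", ["inMemory1", "inMemory2"], "WiredTiger");
console.log(JSON.stringify(config, null, 2));
// In the mongo shell, the same document could be passed to rs.initiate(config).
```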

22. WiredTiger as Primary in Replica Set ∙ Make WiredTiger the Primary ∙ Move all reads to read-only Memory replicas ∙ Writes will be slow
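
One way to sketch this layout is shown below. It rests on assumptions that are not from the slides: Memory members get priority 0 so they are never elected Primary, and clients send reads to secondaries through the `readPreference` option of the connection string. Both helper names are hypothetical.

```javascript
// Assumption: priority 0 keeps the Memory members from ever being elected Primary.
function buildDiskPrimaryConfig(setName, diskHost, memoryHosts) {
  const members = [{ _id: 0, host: diskHost, priority: 1 }];
  memoryHosts.forEach((host, i) => {
    members.push({ _id: i + 1, host, priority: 0 });
  });
  return { _id: setName, members };
}

// Builds a connection string that asks drivers to read from secondaries only.
function buildReadUri(config) {
  const hosts = config.members.map((m) => m.host).join(",");
  return `mongodb://${hosts}/?replicaSet=${config._id}&readPreference=secondary`;
}

const diskPrimary = buildDiskPrimaryConfig("rs", "WiredTiger", ["inMemory1", "inMemory2"]);
console.log(buildReadUri(diskPrimary));
```

With this configuration there is no failover target if the WiredTiger member goes down, which is the trade-off for keeping the Memory members read-only.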

23. WiredTiger as Primary in Replica Set ∙ Diagram: WiredTiger, Memory, Memory

24. Scaling beyond the RAM of a single server ∙ Create a Sharded Cluster using Memory nodes only ∙ Split data between nodes ∙ Create copies of data to prevent data loss

25. Scaling beyond the RAM ∙ Diagram: Shard 1, Shard 2

26. Scaling beyond the RAM: add redundancy ∙ Diagram: Shard 1 (R1, R2, R3), Shard 2 (R4, R5, R6)
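
The sizing behind such a layout is simple arithmetic. The sketch below uses a hypothetical helper, `planMemoryCluster`, and assumes the whole data set must fit in the memory configured per node (for example via --inMemorySizeGB), ignoring index and working-space overhead.

```javascript
// Simplified sizing: ignores indexes and any headroom the engine needs.
function planMemoryCluster(dataGB, perNodeGB, replicasPerShard) {
  if (perNodeGB <= 0 || replicasPerShard < 1) {
    throw new RangeError("perNodeGB must be positive and replicasPerShard at least 1");
  }
  // Each shard holds at most perNodeGB of data.
  const shards = Math.max(1, Math.ceil(dataGB / perNodeGB));
  // Every shard is a replica set with the same number of members.
  return { shards, nodes: shards * replicasPerShard };
}

// Placeholder numbers: 100 GB of data, 64 GB per node, 3 replicas per shard.
console.log(planMemoryCluster(100, 64, 3));
```

With these placeholder numbers the plan comes out as two shards of three replicas each, the same shape as the diagram.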

27. Memory and WiredTiger in Sharded Cluster ∙ You can use both engines in a Sharded Cluster ∙ Split data ∙ Session data on Memory nodes ∙ Persistent data on WiredTiger node(s) ∙ Duplicate Memory shards to avoid losing data

28. Sharded Cluster: Memory and Replica Sets ∙ You can have sharded nodes which use Memory engine ∙ Make them part of a Replica Set ∙ Let a hidden WiredTiger member persist data on disk

29. Example: blog application ∙ User posts, which change rarely, are stored on disk ∙ Session data is stored using Memory engine ∙ Active comments (last 24 hours) and actively accessed posts are cached on a Memory node
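
The placement rules for the blog example could be sketched as application-side routing. Everything in this block is hypothetical: the item shape (`kind`, `createdAt`, `activelyAccessed`), the store labels, and the assumption that cached comments and posts are also kept on disk.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Returns the stores an item should be written to: "disk" for the WiredTiger
// node, "memory" for the Memory node.
function chooseStores(item, now = Date.now()) {
  if (item.kind === "session") {
    return ["memory"]; // Session data lives only in the Memory engine.
  }
  if (item.kind === "comment") {
    // Comments from the last 24 hours are also cached in memory.
    return now - item.createdAt <= DAY_MS ? ["disk", "memory"] : ["disk"];
  }
  if (item.kind === "post") {
    // Posts are persistent; actively accessed ones are also cached.
    return item.activelyAccessed ? ["disk", "memory"] : ["disk"];
  }
  return ["disk"]; // Default: unknown data is treated as persistent.
}

const now = 1500000000000; // Fixed timestamp so the example is reproducible.
console.log(chooseStores({ kind: "comment", createdAt: now - 60000 }, now));
```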