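Managing Data and Operation Distribution in MongoDB
In a sharded MongoDB cluster, scale and data distribution are defined by your shard key. Even when the right shard key is chosen, ongoing maintenance and review are still needed to keep performance optimal.
This presentation reviews shard key selection and how the distribution of chunks can create scenarios in which you may need to manually move, split, or merge chunks in a sharded cluster. Scenarios that require these actions can exist with both optimal and sub-optimal shard keys. Example use cases provide tips on selecting a shard key, detecting problems, the reasons you may encounter these scenarios, and specific steps you can take to correct them.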
1 . Managing Data and Operation Distribution In MongoDB
Antonios Giannopoulos and Jason Terpko
DBAs @ Rackspace/ObjectRocket
linkedin.com/in/antonis/ | linkedin.com/in/jterpko/
2 .Introduction
Antonios Giannopoulos
Jason Terpko
www.objectrocket.com
3 .Overview
• Sharded Cluster
• Shard Keys Selection
• Shard Key Operations
• Chunk Management
• Data Distribution
• Orphaned documents
• Q&A
4 .Sharded Cluster
• Cluster Metadata
• Data Layer
• Query Routing
• Cluster Communication
5 .Cluster Metadata
6 .Data Layer
[Diagram: shards s1, s2, …, sN]
7 .Replication
Data redundancy relies on an idempotent log of operations.
8 .Query Routing
[Diagram: shards s1, s2, …, sN]
9 .Sharded Cluster
[Diagram: shards s1, s2, …, sN]
10 .Cluster Communication
How do independent components become a cluster and communicate?
● Replica Set
  ○ Replica Set Monitor
  ○ Replica Set Configuration
  ○ Network Interface ASIO Replication / Network Interface ASIO Shard Registry
  ○ Misc: replSetName, keyFile, clusterRole
● Mongos Configuration
  ○ configDB Parameter
  ○ Network Interface ASIO Shard Registry
  ○ Replica Set Monitor
  ○ Task Executor
● Post Add Shard
  ○ Collection config.shards
  ○ Replica Set Monitor
  ○ Task Executor Pool
  ○ config.system.sessions
11 .Primary Shard
[Diagram: Database <foo>; shards s1, s2, …, sN]
12 .Collection UUID
With featureCompatibilityVersion 3.6, all collections are assigned an immutable UUID.
• Cluster Metadata: config.collections
• Data Layer (mongod): config.collections
13 .Collection UUID
Important
• UUIDs for a namespace must match
• Use 4.0+ tools for a sharded cluster restore
14 .Shard Key - Selection
• Profiling
• Identify shard key candidates
• Pick a shard key
• Challenges
15 .Sharding
• Shards are Physical Partitions
• Chunks are Logical Partitions
[Diagram: Database <foo>, Collection <foo>; chunks spread across shards s1, s2, …, sN]
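To make the physical/logical split concrete, here is a minimal sketch of a chunk map and a lookup against it. The map layout, the single numeric shard key, and the names `exampleChunkMap`, `findChunk`, and `chunksPerShard` are all assumptions made for illustration; the real metadata lives in config.chunks and uses MinKey/MaxKey rather than infinities.

```javascript
// Hypothetical, simplified chunk map for one numeric shard key field.
// -Infinity / Infinity stand in for MongoDB's MinKey / MaxKey.
const exampleChunkMap = [
  { min: -Infinity, max: 100, shard: "s1" },
  { min: 100, max: 500, shard: "s2" },
  { min: 500, max: Infinity, shard: "s1" },
];

// Chunk ranges are min-inclusive and max-exclusive, so each key value
// belongs to exactly one chunk (a logical partition).
function findChunk(chunks, keyValue) {
  return chunks.find((c) => keyValue >= c.min && keyValue < c.max) || null;
}

// Count how many chunks each shard (a physical partition) currently owns.
function chunksPerShard(chunks) {
  const counts = {};
  for (const c of chunks) {
    counts[c.shard] = (counts[c.shard] || 0) + 1;
  }
  return counts;
}
```

A key value of 100 falls in the second chunk and is therefore served by s2, even though s1 owns two of the three chunks.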
16 . What is a Chunk?
The mission of the shard key is to create chunks: the logical partitions your collection is divided into, which determine how data is distributed across the cluster.
● Maximum size is defined in config.settings
  ○ Default 64MB
● Before 3.4.11: hardcoded maximum document count of 250,000
● Version 3.4.11 and higher: maximum document count is 1.3 times the configured chunk size divided by the average document size
● Chunk map is stored in config.chunks
  ○ Continuous range from MinKey to MaxKey
● Chunk map is cached at both the mongos and mongod
  ○ Query Routing
  ○ Sharding Filter
● Chunks distributed by the Balancer
  ○ Using moveChunk
  ○ Up to maxSize
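The document-count limits above can be worked through with a small calculation. This is only a sketch of the arithmetic as described on the slide; the function name `maxDocsPerChunk` and its parameters are hypothetical, and rounding down is an assumption.

```javascript
// Hardcoded document-count limit that applied before 3.4.11.
const LEGACY_MAX_DOCS = 250000;

// Approximate document-count limit for a chunk.
// chunkSizeMB:     configured chunk size (default 64)
// avgDocSizeBytes: average document size in the collection
// legacy:          true to model versions before 3.4.11
function maxDocsPerChunk(chunkSizeMB, avgDocSizeBytes, legacy = false) {
  if (legacy) {
    return LEGACY_MAX_DOCS;
  }
  const chunkSizeBytes = chunkSizeMB * 1024 * 1024;
  // 1.3 x (configured chunk size / average document size), rounded down.
  return Math.floor(1.3 * (chunkSizeBytes / avgDocSizeBytes));
}
```

With the default 64MB chunk size and 1 KiB documents, `maxDocsPerChunk(64, 1024)` gives 85196, well below the older fixed limit of 250,000.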
17 .Shard Key Selection: Profiling
• Helps identify your workload
• Requires Level 2: db.setProfilingLevel(2)
• May need to increase profiler size
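A common way to enlarge the profiler collection is to turn profiling off, drop system.profile, recreate it as a larger capped collection, and turn profiling back on. The sketch below wraps those shell calls in a function that takes the shell's `db` handle; the function name and the size argument are assumptions for illustration, and dropping system.profile discards any profile data already collected.

```javascript
// Recreate system.profile with a larger capped size, then enable level 2.
// `db` is the mongo shell database handle for the database being profiled.
function resizeAndEnableProfiler(db, profilerSizeBytes) {
  // Profiling must be disabled before system.profile can be dropped.
  db.setProfilingLevel(0);
  db.getCollection("system.profile").drop();
  // system.profile is a capped collection; recreate it with the new size.
  db.createCollection("system.profile", {
    capped: true,
    size: profilerSizeBytes,
  });
  // Level 2 records all operations.
  db.setProfilingLevel(2);
}
```

In the shell this could be called as `resizeAndEnableProfiler(db, 512 * 1024 * 1024)`. Level 2 profiling adds overhead, so it is usually enabled only for a sampling window.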
18 .Shard Key Selection: Profiling → Candidates
• Export statement types with frequency
• Export statement patterns with frequency
• Produces a list of shard key candidates
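One way to turn profiler output into statement patterns is to group operations by namespace, operation type, and the set of fields in the filter. The sketch below assumes a simplified input where each document has `ns`, `op`, and `filter` fields; real system.profile documents nest the filter differently depending on the server version, so the extraction step would need adjusting.

```javascript
// Group simplified profiler documents into statement patterns and count them.
// Each input document is assumed to look like:
//   { ns: "foo.col", op: "query", filter: { customerId: 42 } }
function statementPatterns(profileDocs) {
  const counts = new Map();
  for (const doc of profileDocs) {
    // Only field names matter for the pattern, not the queried values.
    const fields = Object.keys(doc.filter || {}).sort().join(",");
    const pattern = `${doc.ns} ${doc.op} {${fields}}`;
    counts.set(pattern, (counts.get(pattern) || 0) + 1);
  }
  // Most frequent patterns first: fields that appear in them are candidates.
  return [...counts.entries()]
    .map(([pattern, count]) => ({ pattern, count }))
    .sort((a, b) => b.count - a.count);
}
```

Fields that show up in the most frequent patterns are natural shard key candidates, which the constraints on the following slides then narrow down.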
19 .Shard Key Selection: Profiling → Candidates → Built-in Constraints
• The shard key and its value are immutable
• Must not contain NULLs
• Update and findAndModify operations must contain the shard key
• Unique constraints must be maintained by a prefix of the shard key
• A shard key cannot contain special index types (e.g. text)
• Potentially reduces the list of candidates
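Two of these constraints are easy to check mechanically against a candidate key. The sketch below is illustrative only: the helper names are hypothetical, index specs are assumed to have the `{ name, key, unique }` shape returned by `getIndexes()`, and only top-level fields are considered.

```javascript
// True when every shard key field leads the index key, in the same order.
function isShardKeyPrefixOf(shardKey, indexKey) {
  const sk = Object.keys(shardKey);
  const ik = Object.keys(indexKey);
  return sk.length <= ik.length && sk.every((field, i) => ik[i] === field);
}

// Names of unique indexes that the candidate shard key is not a prefix of.
// The default _id index is skipped, since it is handled separately by MongoDB.
function conflictingUniqueIndexes(shardKey, indexes) {
  return indexes
    .filter((ix) => ix.unique && ix.name !== "_id_")
    .filter((ix) => !isShardKeyPrefixOf(shardKey, ix.key))
    .map((ix) => ix.name);
}

// True when a document is missing a shard key field or holds null in one.
function hasNullShardKeyField(doc, shardKey) {
  return Object.keys(shardKey).some(
    (field) => doc[field] === undefined || doc[field] === null
  );
}
```

A candidate that returns a non-empty list from `conflictingUniqueIndexes`, or for which sampled documents fail `hasNullShardKeyField`, would drop off the candidate list or need a schema change first.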
20 .Shard Key Selection: Profiling → Candidates → Built-in Constraints → Schema Constraints
• Cardinality
• Monotonically increasing values
• Data Hotspots
• Operational Hotspots
• Targeted vs Scatter-gather operations
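The targeted versus scatter-gather split can be estimated per candidate from the filters seen during profiling. The sketch below uses a deliberately simplified rule, stated here as an assumption: a filter counts as targeted when it constrains the leading field of the shard key. It ignores hashed keys, range predicates, and operators such as $or, and the function names are hypothetical.

```javascript
// Simplified rule: a filter is "targeted" when it includes the leading
// shard key field, so the router can narrow the set of shards to contact.
function isTargeted(shardKey, filter) {
  const leadingField = Object.keys(shardKey)[0];
  return Object.prototype.hasOwnProperty.call(filter, leadingField);
}

// Summarize a workload (a list of query filters) for one candidate key.
function routingSummary(shardKey, filters) {
  let targeted = 0;
  for (const filter of filters) {
    if (isTargeted(shardKey, filter)) {
      targeted += 1;
    }
  }
  return { targeted, scatterGather: filters.length - targeted };
}
```

Running `routingSummary` for each candidate over the same sampled filters gives a rough way to compare how much of the workload each key would turn into scatter-gather operations.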
21 .Shard Key Selection: Profiling → Candidates → Built-in Constraints → Schema Constraints → Future Constraints
• Poor cardinality
• Growth and data hotspots
• Data pruning & TTL indexes
• Schema changes
• Try to simulate the dataset in 3, 6, and 12 months
22 .Shard key - Operations
• Apply a shard key
• Revert a shard key
23 .Apply a shard key
• Create the associated index
• Make sure the balancer is stopped:
  sh.stopBalancer()
  sh.getBalancerState()
• Apply the shard key:
  sh.shardCollection("foo.col", {field1: 1, ..., fieldN: 1})
• Allow a burn period
• Start the balancer
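The steps up to the burn period can be sketched as a single helper that takes the shell's `db` and `sh` objects. The function name is hypothetical, and the sketch assumes sharding has already been enabled on the database; starting the balancer is left as a separate manual step so the burn period can be observed first.

```javascript
// Create the index, confirm the balancer is off, then shard the collection.
// `db` and `sh` are the mongo shell helpers; `namespace` is "<db>.<collection>".
function applyShardKey(db, sh, namespace, key) {
  const dot = namespace.indexOf(".");
  const dbName = namespace.slice(0, dot);
  const collName = namespace.slice(dot + 1);

  // 1. Create the index that backs the shard key.
  db.getSiblingDB(dbName).getCollection(collName).createIndex(key);

  // 2. Stop the balancer and verify that it is really disabled.
  sh.stopBalancer();
  if (sh.getBalancerState()) {
    throw new Error("balancer is still enabled; aborting");
  }

  // 3. Apply the shard key. The balancer stays off for the burn period.
  return sh.shardCollection(namespace, key);
}
```

After the burn period, `sh.startBalancer()` resumes chunk distribution. Keeping the balancer off until then means all chunks are still on the primary shard, which makes reverting the key much simpler.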
24 .Sharding
[Diagram: timeline sh.shardCollection("foo.foo", <key>) → Burn Period → sh.startBalancer(); Database <foo>, Collection <foo>; chunks spread across shards s1, s2, …, sN]
25 .Revert a shard key
Two categories:
  o Affects functionality (exceptions, inconsistent data, …)
  o Affects performance (operational hotspots, …)
Dump/Restore
  o Requires downtime: for writes and, in some cases, reads
  o Time-consuming operation
  o You may restore to a sharded or an unsharded collection
  o Better to pre-create indexes
  o Same or new cluster can be used
  o Streaming dump/restore is an option
  o In special cases, such as time-series data, it can be fast
26 .Revert a shard key
Dual writes
  o Mongo to Mongo connector or Change streams
  o No downtime
  o Requires extra capacity
  o May increase latency
  o Same or new cluster can be used
  o Adds complexity
Alter the config database
  o Requires downtime, but minimal
  o Easy during burn period
  o Time-consuming if chunks are distributed
  o Has overhead during chunk moves
27 .Revert a shard key
Process:
1) Disable the balancer: sh.stopBalancer()
2) Move all chunks to the primary shard (skip during burn period)
3) Stop one secondary of the config server replica set (for rollback)
4) Stop all mongos and all shards
5) On the config server replica set primary, execute:
   db.getSiblingDB('config').chunks.remove({ns: <collection name>})
   db.getSiblingDB('config').collections.remove({_id: <collection name>})
6) Start all mongos and shards
7) Start the secondary of the config server replica set
Rollback:
• After step 6, stop all mongos and shards
• Stop the running members of the config server replica set and wipe their data directories
• Start all config server replica set members
• Start all mongos and shards
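Step 5 can be wrapped in a helper that defaults to a dry run and only reports how many metadata documents would be removed. This is a sketch under several assumptions: the function name and the `dryRun` flag are hypothetical, it uses the legacy shell's `count` and `remove` helpers, and it matches chunks on the `ns` field as the slide does, which fits the 3.6/4.0-era config schema discussed here.

```javascript
// Remove the sharding metadata for one namespace from the config database.
// `configDb` is db.getSiblingDB("config") on the config server primary.
// With dryRun = true (the default) nothing is modified.
function removeShardingMetadata(configDb, namespace, dryRun = true) {
  const chunks = configDb.getCollection("chunks");
  const collections = configDb.getCollection("collections");

  // Report what would be affected before touching anything.
  const plan = {
    chunks: chunks.count({ ns: namespace }),
    collections: collections.count({ _id: namespace }),
    applied: false,
  };

  if (!dryRun) {
    chunks.remove({ ns: namespace });
    collections.remove({ _id: namespace });
    plan.applied = true;
  }
  return plan;
}
```

Calling it first with the default dry run, and only then with `dryRun` set to `false`, keeps the destructive part explicit. The stopped config server secondary from step 3 remains the rollback path if the result is not what was expected.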
28 .Revert a shard key
Online option requested on SERVER-4000 - May be supported in 4.2
Further reading - Morphus: Supporting Online Reconfigurations in Sharded NoSQL Systems
http://dprg.cs.uiuc.edu/docs/ICAC2015/Conference.pdf
Special use cases:
Extend a shard key by adding field(s) ({a:1} to {a:1,b:1})
  o Possible (and easier) if b’s max and min (per a) are predefined
  o For example, {year:month} to be extended to {year:month:day}
Reduce the elements of a shard key ({a:1, b:1} to {a:1})
  o Possible (and easier) if all distinct “a” values are in the same shard
  o There should be no chunks with the same “a.min” (these add complexity)
29 . Revert a shard key
• Always perform a dry run
• Balancer/Autosplit must be disabled
• You must take downtime during the change
*There might be a more optimal code path, but the above one worked like a charm