Sharding in MongoDB 101 - Geo-Partitioning

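Tags can be used to choose where data is stored, based on location or on any other parameter the application uses. With tags, we can guarantee that users from the United States write their data only to US data centers, while users from Europe write only to European data centers. This is the second part of the MongoDB Sharding 101 series.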

1. Sharding in MongoDB 101 - Geo-Partitioning Percona Webinar July 26th, 2018, at 12:30 PM PDT (UTC-7) / 3:30 PM EDT (UTC-4). 1 © 2018 Percona - COMPANY CONFIDENTIAL

2. Me - @adamotonete

3. Agenda
● Starting a sharded cluster from scratch
● Geo-partitioning data
● Config database overview
● Range key
● Hashed key
● Q&A

4. Starting a sharded cluster from scratch
● Environment used in this webinar (one single machine)
  - MongoDB v3.6.x
  - Mac OS X
  - 8 GB RAM
  - Dual-core processor

5. Starting a sharded cluster from scratch
● Starting the shards
For testing purposes we will start three single-member replica sets: shard001, shard002, and shard003.
    ./mongod --dbpath rs1Pdata --shardsvr --logpath rs1Pdata/log --fork --port 29001 --replSet shard001 --bind_ip <yourip>,localhost
    ./mongod --dbpath rs2Pdata --shardsvr --logpath rs2Pdata/log --fork --port 29002 --replSet shard002 --bind_ip <yourip>,localhost
    ./mongod --dbpath rs3Pdata --shardsvr --logpath rs3Pdata/log --fork --port 29003 --replSet shard003 --bind_ip <yourip>,localhost
● Initiating the replica sets
    ./mongo --port 29001 --eval "printjson(rs.initiate({_id : 'shard001', members :[{_id : 0, host : '<yourip>:29001'}]}))"
    ./mongo --port 29002 --eval "printjson(rs.initiate({_id : 'shard002', members :[{_id : 0, host : '<yourip>:29002'}]}))"
    ./mongo --port 29003 --eval "printjson(rs.initiate({_id : 'shard003', members :[{_id : 0, host : '<yourip>:29003'}]}))"

6. Starting a sharded cluster from scratch
● Starting the config server
    ./mongod --dbpath cfgdata --configsvr --logpath cfgdata/log --fork --port 27019 --replSet cfgRS --bind_ip <yourip>,localhost
● Initiating the config server replica set
    ./mongo --port 27019 --eval "printjson(rs.initiate({_id : 'cfgRS', members :[{_id : 0, host : '<yourip>:27019'}]}))"

7. Starting a sharded cluster from scratch
● Starting the mongos
    ./mongos --configdb "cfgRS/<yourip>:27019" --logpath mongoslog/log --bind_ip <yourip>,localhost --fork

8. Starting a sharded cluster from scratch
● Adding the shards to the cluster (from a mongo shell connected to the mongos)
    sh.addShard('shard001/<yourip>:29001')
    sh.addShard('shard002/<yourip>:29002')
    sh.addShard('shard003/<yourip>:29003')

9. Starting a sharded cluster from scratch
● Configuring tags
    sh.addShardTag("shard001", "NA")
    sh.addShardTag("shard002", "NA")
    sh.addShardTag("shard003", "EU")

10. Starting a sharded cluster from scratch
● Inserting dummy data
    use percona
    db.events.insert({name :'percona live', location : 'NA', city : 'Santa_Clara'})
    db.events.insert({name :'percona live europe', location : 'EU', city : 'Frankfurt'})
    db.events.ensureIndex({location : 1, _id : 1})
    sh.enableSharding('percona')
    sh.shardCollection('percona.events', {location : 1, _id : 1})

11. Live setup

12. Our environment
[Diagram: a config server and a mongos in front of three shards (Shard1, Shard2, Shard3), labeled with the tags NA and EU.]

13. Configuring Zones
● Zone ranges are defined per collection. We must explicitly specify BOTH the collection and the zone where we want the data to be saved.
    sh.addTagRange( "percona.events", { "location" : "NA", "_id" : MinKey }, { "location" : "NA", "_id" : MaxKey }, "NA" )
    sh.addTagRange( "percona.events", { "location" : "EU", "_id" : MinKey }, { "location" : "EU", "_id" : MaxKey }, "EU" )
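
To make the two ranges on slide 13 more concrete, here is a minimal sketch in plain JavaScript (it does not talk to MongoDB) of how a compound shard key could be matched against zone ranges. The names MIN, MAX, compareValue, compareKey, and zoneFor are hypothetical and exist only for this illustration. The comparison is a simplification of BSON ordering, and the sketch assumes the lower bound of a range is inclusive and the upper bound is exclusive.

```javascript
// Sentinels standing in for BSON MinKey / MaxKey (hypothetical names).
const MIN = Symbol("MinKey");
const MAX = Symbol("MaxKey");

// Compare two scalar values; MIN sorts below everything, MAX above everything.
// Simplified: real BSON comparison also orders values by type.
function compareValue(a, b) {
  if (a === b) return 0;
  if (a === MIN || b === MAX) return -1;
  if (a === MAX || b === MIN) return 1;
  return a < b ? -1 : a > b ? 1 : 0;
}

// Compare compound shard keys field by field, in shard key order.
function compareKey(a, b, fields) {
  for (const f of fields) {
    const c = compareValue(a[f], b[f]);
    if (c !== 0) return c;
  }
  return 0;
}

// Same shard key and ranges as the slides: {location: 1, _id: 1}.
const FIELDS = ["location", "_id"];
const zoneRanges = [
  { min: { location: "NA", _id: MIN }, max: { location: "NA", _id: MAX }, zone: "NA" },
  { min: { location: "EU", _id: MIN }, max: { location: "EU", _id: MAX }, zone: "EU" },
];

// Return the zone whose range contains the key, or null if no range matches.
function zoneFor(key) {
  const hit = zoneRanges.find(
    (r) => compareKey(r.min, key, FIELDS) <= 0 && compareKey(key, r.max, FIELDS) < 0
  );
  return hit ? hit.zone : null;
}

console.log(zoneFor({ location: "NA", _id: 42 })); // NA
console.log(zoneFor({ location: "EU", _id: 7 }));  // EU
console.log(zoneFor({ location: "BR", _id: 1 }));  // null
```

A key that falls outside every range (the "BR" document here) is not tied to any zone, so in a real cluster it could end up on any shard.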

14. Live demo

15. Playing around
● sh.status()
● Config database
● Why isn't my database partitioned automatically?
● What is the default shard?
● Chunk size: may I change it?
● Turning the balancer off and on
● Changing the shard key

16. Data distribution
● Comparing data distribution according to the shard key
  - How are the chunks distributed according to the shard key?

17. Data distribution
● Range value

18. Data distribution
● Hashed key

19. Data distribution
● Comparing data distribution according to the shard key
  - How are the chunks distributed according to the shard key?
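
The difference between a range key and a hashed key can be illustrated with a small simulation in plain JavaScript. This is only a sketch under stated assumptions: the chunk boundaries, the key values, and the helper names (chunkForRange, mix32, chunkForHash, countByChunk) are made up for the example, and mix32 is a generic 32-bit integer mixer used as a stand-in, not MongoDB's actual hashing function.

```javascript
// Range sharding: a chunk is chosen by comparing the key with split points.
// With split points [1000, 2000, 3000] there are 4 chunks (index 0..3).
function chunkForRange(value, splitPoints) {
  let i = 0;
  while (i < splitPoints.length && value >= splitPoints[i]) i++;
  return i;
}

// Stand-in hash (MurmurHash3 32-bit finalizer). Assumption: any well-mixing
// hash shows the same effect; this is not the function MongoDB uses.
function mix32(n) {
  let h = n >>> 0;
  h ^= h >>> 16;
  h = Math.imul(h, 0x85ebca6b);
  h ^= h >>> 13;
  h = Math.imul(h, 0xc2b2ae35);
  h ^= h >>> 16;
  return h >>> 0;
}

// Hashed sharding: the 32-bit hash space is cut into 4 equal chunks.
function chunkForHash(value) {
  return Math.floor(mix32(value) / 2 ** 30);
}

// Count how many keys land in each chunk.
function countByChunk(keys, chunkOf, numChunks) {
  const counts = new Array(numChunks).fill(0);
  for (const k of keys) counts[chunkOf(k)]++;
  return counts;
}

const NUM_CHUNKS = 4;
const splitPoints = [1000, 2000, 3000];

// 1000 new documents with monotonically increasing keys (4000..4999).
const newKeys = Array.from({ length: 1000 }, (_, i) => 4000 + i);

const rangeCounts = countByChunk(newKeys, (k) => chunkForRange(k, splitPoints), NUM_CHUNKS);
const hashedCounts = countByChunk(newKeys, chunkForHash, NUM_CHUNKS);

console.log("range :", rangeCounts);  // every new key lands in the last chunk
console.log("hashed:", hashedCounts); // keys are spread over several chunks

// A range query on consecutive keys touches one chunk with a range key,
// but the matching documents are scattered when the key is hashed.
const queryKeys = newKeys.slice(0, 100);
const rangeTouched = new Set(queryKeys.map((k) => chunkForRange(k, splitPoints)));
const hashedTouched = new Set(queryKeys.map(chunkForHash));
console.log("chunks touched (range / hashed):", rangeTouched.size, hashedTouched.size);
```

With an ever-increasing range key all new writes go to the same chunk, while the hashed key spreads them out at the cost of scattering range queries.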

20. Live demo - Using different shard keys

21. sh.status()
● Describes the details of the cluster

22. sh.status() - config database
● Most of the data printed by sh.status() comes from the config database.
● The config database holds all the cluster metadata.

23. Default shard
● Databases are not sharded automatically: we must specify a collection and a shard key to tell the database how to distribute the data.
● If we don't specify a shard key, the data lives on only one shard, which is called the default shard (the primary shard in MongoDB's terminology).

24. Chunk size
● The default chunk size is 64 MB. It is possible to change this value, and, as expected, the setting is stored in the config database:
    use config
    db.settings.save( { _id:"chunksize", value: <sizeInMB> } )

25. Balancer
● The balancer is the process responsible for moving chunks between shards. It is possible to enable or disable the balancer, or even to configure the time of day when the balancer is allowed to run.
● While running, the balancer consumes resources and may slow the entire cluster down.
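
As an illustration of the "time of day" setting, the sketch below checks whether a given time falls inside a balancing window. The window object mirrors the shape of the activeWindow field of the balancer document in config.settings (start and stop as "HH:MM" strings), while the helper names toMinutes and inWindow and the sample times are hypothetical. Treating the stop time as exclusive is a choice made for this sketch, not something stated in the slides.

```javascript
// Convert "HH:MM" into minutes since midnight.
function toMinutes(hhmm) {
  const [h, m] = hhmm.split(":").map(Number);
  return h * 60 + m;
}

// Return true if `now` ("HH:MM") is inside the window.
// Windows such as 23:00 -> 06:00 wrap around midnight.
function inWindow(now, window) {
  const t = toMinutes(now);
  const start = toMinutes(window.start);
  const stop = toMinutes(window.stop);
  if (start <= stop) {
    return t >= start && t < stop;   // same-day window
  }
  return t >= start || t < stop;     // window crossing midnight
}

// Example window: balance only at night (values are illustrative).
const activeWindow = { start: "23:00", stop: "06:00" };

console.log(inWindow("23:30", activeWindow)); // true
console.log(inWindow("05:59", activeWindow)); // true
console.log(inWindow("12:00", activeWindow)); // false
```

Restricting the balancer to a quiet period like this is one way to keep chunk migrations from competing with peak application traffic.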

26. Changing the shard key
● It is not possible to change a shard key online. To do so, it is necessary to dump the entire collection and then re-import it into a collection pre-configured with the new shard key.

27. Wrapping up
● Your data distribution depends on the shard key.
● The config database holds all the chunk metadata.
● To segment data by location, the shard key must include the field that identifies the location. Tags change the way the data is distributed.
● Hashed keys are good for data distribution, but a range query on a hashed key is sent to all the shards.
● Choose a good shard key. It is not possible to change it online.

28. Questions?

29. Percona Live Europe: Connect. Accelerate. Innovate.
November 5-7th, 2018, Radisson Blu Hotel, Frankfurt, Germany
Call for Papers Open! Get Your Tickets for Percona Live Europe!
● We're delighted to be in Frankfurt this year! A vibrant city, in a central location with many direct flights makes it easy for you to get here!
● Percona Live Europe 2018 includes a new business track that covers the best ideas for how open source databases and database technologies can address and solve business issues such as application time-to-market, resource costs, OPEX and CAPEX expenses, etc.
● MySQL, MongoDB, PostgreSQL, SQL, NewSQL, NoSQL
● Security, Open Source Databases, Serverless, Cloud or On Premise
● High Availability, Scalability
● Business Goals, What the Future Holds
Submit Your Proposal by August 10th! Super Saver Tickets Available Until August 19th! Prices Increase on the 20th!
www.percona.com/live/e18/