MongoDB HA, what can go wrong
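Redundancy and high availability are the foundation of every production deployment. With MongoDB, this is achieved by deploying a replica set. In this talk we explore how MongoDB replication works and what the components of a replica set are. Using examples of wrong deployment configurations, we highlight how to run a replica set correctly in production, whether on-premises or in the cloud.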
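-How MongoDB replication works
-Replica set components/deployment types
-Examples of wrong deployment configurations
-Hidden nodes, Arbiter nodes, Priority 0 nodes
-Availability zones and HA within a single region
-Monitoring replica set status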
1 .MongoDB HA, what can go wrong? Nov-7-2018
2 .About me
● Location: Skopje, Republic of Macedonia
● Education: MSc, Software Engineering
● Experience:
  ○ Lead Database Consultant (since 2016)
  ○ Database Consultant (2012 - 2016)
  ○ Web Developer, DBA (2007 - 2012)
● Certifications: C100DBA - MongoDB certified DBA (since 2016)
● Percona speaker since 2016
https://mk.linkedin.com/in/igorle
@igorle
© 2018 Pythian. Confidential
3 .Overview
• What is a replica set, how replication works
• Replication concept
• Replica set features, deployment architectures
• Hidden nodes, Arbiter nodes, Priority 0 nodes
• Production failures
• Monitoring replica set
• Q&A
4 .Replication
5 .Replica set
• Group of mongod processes that maintain the same data set
• Redundancy and high availability
• Increased read capacity (scaling reads)
• Automatic failover

# Members   # Nodes Required to Elect New Primary   Fault Tolerance
3           2                                       1
4           3                                       1
5           3                                       2
6           4                                       2
7           4                                       3
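The numbers in the table follow from the rule that an election needs a strict majority of voting members. A minimal sketch of that arithmetic (the function names `majority` and `faultTolerance` are illustrative, not MongoDB APIs):

```javascript
// Votes needed to elect a primary: a strict majority of voting members.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// Members that can be lost while a majority is still reachable.
function faultTolerance(votingMembers) {
  return votingMembers - majority(votingMembers);
}

// Reproduce the rows of the table for 3 to 7 members.
for (let n = 3; n <= 7; n++) {
  console.log(`${n} members: majority ${majority(n)}, fault tolerance ${faultTolerance(n)}`);
}
```

Note that going from 3 to 4 members, or from 5 to 6, raises the majority without raising the fault tolerance, which is why odd member counts are the usual choice.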
10 .Replication concept
1. Write operations go to the Primary node
2. All changes are recorded in the operations log (oplog)
3. Asynchronous replication to the Secondaries
4. Secondaries copy the Primary oplog
5. A Secondary can use another Secondary as its sync source*
*settings.chainingAllowed (true by default)
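Step 5 is controlled by `settings.chainingAllowed` in the replica set configuration. As a rough sketch, the helper below (the name `disableChaining` and the sample config are made up for illustration) returns a copy of a config document with chaining turned off, so that secondaries sync only from the primary:

```javascript
// Return a copy of a replica set config with chained replication disabled.
// The input object is left untouched.
function disableChaining(cfg) {
  return {
    ...cfg,
    settings: { ...(cfg.settings || {}), chainingAllowed: false },
  };
}

// Hypothetical config, shaped like the document rs.conf() returns.
const sampleCfg = {
  _id: "rs0",
  version: 1,
  members: [
    { _id: 0, host: "db1.example.net:27017" },
    { _id: 1, host: "db2.example.net:27017" },
    { _id: 2, host: "db3.example.net:27017" },
  ],
  settings: { chainingAllowed: true },
};

console.log(disableChaining(sampleCfg).settings);

// In the mongo shell, connected to the primary, the same change would be:
//   rs.reconfig(disableChaining(rs.conf()))
```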
11 .Replica set oplog
• Special capped collection that keeps a rolling record of all operations that modify the data stored in the databases
• Idempotent
• Default oplog size (for Unix and Windows systems):

Storage Engine   Default Oplog Size      Lower Bound   Upper Bound
In-memory        5% of physical memory   50MB          50GB
WiredTiger       5% of free disk space   990MB         50GB
MMAPv1           5% of free disk space   990MB         50GB
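The default-size table amounts to "take 5% and clamp it between the two bounds". A small sketch of that rule, assuming sizes are expressed in megabytes and that 50GB means 50 * 1024 MB (the function name and parameter names are illustrative):

```javascript
// Default oplog size in MB, following the table above.
// engine: "inMemory", "wiredTiger" or "mmapv1".
function defaultOplogSizeMB(engine, { freeDiskMB = 0, physicalMemoryMB = 0 }) {
  const inMemory = engine === "inMemory";
  // 5% of physical memory for in-memory, 5% of free disk space otherwise.
  const fivePercent = (inMemory ? physicalMemoryMB : freeDiskMB) / 20;
  const lowerBound = inMemory ? 50 : 990;
  const upperBound = 50 * 1024; // assumption: 50GB counted as 51200 MB
  return Math.min(upperBound, Math.max(lowerBound, fivePercent));
}

console.log(defaultOplogSizeMB("wiredTiger", { freeDiskMB: 100000 }));
console.log(defaultOplogSizeMB("wiredTiger", { freeDiskMB: 10000 }));
```

The second call shows the lower bound at work: 5% of 10000 MB is only 500 MB, so the 990 MB floor applies.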
12 .Configuration
13 .Configuration options
• Up to 50 members per replica set (at most 7 voting members)
• Arbiter node
• Priority 0 node
• Hidden node
• Delayed node
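The two limits on this slide can be checked against a member list before a reconfiguration is attempted. The sketch below is a plain helper, not a MongoDB API; it assumes member documents shaped like those in `rs.conf().members`, where a member votes unless it sets `votes: 0`:

```javascript
const MAX_MEMBERS = 50;
const MAX_VOTING_MEMBERS = 7;

// Return a list of human-readable problems (empty when the size is fine).
function checkReplicaSetSize(members) {
  const problems = [];
  // A member votes unless its config sets votes: 0 (the default is 1).
  const voting = members.filter((m) => m.votes !== 0).length;
  if (members.length > MAX_MEMBERS) {
    problems.push(`${members.length} members exceeds the limit of ${MAX_MEMBERS}`);
  }
  if (voting > MAX_VOTING_MEMBERS) {
    problems.push(`${voting} voting members exceeds the limit of ${MAX_VOTING_MEMBERS}`);
  }
  return problems;
}

// Hypothetical 9-member set: too many voters until two are made non-voting.
const nineVoters = Array.from({ length: 9 }, (_, i) => ({
  _id: i,
  host: `db${i}.example.net:27017`,
}));
const sevenVoters = nineVoters.map((m, i) =>
  i >= 7 ? { ...m, votes: 0, priority: 0 } : m // non-voting members need priority 0
);

console.log(checkReplicaSetSize(nineVoters));
console.log(checkReplicaSetSize(sevenVoters));
```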
14 .Arbiter node
• Does not hold copy of data
• Votes in elections
(Diagram: Arbiter, hidden : true)
15 .Priority 0 node
Priority - floating point (i.e. decimal) number between 0 and 1000
• Cannot become primary, cannot trigger election
• Visible to application (accepts reads; writes still go only to the Primary)
• Votes in elections
(Diagram: Secondary, priority : 0)
16 .Hidden node
• Not visible to application
• Never becomes primary, but can vote in elections
• Use cases
  ○ reporting
  ○ backups
(Diagram: Secondary, hidden : true, priority : 0)
17 .Delayed node
• Must be priority 0 member
• Should be hidden member (not mandatory)
• Mainly used for backups (historical snapshot of data)
• Recovery in case of human error
(Diagram: Secondary, slaveDelay : 3600, priority : 0, hidden : true)
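The three settings shown for the delayed secondary can be put together as one member document. This is a sketch only: the helper names and the host name are hypothetical, and the field names (`priority`, `hidden`, `slaveDelay`) are the ones used on the slide.

```javascript
// Build a member document for a delayed, hidden, priority 0 secondary.
function delayedMember(id, host, delaySeconds) {
  return { _id: id, host, priority: 0, hidden: true, slaveDelay: delaySeconds };
}

// Check a member document against the rules listed on the slide.
function checkDelayedMember(member) {
  const problems = [];
  if (member.slaveDelay > 0 && member.priority !== 0) {
    problems.push("a delayed member must have priority 0");
  }
  if (member.slaveDelay > 0 && member.hidden !== true) {
    problems.push("a delayed member should be hidden (recommended, not mandatory)");
  }
  return problems;
}

// One hour behind the primary, as in the diagram.
const delayed = delayedMember(3, "db4.example.net:27017", 3600);
console.log(delayed, checkDelayedMember(delayed));

// In the mongo shell, such a document could be passed to rs.add(...).
```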
18 .Failures
19 .Small oplog size
1. Primary/Secondary node down
   ○ Node failure
   ○ Planned maintenance
2. Automatic Failover
…… (several hours later)
3. The new Primary's oplog rolls over, overwriting entries the failed node has not applied yet
4. Failed Node needs resync
MongoDB >= 3.6: db.adminCommand({replSetResizeOplog: 1, size: 32000}) (size in megabytes)
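Whether a node can catch up after an outage depends on how many hours of writes the oplog holds. A back-of-the-envelope sketch, assuming a steady write rate (the function names and the 330 MB/hour figure are made up for illustration):

```javascript
// Hours of history the oplog can hold at a given write rate.
function oplogWindowHours(oplogSizeMB, oplogMBPerHour) {
  return oplogSizeMB / oplogMBPerHour;
}

// A node that was down longer than the window has to be fully resynced.
function needsResync(downtimeHours, windowHours) {
  return downtimeHours > windowHours;
}

const rateMBPerHour = 330; // assumed oplog growth under load
const smallWindow = oplogWindowHours(990, rateMBPerHour); // default lower bound
const resizedWindow = oplogWindowHours(32000, rateMBPerHour); // after the resize above

console.log(smallWindow, needsResync(5, smallWindow));
console.log(resizedWindow, needsResync(5, resizedWindow));
```

On a live replica set, `rs.printReplicationInfo()` in the mongo shell reports the actual time span currently covered by the oplog, which is a better input than an assumed rate.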
20 .Arbiter nodes
● Votes in election
● Does not hold copy of data
● If 2 nodes are down, no majority to elect new Primary
● Fault tolerance is still 1 node
● 4 data nodes + 1 Arbiter makes more sense
(Diagram: Heartbeat)
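To see why an arbiter does not always buy extra fault tolerance, it helps to count votes directly. The sketch below assumes the slide refers to a set of 3 data nodes plus 1 arbiter and compares it with 4 data nodes plus 1 arbiter; the helper and host names are hypothetical:

```javascript
// True when the voting members that are still up form a strict majority.
function hasMajority(members, downHosts) {
  const voters = members.filter((m) => m.votes !== 0); // arbiters vote too
  const up = voters.filter((m) => !downHosts.includes(m.host));
  return up.length > voters.length / 2;
}

const arbiter = { host: "arb1:27017", arbiterOnly: true };
const threePlusArbiter = [
  { host: "db1:27017" }, { host: "db2:27017" }, { host: "db3:27017" }, arbiter,
];
const fourPlusArbiter = [
  { host: "db1:27017" }, { host: "db2:27017" }, { host: "db3:27017" },
  { host: "db4:27017" }, arbiter,
];

// Two data nodes down: 2 of 4 votes is not a majority, 3 of 5 is.
console.log(hasMajority(threePlusArbiter, ["db1:27017", "db2:27017"]));
console.log(hasMajority(fourPlusArbiter, ["db1:27017", "db2:27017"]));
```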
21 .Priority 0 nodes
● Application driver sends writes to Primary
● Reads go to Primary by default
● Secondaries can serve reads
● Read preference
  ○ primary
  ○ primaryPreferred
  ○ secondary
  ○ secondaryPreferred
  ○ nearest
22 .Priority 0 nodes
• Primary node fails
• Replica set starts election for new Primary
• Zero nodes eligible for Primary
• Application cannot send writes
• Database is read only*
*depends on read preference setting
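The failure on this slide can be spotted ahead of time by listing which members could take over if the primary went away. A simplified sketch (the helper name and hosts are illustrative; a real election also requires a majority of voters and an up-to-date oplog):

```javascript
// Hosts that are up, hold data, and have a priority above 0.
function primaryCandidates(members, downHosts) {
  return members
    .filter((m) => !downHosts.includes(m.host))
    .filter((m) => !m.arbiterOnly)
    .filter((m) => (m.priority === undefined ? 1 : m.priority) > 0) // default priority is 1
    .map((m) => m.host);
}

// One electable node and two priority 0 secondaries, as in the scenario above.
const riskySet = [
  { host: "db1:27017", priority: 1 },
  { host: "db2:27017", priority: 0 },
  { host: "db3:27017", priority: 0 },
];

// With db1 down nobody can be elected, so the set stays without a primary.
console.log(primaryCandidates(riskySet, ["db1:27017"]));
```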
23 .Hidden nodes
● Application driver sends writes to Primary
● Reads go to Primary by default
● Secondaries cannot serve reads
● Read preference
  ○ primary
24 .Hidden nodes
• Primary node fails
• Replica set starts election for new Primary
• Zero nodes eligible for Primary (priority:0)
• Application cannot send writes/reads
• Downtime
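The difference between the priority 0 and the hidden scenario comes down to which members the driver is allowed to read from. The function below is a deliberately simplified model of the five read preference modes, not the driver's actual server selection (it ignores latency, tags and staleness); `readTargets` and the host names are hypothetical:

```javascript
// members: reachable members; primaryHost: current primary, or null if none.
function readTargets(members, primaryHost, mode) {
  // Hidden members and arbiters are never offered to the application.
  const secondaries = members
    .filter((m) => !m.hidden && !m.arbiterOnly && m.host !== primaryHost)
    .map((m) => m.host);
  const primary = primaryHost ? [primaryHost] : [];
  switch (mode) {
    case "primary": return primary;
    case "primaryPreferred": return primary.length ? primary : secondaries;
    case "secondary": return secondaries;
    case "secondaryPreferred": return secondaries.length ? secondaries : primary;
    case "nearest": return [...primary, ...secondaries];
    default: throw new Error(`unknown read preference: ${mode}`);
  }
}

// Survivors after the primary failed and nobody could be elected.
const priorityZeroSurvivors = [
  { host: "db2:27017", priority: 0 },
  { host: "db3:27017", priority: 0 },
];
const hiddenSurvivors = priorityZeroSurvivors.map((m) => ({ ...m, hidden: true }));

console.log(readTargets(priorityZeroSurvivors, null, "secondaryPreferred"));
console.log(readTargets(hiddenSurvivors, null, "secondaryPreferred"));
```

With visible priority 0 members, reads can continue under a non-primary read preference; with hidden members there is nothing left to read from.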
25 .Hardware
• Primary node fails
• Secondary elected as new Primary
• Working set does not fit in memory
• Performance degradation
• Application stalls
(Diagram labels: 64GB RAM, 16 CPU; 32GB RAM, 8 CPU; 32GB RAM, 8 CPU)
26 .Hardware
• Dataset grows
• No disk space on Secondary
• mongod process fails
• 2-node replica set
• Zero tolerance for failures
(Diagram labels: Disk: 300GB; Disk: 300GB; Disk: 200GB)
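Both hardware failures come from members that are smaller than the primary they may have to replace. A simple inventory check can flag them; the fields `ramGB`, `cpus` and `diskGB` are hypothetical inventory data and are not part of the replica set configuration, and the sample values only loosely combine the figures from the two diagrams:

```javascript
// Hosts with less RAM, CPU or disk than the current primary.
function underProvisioned(nodes, primaryHost) {
  const primary = nodes.find((n) => n.host === primaryHost);
  return nodes
    .filter((n) => n.host !== primaryHost)
    .filter((n) => n.ramGB < primary.ramGB || n.cpus < primary.cpus || n.diskGB < primary.diskGB)
    .map((n) => n.host);
}

const inventory = [
  { host: "db1:27017", ramGB: 64, cpus: 16, diskGB: 300 },
  { host: "db2:27017", ramGB: 32, cpus: 8, diskGB: 300 },
  { host: "db3:27017", ramGB: 64, cpus: 16, diskGB: 200 },
];

console.log(underProvisioned(inventory, "db1:27017"));
```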
27 .Cloud deployment
• All replica set members deployed in single Availability Zone
• Availability Zone #1 goes down
• Downtime
(Diagram: AWS, Region #1, Availability Zone #1)
28 .Cloud deployment
● Availability Zone #1 goes down
  ○ New Primary elected from AZ #2
● Availability Zone #2 goes down
  ○ Database is read only
(Diagram: AWS, Region #1, Availability Zone #1, Availability Zone #2)
29 .Cloud deployment
• Region #1 goes down
• Downtime
(Diagram: AWS, Region #1, AZ #1, AZ #2, AZ #3)
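The three cloud scenarios can be checked with one question: if any single location disappears, do the remaining voters still form a majority? The sketch below assumes each member is annotated with hypothetical `az` and `region` fields (these are inventory labels, not replica set config fields):

```javascript
// For each distinct location, report whether a majority of voters survives its loss.
function survivesLossOfEach(members, key) {
  const voters = members.filter((m) => m.votes !== 0);
  const result = {};
  for (const location of new Set(voters.map((m) => m[key]))) {
    const remaining = voters.filter((m) => m[key] !== location).length;
    result[location] = remaining > voters.length / 2;
  }
  return result;
}

// One member in AZ #1 and two in AZ #2, all in one region.
const twoZones = [
  { host: "db1:27017", az: "az-1", region: "region-1" },
  { host: "db2:27017", az: "az-2", region: "region-1" },
  { host: "db3:27017", az: "az-2", region: "region-1" },
];
// One member per zone, still a single region.
const threeZones = [
  { host: "db1:27017", az: "az-1", region: "region-1" },
  { host: "db2:27017", az: "az-2", region: "region-1" },
  { host: "db3:27017", az: "az-3", region: "region-1" },
];

console.log(survivesLossOfEach(twoZones, "az"));
console.log(survivesLossOfEach(threeZones, "az"));
console.log(survivesLossOfEach(threeZones, "region"));
```

Spreading members over three zones survives any single zone outage, but as long as every member sits in one region, a region outage still means downtime.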