使用开源解决方案处理PostgreSQL中的高可用性和自动故障转移

PostgreSQL是一个完全由社区驱动的高级开源数据库。持续的开发和性能改进,同时保持对Oracle兼容性的次要关注,使PostgreSQL获得了巨大的市场渗透力。当在生产环境中部署数据库服务器时,我们通常希望实现几个9的可用性。在PostgreSQL中这是可能的吗?为了在PostgreSQL中实现高可用性(HA)和自动故障转移,您可以组合和实现哪些工具?在这种故障转移过程中,我们如何避免数据丢失?我们将在本次谈话中讨论这些问题,然后再讨论一些其他问题。
我们将讨论**
1。对于PostgreSQL中可用的每种复制类型,HA的实现是如何不同的。
2。如何将haproxy与etcd结合起来,再加上raft算法的详细说明和使用paturi的HA。
3。如何将repmgr与keepalived结合起来实现HA。
4。为Kubernetes上的PostgreSQL构建的HA解决方案。
5。帮助您在AWS和其他基于云的环境中实现自动故障转移的工具和解决方案是什么?
6。如何避免故障转移过程中的巨大数据丢失。

展开查看详情

1.High Availability and Automatic Failover in PostgreSQL Using Open Source Solutions Avinash Vallarapu Percona

2.What is High Availability ? ● High Availability in our routine database life is: ○ An always-on mechanism ○ Avoid data loss during disasters ○ Higher uptime for business ○ An immediate action upon a detection of failure (but not minutes or days) ○ Avoiding a single point of failure ○ Decrease or minimize the unscheduled downtime ○ Seamless database failovers for application and business ○ Ability to perform both manual and automatic failover ○ Faster point-in-time-recovery (PITR)

3.PostgreSQL Replication • Streaming Replication in PostgreSQL: ○ WAL Segments are streamed to Standby/Slave and replayed on Slave. ○ Not a Statement/Row/Mixed Replication like MySQL. ○ This can be referred to as a byte-by-byte or Storage Level Replication ○ Slaves are always Open for Read-Only SQLs but not Writes ○ You cannot have different Schema or data in a Master and a Slave in Streaming Replication. ○ Allows Cascading Replication ○ Supports both Synchronous and Asynchronous Replication ○ Supports a Delayed Standby for faster PITR

4.PostgreSQL Replication • Logical Replication and Logical Decoding for PostgreSQL 10 and above ○ Allows for Replication of selected Tables using Publisher and Subscriber Model. ○ Similar to binlog_do_db in MySQL, but no DDL changes are replicated. ○ Subscribers are also open for Writes automatically ○ Used in Data Warehouse environments that stores data fetched from multiple OLTP databases for Reporting, etc ○ A friendly solution for database upgrades

5.PostgreSQL Features and Extensions for HA and Automatic Failover ● Minimize data loss using Synchronous Replication in PostgreSQL ● May reduce data loss on failover during huge replication lag using the Archiving feature in PostgreSQL ● Faster and easy failover using promote or trigger_file ● Faster catch-up of old Master using the extension pg_rewind ● Re-direct READS and REPORTING jobs to a Slave using hot_standby ● Allow long running reporting jobs on Slave to succeed upon changes on Master, using hot_standby_feedback, max_standby_streaming_delay and max_standby_archive_delay ● Achieve flashback like Oracle features using recovery_min_apply_delay on Slave

6.Manual Failover Using Promote Using promote:

7.Manual Failover Using trigger_file Using trigger_file

8.Open Source Solutions for Automatic Failover in PostgreSQL ● List of few Open Source projects for HA and Automatic Failover: ○ Patroni ○ pg_auto_failover ○ Stolon ○ repmgr ○ PostgreSQL Automatic Failover (PAF) ○ pglookout ○ pgPool-II

9.Discussion on Some of the Most Widely Adopted Tools

10.Patroni ● Patroni ○ Fork of Governor ○ PostgreSQL cluster management template/framework ○ Talks to a distributed consensus key-value store to decide the state of the cluster ○ Distributed consensus can be obtained using etcd, ZooKeeper, Consul, etc for electing a leader ○ Continuous monitoring and automatic failover ○ Built-in automation for bringing back a failed node to cluster ○ REST APIs for cluster configuration and further tooling ○ Provides infrastructure for transparent application failover ○ Distributed consensus for every action and configuration ○ Integration with Linux watchdog for avoiding split-brain syndrome ○ Supports both manual and automatic failover

11.

12.PostgreSQL Operators Zalando Postgres Operator with Patroni : https://github.com/zalando/postgres-operator Crunchy Postgres Operator : https://github.com/CrunchyData/postgres-operator

13.

14.REPMGR ● REPMGR ○ Uses repmgrd installed in each node for management and monitoring ○ Supports both manual and automatic failover ○ Supports configuring a Witness server to avoid split brain scenario ○ Provides a view: replication_status for monitoring and history of replication lag and node status ○ Supports over 18 user-friendly commands to perform actions such as: ▪ Cloning a Master/Primary ▪ Switchover to promote a standby and demote the master ▪ Rejoining a node to cluster ▪ Promote to promote a standby ▪ check node status ▪ primary/standby register and unregister ○ Supports executing custom scripts upon automatic failover using promote_command and follow_command

15.Stolon ● Stolon ○ Cloud-native HA solution that supports PostgreSQL cluster inside Kubernetes, IaaS and VMs ○ Uses etcd, consul or Kubernetes API server for distributed consensus ○ Composed of 3 components: ▪ keeper: Maintains a cluster view as provided by sentinel(s) ▪ sentinel: Monitors keepers and builds the cluster view ▪ proxy: Re-directs connects to Master always for a seamless Application failover ○ Built on top of PostgreSQL Streaming replication - Synchronous and Asynchronous ○ Supports command line client - stolonctl and kubectl to perform actions such as: ▪ Initialize a cluster ▪ Promoting a standby ▪ check status

16.

17.pgPool-II ● pgPool-II ○ Supports Connection Pooling ○ Manages Replication ○ Load Balancing of Reads and Writes ○ Parses SQLs to determine if it is a read or write ○ Ability to configure weights to balance reads between master and slave ○ Supports Automatic Failover ○ Connections exceeding the max_connections are queued on pgPool-II without rejecting them ○ Must use Active-Passive pgPool setup for high availability

18.Points to Note ● Make sure you test the tool you use for automatic failover ● Ensure to have a good backup strategy that helps you manage panic situations ● Be prepared for a data loss and build the ability to manage it from the application ● The architecture of your HA solution depends on your environment ● Build the ability to distinguish reads and writes in the application layer for better scalability ● Perform routine disaster recovery drills through a manual failover to ensure that the setup is reliable ● Ensure you monitor for patches and perform updates of your PostgreSQL and the HA solution

19.Thank You to Our Sponsors

20.Rate My Session 20

21.Any Questions?