In this article, I will explain my hands-on approach PostgreSQL Patroni cluster with WAL-G and Minio. The purpose of this article to automate PostgreSQL full and incremental backups with a guaranteed and redundant place. I mean by using this environment you will achieve availability, redundancy, and daily full and incremental backups.
- PostgreSQL - stores all data of Microservices applications.
- Patroni - cluster controller of PostgreSQL servers. Wonderful API gives us a lot of functionality to control and understand the cluster status of the PostgreSQL. A simple GET request of Patroni 8008 port can give us 503 (Means this is slave node) or 200 (Means this is a master node) response codes. Even from response JSON content, we can catch the status of Master and Slave nodes PostgreSQL. Patroni can store cluster information inside of the ETCD, Consul, etc.
- WAL-G - is an archival restoration tool for PostgreSQL. WAL-G tool which can get full and incremental backups from the PostgreSQL server. With WAL-G we can get, delete or restore a backup for some pre-defined time interval.
- Minio is a High-Performance Object Storage. It is API compatible with Amazon S3 cloud storage service. Minio gives us storage compatibility with API. We can store data with high availability of Minio servers. We can restrict access to the users by bucket names. Wonderful SDK and documentation.
- HAProxy - Responsible to find PostgreSQL master node for microservice application communication and load balancer for Minio nodes. Loadbalancer for microservice applications to communicate with PostgreSQL master node and, at the same time PostgreSQL master node will communicate to get/store base backups with WAL files. Between HaProxy servers, I have used KeepAlived with Virtual IP address 10.1.42.225.
Network topology like as following:
$ vagrant plugin install vagrant-berkshelf
$ vagrant plugin install vagrant-hostmanager
$ vagrant plugin install vagrant-hostsupdater
$ vagrant plugin install vagrant-scp
$ vagrant plugin install vagrant-vbguest
$ vagrant plugin install vagrant-reload
When environment will be ready just clone the this repository and switch to the vagrant-codes-in-practice/patronipg-walg-halb folder. Then execute the command below (It will prepare all environment shown in previous screen):
$ vagrant up
After deployment, we can work in the PostgreSQL patroni cluster environment which will automatically store all incremental backups inside of the Minio cluster. At the same time we can delete (with retention period) old WAL (Write Ahead Logs)files from Minio within wal-g command itself.
HaProxy shows us status of all balanced servers:
The following screen from Virtual IP address of HaProxy(10.1.42.225:9001) which shows us WAL files from PostgreSQL master node stored inside of the Minio. The bucket name configured in the Minio is postgresql.
I want to explain the main part of postgresql.conf configuration for the Minio which defined in the pgservers.sh file. The lines from 52 to 54 shows us how PostgreSQL will push all incremental WAL changes (backups of logs) with archive_command to the Minio and how will retrieve data from Minio with the restore_command line as following:
archive_command: "/usr/local/bin/wal-push.sh %p" restore_command: "/usr/local/bin/wal-fetch.sh %f %p"
To get base (Full) backup inside of the PostgreSQL master node switch to the current WAL segment and execute /usr/local/bin/backup-push.sh command (Commands must be executed on the PostgreSQL Master node and only unders postgres user otherwise script will not find the needed variables for the Minio bucket):
[root@patpg1 ]# su - postgres
postgres=# select pg_switch_wal();
With the /usr/local/bin/backup-list.sh command we can see list of the base (Full) backups. As we see we didn’t get any full backup (Must be executed under master node and with postgres username):
INFO: 2020/09/26 15:53:36.227857 No backups found
To see all processes you can watch this youtube video.
I hope it will be useful to everyone.