Skip to content

PostgreSQL High Availability with Patroni: 2026 Edition

Running a single PostgreSQL instance in production is a liability. When it goes down — and it will — so does your application. High availability is not optional for production databases; it is the baseline.

This guide builds a production-grade PostgreSQL HA cluster using the fully open-source community stack: PostgreSQL 17 from PGDG, Patroni 4.x, etcd 3.5, pgbackrest 2.58, and HAProxy. No vendor-specific packages. No proprietary repositories. Everything here runs on Rocky Linux 9 or AlmaLinux 9 and stays fully open-source.


What You Are Building

                        ┌─────────────────────────────┐
   Applications         │         HAProxy              │
   ──────────►          │  :5000 (writes → primary)    │
                        │  :5001 (reads → replicas)    │
                        └──────┬──────┬──────┬─────────┘
                               │      │      │
                               ▼      ▼      ▼
                          ┌────┐  ┌────┐  ┌────┐
                          │pg1 │  │pg2 │  │pg3 │  ← PostgreSQL 17 + Patroni 4
                          │    │  │    │  │    │
                          │etcd│  │etcd│  │etcd│  ← etcd 3.5 (DCS)
                          └────┘  └────┘  └────┘
                               │      │      │
                               └──────┴──────┘
                               ┌─────────────┐
                               │  S3 Bucket  │  ← pgbackrest 2.58 backups
                               └─────────────┘

4 hosts total:

Host IP Role
pg1 172.31.47.229 PostgreSQL primary (initially) + etcd
pg2 172.31.34.169 PostgreSQL replica + etcd
pg3 172.31.39.118 PostgreSQL replica + etcd
haproxy 172.31.10.50 HAProxy load balancer

Stack versions (May 2026):

Component Version Source
Rocky/AlmaLinux 9.x OS
PostgreSQL 17.x PGDG community repo
Patroni 4.1.3 pip (PyPI)
etcd 3.5.x GitHub binary release
pgbackrest 2.58.0 PGDG community repo
HAProxy 2.4+ EPEL / system repo

Part 1: OS Preparation (All PostgreSQL Nodes)

Run these steps on pg1, pg2, and pg3.

1.1 Hostname and /etc/hosts

# On pg1
hostnamectl set-hostname pg1

# On all nodes, add to /etc/hosts:
cat >> /etc/hosts << 'EOF'
172.31.47.229 pg1 pg1.example.com
172.31.34.169 pg2 pg2.example.com
172.31.39.118 pg3 pg3.example.com
EOF

1.2 SELinux and Firewall

For a controlled private network, disable both to reduce complexity:

setenforce 0
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config

systemctl stop firewalld
systemctl disable firewalld

If your security policy requires the firewall, open these ports instead:

firewall-cmd --permanent --add-port={2379,2380,5432,8008}/tcp
firewall-cmd --reload

1.3 Kernel Parameters

Add to /etc/sysctl.conf on all PostgreSQL nodes:

# Shared memory
kernel.shmall = 2097152
kernel.shmmax = 134217728
kernel.shmmni = 4096

# Memory overcommit — prevents OOM kills during large sorts
vm.overcommit_memory = 2
vm.overcommit_ratio = 80

# Network tuning
net.core.somaxconn = 65535
net.ipv4.tcp_max_syn_backlog = 65535
net.core.netdev_max_backlog = 10000
net.ipv4.tcp_keepalive_time = 300
net.ipv4.tcp_keepalive_intvl = 30
net.ipv4.tcp_keepalive_probes = 10

# File handles
fs.file-max = 65536

Apply immediately:

sysctl -p

1.4 Process Limits

Add to /etc/security/limits.conf:

postgres soft nofile 65536
postgres hard nofile 65536
postgres soft nproc 16384
postgres hard nproc 16384

Part 2: etcd Installation (All PostgreSQL Nodes)

etcd is the distributed configuration store (DCS) that Patroni uses for leader election and cluster state. Run a 3-node etcd cluster co-located on the PostgreSQL hosts.

2.1 Install etcd Binary

ETCD_VER=v3.5.21
DOWNLOAD_URL=https://github.com/etcd-io/etcd/releases/download

wget ${DOWNLOAD_URL}/${ETCD_VER}/etcd-${ETCD_VER}-linux-amd64.tar.gz
tar xzf etcd-${ETCD_VER}-linux-amd64.tar.gz
sudo mv etcd-${ETCD_VER}-linux-amd64/etcd* /usr/local/bin/
rm -rf etcd-${ETCD_VER}-linux-amd64*

etcd --version   # verify

2.2 Create etcd User and Directories

sudo groupadd --system etcd
sudo useradd -s /sbin/nologin --system -g etcd etcd

sudo mkdir -p /var/lib/etcd /etc/etcd
sudo chown -R etcd:etcd /var/lib/etcd
sudo chmod 0700 /var/lib/etcd

2.3 etcd Configuration File

Create /etc/etcd/etcd.conf on each node — substitute the correct IP for each host.

pg1 (172.31.47.229):

cat > /etc/etcd/etcd.conf << 'EOF'
ETCD_NAME=pg1
ETCD_DATA_DIR=/var/lib/etcd
ETCD_LISTEN_PEER_URLS=http://172.31.47.229:2380
ETCD_LISTEN_CLIENT_URLS=http://172.31.47.229:2379,http://127.0.0.1:2379
ETCD_INITIAL_ADVERTISE_PEER_URLS=http://172.31.47.229:2380
ETCD_ADVERTISE_CLIENT_URLS=http://172.31.47.229:2379
ETCD_INITIAL_CLUSTER=pg1=http://172.31.47.229:2380,pg2=http://172.31.34.169:2380,pg3=http://172.31.39.118:2380
ETCD_INITIAL_CLUSTER_STATE=new
ETCD_INITIAL_CLUSTER_TOKEN=pg-etcd-cluster-01
EOF

pg2 (172.31.34.169) — same content with pg2 values substituted.
pg3 (172.31.39.118) — same content with pg3 values substituted.

2.4 etcd systemd Service

cat > /etc/systemd/system/etcd.service << 'EOF'
[Unit]
Description=etcd distributed key-value store
Documentation=https://etcd.io/docs/
After=network-online.target
Wants=network-online.target

[Service]
User=etcd
Type=notify
EnvironmentFile=/etc/etcd/etcd.conf
ExecStart=/usr/local/bin/etcd
Restart=on-failure
RestartSec=10s
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl enable --now etcd

2.5 Verify etcd Cluster

# Set API version for v3 commands
export ETCDCTL_API=3

etcdctl --endpoints=http://127.0.0.1:2379 member list -w table
etcdctl --endpoints=http://172.31.47.229:2379,http://172.31.34.169:2379,http://172.31.39.118:2379 endpoint health -w table
etcdctl --endpoints=http://172.31.47.229:2379,http://172.31.34.169:2379,http://172.31.39.118:2379 endpoint status -w table

Expected output: all 3 members healthy, one showing IS LEADER=true.


Part 3: PostgreSQL Installation (All PostgreSQL Nodes)

3.1 Install PGDG Repository

The PGDG (PostgreSQL Global Development Group) repository is the official community source for all PostgreSQL versions. One RPM enables all supported versions.

# Install PGDG repo
sudo dnf install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-9-x86_64/pgdg-redhat-repo-latest.noarch.rpm

# Disable the built-in PostgreSQL module (conflicts with PGDG packages)
sudo dnf -qy module disable postgresql

# Install PostgreSQL 17
sudo dnf install -y postgresql17-server postgresql17-contrib

This installs PostgreSQL 17 with binaries at /usr/pgsql-17/bin/ and the default data directory at /var/lib/pgsql/17/data/.

3.2 Install pgbackrest

pgbackrest is available directly from the PGDG repo — no extra configuration needed:

sudo dnf install -y pgbackrest

3.3 Do NOT Run initdb Yet

Patroni will initialize the PostgreSQL data directory (initdb) on the first boot. Running it manually will conflict. Skip postgresql-17-setup initdb.


Part 4: Patroni Installation (All PostgreSQL Nodes)

Patroni 4.x is installed via pip for the latest version. It requires Python 3.9+.

4.1 Install Dependencies

sudo dnf install -y python3 python3-pip python3-devel gcc

4.2 Install Patroni

Install as the postgres system user with the etcd3 and psycopg3 extras:

sudo -u postgres pip3 install --user 'patroni[psycopg3,etcd3]'

# Add to postgres user's PATH
echo 'export PATH=$HOME/.local/bin:/usr/pgsql-17/bin:$PATH' >> /var/lib/pgsql/.bash_profile
source /var/lib/pgsql/.bash_profile

patroni --version   # should show Patroni 4.1.x

Why psycopg3?

Patroni 4.x defaults to psycopg3 (the modern PostgreSQL adapter). The older psycopg2 still works but psycopg3 offers better async support and async connection handling — the recommended choice for new installations.


Part 5: Patroni Configuration (All PostgreSQL Nodes)

Create /var/lib/pgsql/patroni.yml on each node. The only values that differ per node are name, restapi.listen, restapi.connect_address, postgresql.listen, and postgresql.connect_address.

5.1 Patroni Configuration File

pg1 (172.31.47.229):

scope: pg_cluster
name: pg1

restapi:
  listen: '172.31.47.229:8008'
  connect_address: '172.31.47.229:8008'

# etcd3 section — uses etcd v3 API (recommended for etcd 3.5+)
etcd3:
  hosts: 'pg1:2379,pg2:2379,pg3:2379'
  protocol: http

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    postgresql:
      use_pg_rewind: true
      use_slots: true
      parameters:
        hot_standby: 'on'
        wal_keep_size: 256MB        # replaces deprecated wal_keep_segments
        max_wal_senders: 10
        max_replication_slots: 10
        wal_level: replica
        archive_command: 'pgbackrest --stanza=pg_cluster archive-push %p'
        archive_mode: 'on'

  initdb:
    - encoding: UTF8
    - locale: en_US.UTF-8
    - data-checksums            # enables page checksums — catches data corruption

  pg_hba:
    - local all all trust
    - host all postgres 0.0.0.0/0 trust
    - host replication replicator 172.31.47.229/32 scram-sha-256
    - host replication replicator 172.31.34.169/32 scram-sha-256
    - host replication replicator 172.31.39.118/32 scram-sha-256
    - host all all 0.0.0.0/0 scram-sha-256

  users:
    admin:
      password: 'ChangeMe2026!'
      options:
        - createrole
        - createdb

postgresql:
  listen: '172.31.47.229:5432'
  connect_address: '172.31.47.229:5432'
  bin_dir: /usr/pgsql-17/bin
  data_dir: /var/lib/pgsql/17/data
  pgpass: /tmp/pgpass0

  authentication:
    replication:
      username: replicator
      password: 'Rep-Pass-2026!'
    superuser:
      username: postgres
      password: 'PgSuperPass2026!'
    rewind:
      username: rewind_user
      password: 'Rewind-Pass-2026!'

  create_replica_methods:
    - pgbackrest
    - basebackup

  pgbackrest:
    command: 'pgbackrest --stanza=pg_cluster --delta restore'
    keep_data: true
    no_params: true

  basebackup:
    checkpoint: fast

tags:
  nofailover: false
  noloadbalance: false
  clonefrom: false
  nosync: false

pg2 and pg3: same file, change name: to pg2/pg3 and update both IP addresses in restapi and postgresql sections.

Change all passwords

Replace all placeholder passwords before deploying. Use a secrets manager or Ansible vault in production — do not store plaintext credentials in config files checked into git.

5.2 Patroni systemd Service

cat > /etc/systemd/system/patroni.service << 'EOF'
[Unit]
Description=Patroni — HA PostgreSQL orchestrator
After=syslog.target network.target etcd.service
Wants=etcd.service

[Service]
Type=simple
User=postgres
Group=postgres
Environment=PATH=/var/lib/pgsql/.local/bin:/usr/pgsql-17/bin:/usr/local/bin:/usr/bin:/bin
Environment=PGDATA=/var/lib/pgsql/17/data
ExecStart=/var/lib/pgsql/.local/bin/patroni /var/lib/pgsql/patroni.yml
ExecReload=/bin/kill -s HUP $MAINPID
KillMode=process
TimeoutSec=30
Restart=on-failure
RestartSec=10s

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl enable --now patroni

5.3 Verify the Cluster

Watch the logs as Patroni initializes. On the first start, one node becomes the primary and initializes the data directory:

journalctl -u patroni -f

Check cluster status:

patronictl -c /var/lib/pgsql/patroni.yml list

Expected output:

+ Cluster: pg_cluster (7234567890123456789) ----+----+-----------+
| Member | Host            | Role    | State   | TL | Lag in MB |
+--------+-----------------+---------+---------+----+-----------+
| pg1    | 172.31.47.229:5432 | Leader  | running |  1 |           |
| pg2    | 172.31.34.169:5432 | Replica | running |  1 |         0 |
| pg3    | 172.31.39.118:5432 | Replica | running |  1 |         0 |
+--------+-----------------+---------+---------+----+-----------+

Part 6: pgbackrest Configuration (All PostgreSQL Nodes)

6.1 Create Directories

sudo mkdir -p /var/log/pgbackrest /etc/pgbackrest /var/lib/pgbackrest
sudo chmod 0750 /var/lib/pgbackrest
sudo chmod 770 /var/log/pgbackrest
sudo touch /etc/pgbackrest/pgbackrest.conf
sudo chmod 640 /etc/pgbackrest/pgbackrest.conf
sudo chown -R postgres:postgres /var/lib/pgbackrest /var/log/pgbackrest /etc/pgbackrest

6.2 pgbackrest Configuration

/etc/pgbackrest/pgbackrest.conf — same content on all three nodes:

[pg_cluster]
pg1-path=/var/lib/pgsql/17/data
pg1-port=5432
pg1-user=postgres

[global]
start-fast=y
process-max=4
archive-timeout=300
compress-type=lz4          # lz4 is faster than gz; available in pgbackrest 2.38+

repo1-type=s3
repo1-path=/pgbackrest/pg_cluster
repo1-retention-full=4
repo1-retention-diff=14

repo1-s3-bucket=your-pgbackrest-bucket
repo1-s3-endpoint=s3.ap-southeast-1.amazonaws.com
repo1-s3-key=YOUR_ACCESS_KEY
repo1-s3-key-secret=YOUR_SECRET_KEY
repo1-s3-region=ap-southeast-1
repo1-s3-uri-style=path

log-level-console=info
log-level-file=detail
log-path=/var/log/pgbackrest

[global:archive-push]
compress-level=3

6.3 Initialize and Verify

Run on the primary (pg1) only:

sudo -u postgres pgbackrest --stanza=pg_cluster stanza-create
sudo -u postgres pgbackrest --stanza=pg_cluster check
sudo -u postgres pgbackrest --stanza=pg_cluster backup --type=full
sudo -u postgres pgbackrest info

6.4 Schedule Backups

Add to the postgres user's crontab (crontab -u postgres -e):

# Full backup every Sunday at 01:00
0 1 * * 0  pgbackrest --type=full --stanza=pg_cluster backup
# Differential backup Mon-Sat at 01:00
0 1 * * 1-6  pgbackrest --type=diff --stanza=pg_cluster backup
# Incremental backup every 4 hours
0 */4 * * *  pgbackrest --type=incr --stanza=pg_cluster backup

Part 7: HAProxy Load Balancer (haproxy host)

7.1 Install HAProxy

sudo dnf install -y haproxy

7.2 HAProxy Configuration

/etc/haproxy/haproxy.cfg:

global
    maxconn 1000
    log /dev/log local0

defaults
    log global
    mode tcp
    retries 2
    timeout client 30m
    timeout connect 4s
    timeout server 30m
    timeout check 5s

# Statistics dashboard
listen stats
    mode http
    bind *:7000
    stats enable
    stats uri /
    stats refresh 10s
    stats show-node

# Write traffic → primary only
# Patroni REST API: GET /primary returns 200 only on the current leader
listen postgres_write
    bind *:5000
    option httpchk GET /primary
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server pg1 172.31.47.229:5432 maxconn 100 check port 8008
    server pg2 172.31.34.169:5432 maxconn 100 check port 8008
    server pg3 172.31.39.118:5432 maxconn 100 check port 8008

# Read traffic → replicas, with primary as fallback
# Patroni REST API: GET /replica returns 200 only on replicas
listen postgres_read
    bind *:5001
    balance leastconn
    option httpchk GET /replica
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server pg1 172.31.47.229:5432 maxconn 100 check port 8008
    server pg2 172.31.34.169:5432 maxconn 100 check port 8008
    server pg3 172.31.39.118:5432 maxconn 100 check port 8008

Health check endpoint — 2026 update

The 2024 guide used OPTIONS /master. Patroni 4.x keeps /master as a compatibility alias but the canonical endpoints are /primary (for the leader) and /replica (for standbys). Use GET not OPTIONS — HAProxy's option httpchk GET /primary is cleaner and matches standard HTTP semantics.

7.3 Start HAProxy

systemctl enable --now haproxy

Monitor via the stats dashboard at http://172.31.10.50:7000.


Part 8: Testing

8.1 Verify Replication

Create data on the primary and read it from replicas:

# On any host — connects through HAProxy write port
psql -h 172.31.10.50 -p 5000 -U postgres << 'SQL'
CREATE DATABASE demo;
\c demo
CREATE TABLE orders (id serial, product text, created_at timestamptz DEFAULT now());
INSERT INTO orders (product) VALUES ('widget'), ('gadget'), ('doohickey');
SELECT * FROM orders;
SQL

# Read through HAProxy read port (hits a replica)
psql -h 172.31.10.50 -p 5001 -U postgres -d demo -c "SELECT * FROM orders;"

8.2 Test Automatic Failover

Stop Patroni on the current primary and watch the cluster elect a new leader:

# On pg1 (the current primary)
sudo systemctl stop patroni

# On pg2 or pg3 — watch the election happen in seconds
watch -n2 "patronictl -c /var/lib/pgsql/patroni.yml list"

Patroni will elect a new primary (typically within 30 seconds — the ttl setting). HAProxy detects the change via health checks within ~10 seconds. Writes automatically route to the new primary.

Restart the old primary — it rejoins as a replica:

sudo systemctl start patroni

8.3 Planned Switchover

For maintenance or rolling upgrades, trigger a controlled switchover:

patronictl -c /var/lib/pgsql/patroni.yml switchover pg_cluster

Patroni asks which node to promote, waits for replicas to catch up, and performs a zero-data-loss switchover.


Part 9: Day-2 Operations

Enable Synchronous Replication

For workloads where zero data loss is mandatory:

patronictl -c /var/lib/pgsql/patroni.yml edit-config --apply - --force << 'JSON'
{
  "synchronous_mode": true,
  "synchronous_mode_strict": true,
  "postgresql": {
    "parameters": {
      "synchronous_commit": "on"
    }
  }
}
JSON
patronictl -c /var/lib/pgsql/patroni.yml reload pg_cluster

Reinitialize a Corrupted Replica

If a replica's data directory is corrupted or severely behind:

patronictl -c /var/lib/pgsql/patroni.yml reinit pg_cluster pg2

Patroni will recreate the replica from pgbackrest (fast path) or pg_basebackup (fallback).

Change PostgreSQL Parameters

Never edit postgresql.conf directly on HA clusters. Use Patroni to distribute changes:

patronictl -c /var/lib/pgsql/patroni.yml edit-config

This opens your $EDITOR with the current DCS configuration. Changes are applied cluster-wide and persisted in etcd.

Restore from pgbackrest Backup

To restore to a point in time on a new cluster:

sudo -u postgres pgbackrest --stanza=pg_cluster \
  --delta \
  --target="2026-05-25 14:30:00" \
  --target-action=promote \
  restore

Part 10: Disaster Recovery

HA protects you from node failures within a single site — a dead disk, a crashed process, a bad kernel update. It does not protect you from a site-level event: a failed data center, a cloud region outage, or catastrophic data corruption that replicates to all replicas before anyone notices.

DR requires a separate site that can be promoted to production independently of the primary cluster.

This section covers two complementary approaches:

Approach RTO RPO Complexity
Patroni Standby Cluster (warm standby) 2–5 min seconds medium
pgbackrest PITR Restore (cold standby) 30–120 min up to 4 hours low

Use both: the standby cluster for planned or fast failover, pgbackrest restore as the backstop if the standby cluster itself is unavailable.


10.1 DR Architecture

PRIMARY SITE (ap-southeast-1)           DR SITE (ap-southeast-2)
─────────────────────────────           ──────────────────────────
  HAProxy (:5000 write)                   HAProxy-DR (:5000 write)
     │                                         │
  pg1 (Leader) ──streaming──────────────► dr-pg1 (standby leader)
  pg2 (Replica)                           dr-pg2 (standby replica)
  pg3 (Replica)                           dr-pg3 (standby replica)
     │                                         │
     └──── pgbackrest ────► S3 (primary) ──S3 replication──► S3 (DR)

The DR cluster streams from the primary cluster's current leader via a dedicated replication slot. On DR activation, the standby cluster is promoted and becomes an independent primary cluster in minutes.


10.2 Prepare the Primary Cluster for DR

The primary cluster needs a permanent replication slot that the DR standby cluster will hold. Add this to the bootstrap.dcs section of patroni.yml on all primary-site nodes and apply via edit-config:

patronictl -c /var/lib/pgsql/patroni.yml edit-config --apply - --force << 'JSON'
{
  "slots": {
    "patroni_standby_leader": {
      "type": "physical"
    }
  }
}
JSON

Verify the slot was created on the primary:

psql -U postgres -c "SELECT slot_name, slot_type, active FROM pg_replication_slots;"

Monitor this slot

A replication slot that is not consumed causes WAL accumulation on the primary. If the DR cluster is disconnected for a long time, the primary's disk fills with retained WAL. Monitor pg_replication_slots.wal_status — if it shows lost, reinitialize the DR cluster from a pgbackrest backup.


10.3 DR Site: Node Setup

Provision 3 new nodes in the DR site. They mirror the primary site setup with different IPs:

Host IP Role
dr-pg1 172.32.47.229 Standby leader (streams from primary)
dr-pg2 172.32.34.169 Standby replica
dr-pg3 172.32.39.118 Standby replica
dr-haproxy 172.32.10.50 HAProxy (serves reads only during standby)

Run Part 1 through Part 4 of this guide on the DR nodes (OS tuning, etcd, PostgreSQL, Patroni installation). The only difference is the Patroni configuration file.


10.4 DR Patroni Configuration

The DR cluster's patroni.yml is identical to the primary cluster's except for the standby_cluster block and the bootstrap.dcs section. The standby_cluster block tells Patroni this cluster should follow the primary, not elect its own leader.

/var/lib/pgsql/patroni.yml on dr-pg1:

scope: dr_cluster          # different scope name from primary
name: dr-pg1

restapi:
  listen: '172.32.47.229:8008'
  connect_address: '172.32.47.229:8008'

etcd3:
  hosts: 'dr-pg1:2379,dr-pg2:2379,dr-pg3:2379'   # DR-site etcd cluster
  protocol: http

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576

    # This block declares the cluster is a standby of another cluster
    standby_cluster:
      host: 172.31.10.50             # primary site HAProxy write port
      port: 5000
      primary_slot_name: patroni_standby_leader   # slot created in step 10.2
      create_replica_methods:
        - pgbackrest
        - basebackup

    postgresql:
      use_pg_rewind: true
      use_slots: true
      parameters:
        hot_standby: 'on'
        wal_keep_size: 256MB
        max_wal_senders: 10
        max_replication_slots: 10
        wal_level: replica
        archive_command: 'pgbackrest --stanza=pg_cluster archive-push %p'
        archive_mode: 'on'

  pg_hba:
    - local all all trust
    - host all postgres 0.0.0.0/0 trust
    - host replication replicator 172.32.47.229/32 scram-sha-256
    - host replication replicator 172.32.34.169/32 scram-sha-256
    - host replication replicator 172.32.39.118/32 scram-sha-256
    - host all all 0.0.0.0/0 scram-sha-256

postgresql:
  listen: '172.32.47.229:5432'
  connect_address: '172.32.47.229:5432'
  bin_dir: /usr/pgsql-17/bin
  data_dir: /var/lib/pgsql/17/data
  pgpass: /tmp/pgpass0

  authentication:
    replication:
      username: replicator
      password: 'Rep-Pass-2026!'      # must match primary cluster
    superuser:
      username: postgres
      password: 'PgSuperPass2026!'
    rewind:
      username: rewind_user
      password: 'Rewind-Pass-2026!'

  create_replica_methods:
    - pgbackrest
    - basebackup

  pgbackrest:
    command: 'pgbackrest --stanza=pg_cluster --delta restore'
    keep_data: true
    no_params: true

  basebackup:
    checkpoint: fast

Start Patroni on all DR nodes. The DR cluster will bootstrap from the primary via pgbackrest, then switch to streaming replication.


10.5 Verify the Standby Cluster

On the DR site:

patronictl -c /var/lib/pgsql/patroni.yml list

Expected output — note the standby_leader role:

+ Cluster: dr_cluster ─────────────────+─────────+────────────+
| Member  | Host            | Role            | State   | Lag |
+---------+-----------------+-----------------+---------+-----+
| dr-pg1  | 172.32.47.229   | Standby Leader  | running |   0 |
| dr-pg2  | 172.32.34.169   | Replica         | running |   0 |
| dr-pg3  | 172.32.39.118   | Replica         | running |   0 |
+---------+-----------------+-----------------+---------+-----+

All nodes are in read-only standby mode. Write attempts will fail — this is correct behavior.

Monitor replication lag from the primary site:

psql -U postgres -c "
  SELECT application_name, state, sent_lsn, replay_lsn,
         (sent_lsn - replay_lsn) AS lag_bytes
  FROM pg_stat_replication
  WHERE application_name LIKE 'dr%';
"

10.6 pgbackrest in the DR Site

Configure pgbackrest on DR nodes to point to the same S3 bucket (or a replicated DR bucket). Add a second repository pointing to the DR region:

/etc/pgbackrest/pgbackrest.conf on DR nodes:

[pg_cluster]
pg1-path=/var/lib/pgsql/17/data
pg1-port=5432
pg1-user=postgres

[global]
start-fast=y
process-max=4
archive-timeout=300
compress-type=lz4

# Primary S3 repo (cross-region accessible)
repo1-type=s3
repo1-path=/pgbackrest/pg_cluster
repo1-retention-full=4
repo1-s3-bucket=your-pgbackrest-bucket
repo1-s3-endpoint=s3.ap-southeast-1.amazonaws.com
repo1-s3-key=YOUR_ACCESS_KEY
repo1-s3-key-secret=YOUR_SECRET_KEY
repo1-s3-region=ap-southeast-1
repo1-s3-uri-style=path

# DR S3 repo (local region — faster restore on DR activation)
repo2-type=s3
repo2-path=/pgbackrest/pg_cluster
repo2-retention-full=4
repo2-s3-bucket=your-pgbackrest-dr-bucket
repo2-s3-endpoint=s3.ap-southeast-2.amazonaws.com
repo2-s3-key=YOUR_DR_ACCESS_KEY
repo2-s3-key-secret=YOUR_DR_SECRET_KEY
repo2-s3-region=ap-southeast-2
repo2-s3-uri-style=path

log-level-console=info
log-level-file=detail
log-path=/var/log/pgbackrest

Enable S3 cross-region replication from your primary bucket to the DR bucket in AWS — this happens transparently at the storage layer and keeps the DR repo in sync without any pgbackrest configuration.


10.7 DR Activation Runbook

When the primary site is unavailable and you need to promote the DR cluster to production, follow this sequence:

Step 1 — Confirm the primary site is truly down

Do not activate DR on a false alarm. Verify:

# Try connecting to primary HAProxy
psql -h 172.31.10.50 -p 5000 -U postgres -c "SELECT 1;" --connect-timeout=5

# Check Patroni REST API on primary nodes
curl -s http://172.31.47.229:8008/primary
curl -s http://172.31.34.169:8008/primary
curl -s http://172.31.39.118:8008/primary

If all return errors or timeouts, proceed.

Step 2 — Check DR cluster lag

patronictl -c /var/lib/pgsql/patroni.yml list

If lag is 0, zero data loss. If lag is non-zero, the DR site has not replicated all transactions from the primary — document the lag before promoting (this is your RPO for this incident).

Step 3 — Promote the DR cluster

Remove the standby_cluster block from the DCS configuration to release the DR cluster from standby mode:

patronictl -c /var/lib/pgsql/patroni.yml edit-config --apply - --force << 'JSON'
{"standby_cluster": null}
JSON

Within seconds, Patroni detects the configuration change and elects a leader in the DR cluster:

watch -n2 "patronictl -c /var/lib/pgsql/patroni.yml list"

Expected: dr-pg1 (or whichever node wins election) shows Leader role.

Step 4 — Redirect application traffic

Update your DNS CNAME or load balancer VIP to point to the DR HAProxy:

# Example: update DNS record
# db.example.com → 172.32.10.50 (DR HAProxy)

Applications reconnect automatically on next connection attempt.

Step 5 — Verify writes succeed

psql -h 172.32.10.50 -p 5000 -U postgres -c "
  SELECT pg_is_in_recovery(), now();
  CREATE TABLE IF NOT EXISTS dr_test (ts timestamptz DEFAULT now());
  INSERT INTO dr_test DEFAULT VALUES;
  SELECT * FROM dr_test;
"

pg_is_in_recovery() must return false — the cluster is now a writable primary.


10.8 DR via pgbackrest Restore (Cold Standby)

If the standby cluster is also unavailable (total regional failure), restore from the S3 backup in the DR region:

# On new DR infrastructure — after installing PostgreSQL and pgbackrest

# List available backups in DR repo
sudo -u postgres pgbackrest --stanza=pg_cluster --repo=2 info

# Restore latest backup
sudo -u postgres pgbackrest --stanza=pg_cluster \
  --repo=2 \
  --delta \
  --target-action=promote \
  restore

# Start PostgreSQL
sudo systemctl start postgresql-17

# Verify recovery completed
psql -U postgres -c "SELECT pg_is_in_recovery(), pg_last_xact_replay_timestamp();"

Then initialize Patroni on the restored node as a new primary cluster (remove the standby_cluster block from patroni.yml).

For point-in-time recovery to a specific moment before a data corruption event:

sudo -u postgres pgbackrest --stanza=pg_cluster \
  --repo=2 \
  --delta \
  --type=time \
  --target="2026-05-26 08:45:00 UTC" \
  --target-action=promote \
  restore

10.9 DR Testing Schedule

A DR plan that has never been tested is not a DR plan — it is a wish. Run these tests on a regular cadence:

Test Frequency What it proves
Verify standby lag < 5 seconds Daily (automated) Streaming replication is healthy
pgbackrest restore to isolated env Monthly Backups are actually restorable
Full DR activation (non-prod traffic) Quarterly Runbook works end-to-end
DNS/VIP cutover drill Quarterly Application teams know the procedure
Primary-site failure simulation Annually Full DR confidence

Automate the daily lag check:

# Add to crontab on monitoring host
*/5 * * * * psql -h 172.31.10.50 -p 5000 -U postgres -t -c \
  "SELECT COALESCE(MAX(write_lag), '0')::text FROM pg_stat_replication \
   WHERE application_name LIKE 'dr%';" | \
  grep -v "^$" | awk '{if ($1 > 5) print "ALERT: DR lag " $1 " seconds"}'

Ports Reference

Primary site:

Port Host Purpose
2379 pg1/2/3 etcd client
2380 pg1/2/3 etcd peer (cluster comms)
5432 pg1/2/3 PostgreSQL
8008 pg1/2/3 Patroni REST API (health checks)
5000 haproxy Write port → primary
5001 haproxy Read port → replicas
7000 haproxy HAProxy stats dashboard

DR site (mirrors primary, different IPs):

Port Host Purpose
2379 dr-pg1/2/3 etcd client
2380 dr-pg1/2/3 etcd peer
5432 dr-pg1/2/3 PostgreSQL (read-only in standby mode)
8008 dr-pg1/2/3 Patroni REST API
5000 dr-haproxy Write port (active only after DR promotion)
5001 dr-haproxy Read port → DR replicas

Summary

Here is what each layer provides:

  1. etcd is the neutral arbitrator. It stores cluster state and leader lock — no PostgreSQL node makes a promotion decision without consensus from the etcd quorum.

  2. Patroni 4.x is the orchestration layer. It manages initialization, replication, health checks, and automatic failover. Its REST API exposes the current cluster topology, which HAProxy uses for routing.

  3. PostgreSQL 17 from PGDG installs cleanly on RHEL 9 / Rocky Linux with a single repo RPM and stays on the community release track — security patches and new features arrive immediately without waiting for a vendor rebuild.

  4. pgbackrest 2.58 handles point-in-time backups to S3 and doubles as Patroni's fast replica creation method — pg_basebackup from a running primary is slow; pgbackrest restores from the backup repository in parallel.

  5. HAProxy routes writes to the current primary (port 5000) and reads to replicas (port 5001) using Patroni's health check endpoints — applications get read/write splitting without code changes.

  6. Patroni Standby Cluster (Part 10) extends HA across sites. The DR cluster streams from the primary cluster's leader via a dedicated replication slot. A five-step runbook promotes it to an independent primary in under 5 minutes — without touching the primary site's etcd or Patroni configuration.

  7. pgbackrest PITR Restore is the safety net behind the standby cluster. If the standby is also unavailable, a clean restore from the DR-region S3 bucket delivers the database to any point in time, including the moment before a data corruption event propagated.

What this setup survives:

Event Recovery mechanism RTO
Single node failure Patroni automatic failover < 30 seconds
All primary-site nodes down DR standby cluster promotion 2–5 minutes
Data corruption (replicated) pgbackrest PITR restore 30–120 minutes
Full region loss pgbackrest restore from DR S3 30–120 minutes

The result is a database stack that handles node failures automatically, can survive a complete site loss, keeps point-in-time backups off-site, and costs nothing to license. Test the DR runbook quarterly — a plan that has never been executed is not a plan.


Questions or discussion? Connect on LinkedIn, X or reach out via email.

Discussion

Have thoughts on this post? Share them below — questions, corrections, or your own experience are all welcome.