Introduction
Data replication is one of the most powerful tools in a database engineer's arsenal. Whether you need to distribute read traffic across multiple servers, keep a standby database warm for failover, or stream changes to a data warehouse for analytics, replication is the mechanism that makes it possible. PostgreSQL's logical replication, introduced in version 10, represents a fundamental shift from the older physical replication model. Instead of copying raw binary data blocks between servers, logical replication operates at the level of individual data changes—INSERTs, UPDATEs, and DELETEs—and can selectively replicate specific tables, transform data in flight, and even bridge different PostgreSQL versions.
Why does this matter for production systems? Physical replication creates an exact byte-for-byte copy of the primary server. You cannot replicate a single database, a single table, or a single schema. You cannot use the replica for writes. You cannot replicate across different PostgreSQL major versions. Logical replication removes all of these constraints. It gives you fine-grained control over what gets replicated, where it goes, and how conflicts are resolved.
This guide covers everything you need to know to implement logical replication in production: the architecture and internal mechanics, step-by-step setup of publications and subscriptions, conflict resolution strategies, monitoring, performance tuning, and real-world use cases including zero-downtime migrations, read replicas for specific workloads, and data distribution across regions.
Understanding Logical Replication: Core Concepts
Physical vs Logical Replication
To understand what makes logical replication special, you need to understand what physical replication does and where it falls short.
Physical replication (also called streaming replication or binary replication) works by shipping the WAL (Write-Ahead Log) from the primary server to one or more standby servers. The standby applies the WAL blocks directly to its own data files, producing an exact copy of the primary. This is fast, simple, and battle-tested, but it has significant limitations:
- The standby is read-only. You cannot write to it, even to create an index or add a column.
- You must replicate the entire cluster. You cannot replicate a single database or table.
- Primary and standby must run the same PostgreSQL major version.
- The standby cannot have any data that does not exist on the primary.
Logical replication works at a higher level. Instead of shipping raw WAL blocks, the primary decodes the WAL into logical changes (row-level INSERT, UPDATE, DELETE operations) and sends those changes to subscribers. The subscriber applies the changes using standard SQL operations. This fundamental difference enables all the flexibility that logical replication offers.
Publications and Subscriptions
Logical replication in PostgreSQL is organized around two abstractions: publications and subscriptions.
A publication is defined on the publisher (source) database. It specifies which tables should be replicated and what kinds of operations (INSERT, UPDATE, DELETE, TRUNCATE) to include. A publication can cover all tables, a specific list of tables, or all tables in a schema (PostgreSQL 15+).
A subscription is defined on the subscriber (target) database. It connects to the publisher, subscribes to one or more publications, and starts receiving and applying changes. A subscription can combine multiple publications from different publishers, giving you flexibility in how data flows.
The Replication Slot
When a subscriber connects, the publisher creates a replication slot to track the subscriber's progress. The slot remembers which WAL data has been sent and acknowledged by the subscriber, ensuring that no changes are lost even if the subscriber temporarily disconnects. This is critical for reliability but also introduces a risk: if a subscriber is down for an extended period, the replication slot prevents WAL cleanup on the publisher, potentially causing disk space exhaustion. We will cover monitoring and management strategies for this.
Initial Data Synchronization
When a subscription is first created, PostgreSQL performs an initial data copy (called the table sync phase). For each table in the publication, the publisher takes a consistent snapshot of the data and sends it to the subscriber. During this initial sync, any new changes on the publisher are queued. Once the initial copy is complete, the subscriber catches up with the queued changes and transitions to continuous streaming replication.
This initial sync can be resource-intensive on large tables. PostgreSQL supports parallel table synchronization (controlled by max_sync_workers_per_subscription) to speed up the process.
Architecture and Design Patterns
Hub-and-Spoke Pattern
In this pattern, one central publisher distributes data to multiple subscribers. This is useful for distributing read workload across multiple servers or regions:
-- On the central publisher
CREATE PUBLICATION hub_pub FOR TABLE orders, customers, products;
-- On each spoke subscriber
CREATE SUBSCRIPTION spoke_1_sub
CONNECTION 'host=hub-server dbname=mydb user=replicator'
PUBLICATION hub_pub;Bidirectional Replication
Logical replication is inherently unidirectional by default. For bidirectional replication (where both sides can accept writes), you need two separate publication/subscription pairs flowing in opposite directions, plus conflict resolution logic:
-- On server A
CREATE PUBLICATION a_to_b FOR TABLE shared_data;
CREATE SUBSCRIPTION a_from_b
CONNECTION 'host=server-b dbname=mydb user=replicator'
PUBLICATION b_to_a;
-- On server B
CREATE PUBLICATION b_to_a FOR TABLE shared_data;
CREATE SUBSCRIPTION b_from_a
CONNECTION 'host=server-a dbname=mydb user=replicator'
PUBLICATION a_to_b;Bidirectional replication requires careful conflict handling. If both servers update the same row simultaneously, one update will win and the other will be silently dropped or cause an error. PostgreSQL 15+ introduced built-in conflict resolution, and for earlier versions, you need application-level logic or trigger-based conflict detection.
Fan-In Pattern
Multiple publishers send data to a single subscriber. This is useful for aggregating data from multiple sources into a central data warehouse:
-- On the central subscriber, create subscriptions to each source
CREATE SUBSCRIPTION from_region_us
CONNECTION 'host=us-server dbname=mydb user=replicator'
PUBLICATION region_us_pub;
CREATE SUBSCRIPTION from_region_eu
CONNECTION 'host=eu-server dbname=mydb user=replicator'
PUBLICATION region_eu_pub;Cascade Replication
Changes flow from publisher to subscriber, which then acts as a publisher for downstream subscribers. This is useful for reducing the load on the primary publisher when you have many subscribers:
-- On the intermediate server (subscriber + publisher)
CREATE SUBSCRIPTION from_primary
CONNECTION 'host=primary dbname=mydb user=replicator'
PUBLICATION primary_pub;
-- Make the intermediate server also a publisher
ALTER SUBSCRIPTION from_primary SET (origin = 'none');
CREATE PUBLICATION cascade_pub FOR TABLE orders, customers;The origin = 'none' setting is critical: it tells the intermediate server not to forward changes it received via replication, preventing infinite loops.
Step-by-Step Implementation
Prerequisites
Logical replication requires specific configuration on both the publisher and subscriber.
On the publisher (postgresql.conf):
wal_level = logical # Required for logical decoding
max_replication_slots = 10 # Number of replication slots
max_wal_senders = 10 # Number of concurrent WAL sender processesOn the publisher (pg_hba.conf):
# Allow replication connections from subscriber
host mydb replicator subscriber-ip/32 scram-sha-256
On both servers:
-- Create a dedicated replication user
CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'secure_password';
GRANT SELECT ON ALL TABLES IN SCHEMA public TO replicator;After making configuration changes, restart or reload PostgreSQL:
SELECT pg_reload_conf();
-- Or restart if wal_level changedStep 1: Create the Publication
On the publisher database, create a publication. You can publish all tables, specific tables, or all tables in a schema:
-- Publish all tables in the database
CREATE PUBLICATION my_pub FOR ALL TABLES;
-- Publish specific tables only
CREATE PUBLICATION my_pub FOR TABLE orders, customers, products;
-- Publish all tables in a specific schema (PostgreSQL 15+)
CREATE PUBLICATION my_pub FOR TABLES IN SCHEMA sales;Step 2: Create the Subscription
On the subscriber database, create the matching table structures first, then create the subscription:
-- On subscriber: create the table structure (must exist before subscribing)
CREATE TABLE orders (
id BIGSERIAL PRIMARY KEY,
customer_id BIGINT NOT NULL,
total_cents BIGINT NOT NULL,
status VARCHAR(20) NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- Create the subscription
CREATE SUBSCRIPTION my_sub
CONNECTION 'host=publisher-server dbname=mydb user=replicator password=secure_password'
PUBLICATION my_pub;Step 3: Monitor Initial Synchronization
Check the status of the subscription and table synchronization:
-- On subscriber: check subscription status
SELECT * FROM pg_stat_subscription;
-- Check per-table sync status
SELECT * FROM pg_subscription_rel;
-- On publisher: check replication slot status
SELECT * FROM pg_replication_slots;The pg_subscription_rel view shows the sync state of each table: i (init), d (data copy), s (synchronized), r (ready/streaming). Tables transition through these states as they complete initial sync.
Step 4: Verify Replication
Insert some data on the publisher and verify it appears on the subscriber:
-- On publisher
INSERT INTO orders (customer_id, total_cents, status)
VALUES (1, 4999, 'pending');
-- On subscriber (after a brief delay)
SELECT * FROM orders;Real-World Use Cases
Use Case 1: Zero-Downtime Database Migration
When migrating from one PostgreSQL server to another (e.g., upgrading hardware, changing cloud providers, or upgrading major versions), logical replication enables near-zero downtime:
- Set up the new server as a subscriber to the old server
- Wait for initial data sync to complete
- Both servers are now in sync
- Briefly stop writes on the old server (or use a proxy)
- Switch application connection strings to the new server
- Resume writes on the new server
This approach typically results in seconds of downtime rather than hours.
Use Case 2: Read Replicas for Specific Workloads
Instead of replicating the entire database to a read replica, you can replicate only the tables needed for a specific workload:
-- Publish only analytics-relevant tables
CREATE PUBLICATION analytics_pub FOR TABLE events, pageviews, sessions;
-- On the analytics replica
CREATE SUBSCRIPTION analytics_sub
CONNECTION 'host=primary dbname=mydb user=replicator'
PUBLICATION analytics_pub;This reduces replication overhead on the primary and storage requirements on the replica.
Use Case 3: Data Distribution Across Regions
For applications with users in multiple regions, you can replicate specific data subsets to regional servers:
-- On the US region server, subscribe to global product catalog
CREATE SUBSCRIPTION us_products_sub
CONNECTION 'host=global-server dbname=mydb user=replicator'
PUBLICATION global_products_pub;
-- Add a WHERE clause to filter (PostgreSQL 15+ with row filtering)
CREATE PUBLICATION us_orders_pub FOR TABLE orders WHERE (region = 'US');Best Practices for Production
-
Monitor replication lag continuously: Set up alerting on replication lag using
pg_stat_subscription.received_lsnandpg_stat_replication.replay_lag. A growing lag indicates the subscriber cannot keep up. -
Manage replication slots proactively: Inactive replication slots prevent WAL cleanup and can fill your disk. Use
pg_drop_replication_slot()to remove slots that are no longer needed, and monitorpg_replication_slotsfor slots with largerestart_lsngaps. -
Use dedicated replication users: Create a dedicated
replicatorrole with only the minimum required privileges:REPLICATIONandSELECTon published tables. -
Configure
max_sync_workers_per_subscription: During initial sync, PostgreSQL uses worker processes to copy table data in parallel. Increase this setting (default is 2) if you have many large tables to sync. -
Test failover scenarios: If the publisher fails and you promote a standby, you need to recreate subscriptions pointing to the new publisher. Test this process before you need it in production.
-
Use
ENABLED = falseduring setup: Create subscriptions withCREATE SUBSCRIPTION ... ENABLED = falseto configure them without starting replication immediately. This lets you verify the setup before data starts flowing. -
Set appropriate
wal_sender_timeoutandwal_receiver_timeout: These settings control how long the system waits before declaring a replication connection dead. Adjust them based on your network conditions. -
Consider
synchronous_commitsettings: Logical replication does not guarantee that changes are applied on the subscriber before the publisher transaction commits. If you need stronger guarantees, usesynchronous_commit = 'remote_apply'but be aware of the performance impact.
Common Pitfalls and Solutions
| Pitfall | Impact | Solution |
|---|---|---|
| Replication slot left inactive | WAL accumulates, fills disk | Monitor pg_replication_slots; drop unused slots; set max_slot_wal_keep_size |
| Subscriber table has different schema | Replication errors, data inconsistency | Ensure subscriber table structure matches publisher before subscribing |
| Primary key missing on subscriber table | UPDATE/DELETE operations fail | Always define primary keys on subscriber tables (or unique replica identity) |
| Large initial sync on busy tables | Performance impact on publisher | Schedule initial sync during low-traffic periods; increase max_sync_workers |
| Conflicting writes on both sides (multi-master) | Silent data loss or errors | Use conflict resolution (PG 15+) or application-level deduplication |
| Sequences not replicated | IDs conflict between publisher and subscriber | Sequences are not replicated; set different ranges or use UUIDs |
| DDL changes not replicated | Schema drift between publisher and subscriber | Apply DDL changes manually on both sides or use event triggers |
Performance Optimization
Tuning WAL Settings
Logical replication generates more WAL traffic than physical replication because the WAL must be decoded into logical changes:
-- Monitor WAL generation rate
SELECT * FROM pg_stat_wal;
-- Increase max_wal_size if WAL is being recycled too quickly
ALTER SYSTEM SET max_wal_size = '4GB';
SELECT pg_reload_conf();Reducing Replication Overhead
For high-throughput systems, logical replication can add measurable overhead. Reduce it by:
-- Replicate only the columns you need (PostgreSQL 15+)
CREATE PUBLICATION minimal_pub FOR TABLE orders (id, status, total_cents);
-- Use row filtering to reduce data volume (PostgreSQL 15+)
CREATE PUBLICATION active_orders_pub FOR TABLE orders WHERE (status != 'archived');Monitoring with pg_stat_replication
-- On publisher: see all active replication connections
SELECT
pid,
application_name,
client_addr,
state,
sent_lsn,
write_lsn,
flush_lsn,
replay_lsn,
sync_state,
reply_time
FROM pg_stat_replication;
-- Calculate lag in bytes
SELECT
application_name,
pg_wal_lsn_diff(sent_lsn, replay_lsn) AS lag_bytes
FROM pg_stat_replication;Comparison with Alternatives
| Feature | Logical Replication | Physical Replication | pglogical Extension | External Tools (Debezium) |
|---|---|---|---|---|
| Selective table replication | Yes | No | Yes | Yes |
| Cross-version support | Yes | No | Yes | Yes |
| Built-in to PostgreSQL | Yes (10+) | Yes | No (extension) | No |
| Writeable subscriber | Yes | No | Yes | N/A |
| Conflict resolution | PG 15+ | N/A | Yes | Application-level |
| DDL replication | No | Yes (binary copy) | No | Partial |
| Row filtering | PG 15+ | No | Yes | Yes |
| Performance overhead | Moderate | Low | Moderate | High (CDC pipeline) |
Advanced Patterns
Combining with Triggers for Audit Logging
You can use logical replication to populate a subscriber with audit logging triggers:
-- On the subscriber (not replicated back)
CREATE OR REPLACE FUNCTION audit_changes()
RETURNS TRIGGER AS $$
BEGIN
INSERT INTO audit_log (table_name, operation, row_data, changed_at)
VALUES (TG_TABLE_NAME, TG_OP, row_to_json(NEW), NOW());
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER orders_audit
AFTER INSERT OR UPDATE OR DELETE ON orders
FOR EACH ROW EXECUTE FUNCTION audit_changes();This pattern offloads audit logging from the primary to the subscriber, reducing write overhead on the production database.
Using pgoutput for Custom Decoding
The default output plugin for logical replication is pgoutput. For custom change data capture (CDC) pipelines, you can use test_decoding or build a custom output plugin:
-- Test logical decoding
SELECT * FROM pg_logical_slot_get_changes('my_slot', NULL, NULL);Testing Strategies
Testing Replication Setup in Development
-- On publisher: verify publication exists
SELECT * FROM pg_publication;
SELECT * FROM pg_publication_tables;
-- On subscriber: verify subscription and sync status
SELECT subname, subenabled, subconninfo FROM pg_subscription;
SELECT srsubid, srrelid, srsubstate FROM pg_subscription_rel;
-- Force a sync check
ALTER SUBSCRIPTION my_sub DISABLE;
ALTER SUBSCRIPTION my_sub ENABLE;Simulating Failover
Test your failover procedure by:
- Stopping the publisher
- Promoting a physical standby (if applicable)
- Updating subscription connection strings:
ALTER SUBSCRIPTION my_sub CONNECTION 'host=new-publisher dbname=mydb user=replicator';- Verifying replication resumes
Future Outlook
PostgreSQL continues to enhance logical replication with each major release. PostgreSQL 16 added improvements to logical replication for partitioned tables and bidirectional replication. Future releases are expected to include built-in conflict resolution for all replication topologies, improved DDL replication support, and better integration with external CDC tools.
The trend toward distributed and multi-region architectures makes logical replication increasingly important. As PostgreSQL evolves, expect logical replication to become the default replication mechanism for most use cases, with physical replication reserved for disaster recovery and standby server scenarios.
Conclusion
PostgreSQL logical replication is a powerful, built-in mechanism for real-time data synchronization. It enables selective table replication, cross-version compatibility, zero-downtime migrations, and flexible data distribution architectures. The key takeaways are:
- Use logical replication when you need selective, fine-grained data replication
- Always monitor replication slots to prevent disk space exhaustion
- Ensure subscriber tables have primary keys (or custom replica identity) before subscribing
- Schedule initial data sync during low-traffic periods for large tables
- Use PostgreSQL 15+ features like row filtering and column lists to reduce replication overhead
- Test failover procedures before you need them in production
- Consider the trade-offs between logical replication and external CDC tools like Debezium
- Monitor lag continuously and alert on growing gaps between publisher and subscriber
Start by setting up logical replication between two development instances to understand the mechanics. Once comfortable, implement it in staging to test your specific workload and failover scenarios. The PostgreSQL documentation on logical replication is comprehensive and the best reference for production deployments.