Introduction
PostgreSQL 17, released in September 2024, continues the tradition of major releases that push the boundaries of what an open-source relational database can do. Every major PostgreSQL release brings a mix of performance improvements, new SQL features, and operational enhancements that make it easier to run PostgreSQL at scale. PostgreSQL 17 is no exception—it delivers some of the most requested features from the community, including incremental backup support, the long-awaited JSON_TABLE function, significant improvements to the MERGE command, and a host of query optimizer enhancements that make complex queries faster without any application changes.
For teams running PostgreSQL in production, every release represents an opportunity to reduce operational overhead, improve query performance, and adopt new SQL patterns that simplify application code. PostgreSQL 17 is particularly significant because it bridges the gap between PostgreSQL and commercial databases by implementing SQL/JSON standard features that application developers have been requesting for years.
This guide covers the most impactful features in PostgreSQL 17, explains how they work internally, provides practical examples for adopting them, and discusses the migration considerations for teams planning an upgrade.
Understanding PostgreSQL 17: Core Concepts
Incremental Backup with pg_basebackup
Before PostgreSQL 17, taking a backup of a large PostgreSQL cluster meant either running pg_basebackup for a full physical backup (which copies every data file regardless of whether it changed) or using pg_dump for a logical backup (which is slower and doesn't preserve the physical layout). For a 5TB database, a full pg_basebackup takes hours and transfers 5TB of data every time—even if only 10GB changed since the last backup.
PostgreSQL 17 introduces incremental backup support through pg_basebackup --incremental. This feature uses the WAL (Write-Ahead Log) to track which data blocks have changed since a previous backup. When you run an incremental backup, PostgreSQL only copies the blocks that have been modified, dramatically reducing backup time and storage requirements.
The mechanism works through a new concept called a "backup manifest." When you take a full backup, PostgreSQL generates a manifest file that records the checksum and LSN (Log Sequence Number) of every block in the database. When you later take an incremental backup against that full backup, PostgreSQL compares each block's current LSN with the manifest. Blocks that haven't changed since the full backup are skipped. The result is a delta backup that contains only the changed blocks, which can be applied to the full backup to restore the database to the incremental backup's point in time.
JSON_TABLE: Bridging SQL and JSON
The JSON_TABLE function is arguably the most requested SQL feature in recent PostgreSQL history. It allows you to extract data from a JSON document and present it as a virtual relational table that you can join, filter, and aggregate with standard SQL. This bridges the gap between document-oriented data storage and relational querying.
Before JSON_TABLE, working with JSON arrays in PostgreSQL required awkward combinations of json_array_elements() with lateral joins and multiple ->> operators. The resulting queries were hard to read, difficult to optimize, and didn't compose well with other SQL features.
MERGE Enhancements
The MERGE command, introduced in PostgreSQL 15, allows you to perform INSERT, UPDATE, or DELETE operations in a single statement based on a join condition. PostgreSQL 17 adds several enhancements including support for RETURNING clauses (which was a major gap in the original implementation), the ability to reference the target table in subqueries within the MERGE statement, and improved error handling with better constraint violation messages.
Query Optimizer Improvements
PostgreSQL 17 includes several optimizer improvements that make complex queries faster without application changes. Notable improvements include better estimation for queries with multiple range conditions, improved handling of IN lists with many values, more efficient plans for queries that combine GROUP BY with ORDER BY and LIMIT, and enhanced statistics for JSON and array columns.
Architecture and Design Patterns
Incremental Backup Architecture
The incremental backup architecture introduces a three-tier backup strategy:
Full backups serve as the base. You take a full backup periodically (weekly or monthly) using pg_basebackup. This backup includes the complete data directory and a backup manifest.
Incremental backups are taken more frequently (daily or hourly) against the most recent full backup. Each incremental backup only contains the changed blocks, making it fast to take and small to store.
WAL archiving captures the write-ahead log continuously, providing point-in-time recovery capability between backup points.
To restore from this backup chain, you use pg_combinebackup to merge the full backup with subsequent incremental backups, then replay WAL archives to reach the desired point in time.
SQL/JSON Processing Architecture
PostgreSQL 17's SQL/JSON implementation follows the ISO/IEC 9075-2 standard, which defines a set of SQL functions and clauses for working with JSON data. The key architectural decision is that JSON operations can participate in the query optimizer's planning process. When you write a JSON_TABLE expression, the optimizer can push predicates down into the JSON processing, avoiding the materialization of unnecessary rows.
This is a significant improvement over the pre-17 approach where json_array_elements() would always materialize the entire array before any filtering could happen. With JSON_TABLE, the optimizer understands the relationship between the JSON source and the virtual columns, enabling more efficient execution plans.
Step-by-Step Implementation
Setting Up Incremental Backups
First, configure WAL archiving (required for incremental backups):
-- postgresql.conf
wal_level = replica
archive_mode = on
archive_command = 'cp %p /archive/%f'Take a full backup with manifest:
pg_basebackup -D /backups/full -Ft -z --manifest-checksums=SHA256 \
--manifest-force-encode \
-X fetchTake an incremental backup against the full backup:
pg_basebackup -D /backups/incr1 -Ft -z \
--incremental=/backups/full/backup_manifest \
-X fetchCombine backups for restoration:
pg_combinebackup /backups/full /backups/incr1 -o /backups/combinedUsing JSON_TABLE
Extract structured data from JSON documents into a relational result set:
-- Sample data: orders with JSON line items
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
customer_id INT NOT NULL,
order_date DATE NOT NULL,
line_items JSONB NOT NULL
);
INSERT INTO orders (customer_id, order_date, line_items) VALUES
(1, '2024-09-01', '[{"sku":"A1","qty":2,"price":29.99},{"sku":"B2","qty":1,"price":49.99}]'),
(2, '2024-09-02', '[{"sku":"A1","qty":5,"price":29.99},{"sku":"C3","qty":3,"price":19.99}]');
-- Use JSON_TABLE to extract line items into a virtual table
SELECT
o.id AS order_id,
o.customer_id,
jt.sku,
jt.quantity,
jt.unit_price,
jt.quantity * jt.unit_price AS line_total
FROM orders o,
JSON_TABLE(
o.line_items,
'$[*]' COLUMNS (
sku TEXT PATH '$.sku',
quantity INT PATH '$.qty',
unit_price NUMERIC PATH '$.price'
)
) AS jt
WHERE jt.quantity > 1;
-- Result:
-- order_id | customer_id | sku | quantity | unit_price | line_total
-- ---------+-------------+-----+----------+------------+-----------
-- 1 | 1 | A1 | 2 | 29.99 | 59.98
-- 2 | 1 | A1 | 5 | 29.99 | 149.95
-- 2 | 2 | C3 | 3 | 19.99 | 59.97JSON_TABLE with Nested JSON and Error Handling
Handle nested JSON objects and provide defaults for missing fields:
SELECT
o.id,
jt.customer_name,
jt.shipping_city,
jt.shipping_country
FROM orders o,
JSON_TABLE(
o.metadata,
'$' COLUMNS (
customer_name TEXT PATH '$.customer.name' DEFAULT 'Unknown' ON EMPTY,
NESTED PATH '$.shipping_address[*]' COLUMNS (
city TEXT PATH '$.city',
country TEXT PATH '$.country' DEFAULT 'US' ON EMPTY,
zip TEXT PATH '$.zip'
)
)
ERROR ON ERROR
) AS jt;Enhanced MERGE with RETURNING
PostgreSQL 17's MERGE now supports RETURNING, which was the most notable gap in the original implementation:
-- Upsert inventory with full audit trail
MERGE INTO inventory AS target
USING (
VALUES
('SKU-A1', 100, CURRENT_DATE),
('SKU-B2', 50, CURRENT_DATE),
('SKU-C3', 200, CURRENT_DATE)
) AS source(sku, quantity, last_updated)
ON target.sku = source.sku
WHEN MATCHED AND target.quantity + source.quantity > 1000 THEN
DELETE
WHEN MATCHED THEN
UPDATE SET
quantity = target.quantity + source.quantity,
last_updated = source.last_updated
WHEN NOT MATCHED THEN
INSERT (sku, quantity, last_updated)
VALUES (source.sku, source.quantity, source.last_updated)
RETURNING
merge_action() AS action,
sku,
quantity,
last_updated;
-- Result:
-- action | sku | quantity | last_updated
-- --------+--------+----------+-------------
-- INSERT | SKU-A1 | 100 | 2024-09-10
-- UPDATE | SKU-B2 | 275 | 2024-09-10
-- INSERT | SKU-C3 | 200 | 2024-09-10Aggregating JSON Data with JSON_OBJECTAGG and JSON_ARRAYAGG
PostgreSQL 17 enhances JSON aggregate functions with improved performance and additional options:
-- Build a JSON summary of orders per customer
SELECT
c.name,
JSON_OBJECTAGG(
KEY o.order_date
VALUE JSON_OBJECT(
'order_id' VALUE o.id,
'total' VALUE o.total,
'item_count' VALUE o.item_count
)
ABSENT ON NULL
) AS orders_by_date
FROM customers c
JOIN orders o ON o.customer_id = c.id
GROUP BY c.name;Real-World Use Cases and Case Studies
Use Case 1: ETL Pipeline Simplification
A data engineering team processes millions of JSON events daily from an event streaming platform. Previously, they used a combination of json_array_elements() with lateral joins and multiple ->> extractions to parse event payloads. The queries were 50+ lines long and took minutes to execute on large batches. After upgrading to PostgreSQL 17 and adopting JSON_TABLE, the same queries are 15 lines long, run 3x faster due to optimizer pushdown, and are significantly easier for new team members to understand.
Use Case 2: Backup Cost Reduction for Large Databases
A SaaS company running a 2TB PostgreSQL database was spending 4 hours on each nightly backup and storing 2TB of backup data per day (14TB per week). After adopting incremental backups, nightly backups complete in 15-30 minutes (only 50-200GB of changed blocks), and weekly storage requirements dropped from 14TB to approximately 3TB (one full backup plus six incrementals).
Use Case 3: Complex Upsert Workflows
A financial application needs to process daily transaction reconciliation files that contain a mix of new transactions, updates to existing ones, and corrections that should trigger deletions. Before PostgreSQL 17's enhanced MERGE, this required separate INSERT, UPDATE, and DELETE statements wrapped in a transaction, with complex logic to determine which operation to apply. The MERGE statement with RETURNING consolidates this into a single, readable statement that also returns a log of what changed for the audit trail.
Best Practices for Production
-
Test incremental backup restoration before relying on it: Always verify your backup chain by restoring to a test instance. Run
pg_combinebackupand replay WAL to ensure the restored database is consistent and complete. -
Use JSON_TABLE for complex JSON extraction: Replace convoluted
json_array_elements()+ lateral join patterns withJSON_TABLE. It produces cleaner code and enables optimizer optimizations that the old approach cannot. -
Set appropriate manifest checksums for backups: Use SHA256 checksums in backup manifests for strong integrity verification. This adds minimal overhead to backup time but catches corruption early.
-
Index JSON paths used in JSON_TABLE predicates: Create GIN indexes on JSONB columns or expression indexes on frequently queried JSON paths to speed up
JSON_TABLEoperations. -
Monitor MERGE performance with EXPLAIN ANALYZE: The MERGE command can generate complex execution plans. Use
EXPLAIN ANALYZEto verify that the optimizer is using efficient join strategies and index scans. -
Plan your backup retention strategy: With incremental backups, you need to manage the chain of full + incremental backups. Keep the last full backup and all subsequent incrementals until the next full backup is verified.
-
Upgrade test environments first: Test your application queries against PostgreSQL 17 in a staging environment before production upgrade. The optimizer changes may produce different execution plans for existing queries.
-
Use the new
merge_action()function for auditing: In MERGE statements with RETURNING, callmerge_action()to identify whether each returned row was inserted, updated, or deleted. This is invaluable for audit logging and change data capture.
Common Pitfalls and Solutions
| Pitfall | Impact | Solution |
|---|---|---|
| Losing incremental backup chain | Cannot restore to latest point | Always keep the full backup manifest until the next full backup; use pg_verifybackup to validate chains |
| JSON_TABLE on unindexed JSONB columns | Slow full table scans | Create GIN indexes on JSONB columns or expression indexes on extracted paths |
| MERGE without proper unique constraints | Duplicate rows on concurrent inserts | Ensure the ON clause references columns with a unique constraint |
| Upgrading without testing optimizer changes | Slower queries after upgrade | Run EXPLAIN ANALYZE on critical queries in staging before production upgrade |
| Using JSON_TABLE when a normalized schema is better | Performance degradation at scale | Use JSON_TABLE for flexible/exploratory data; normalize when the schema is stable |
| Not archiving WAL before enabling incremental backup | Backup failures | Configure archive_mode = on and verify WAL archiving works before taking incremental backups |
Performance Optimization
Optimizing JSON_TABLE Queries
-- Create a GIN index on the JSONB column for faster filtering
CREATE INDEX idx_orders_line_items ON orders USING GIN (line_items);
-- Use a partial index for frequently queried JSON paths
CREATE INDEX idx_orders_sku_a1 ON orders
USING GIN ((line_items -> 'sku'))
WHERE line_items @> '[{"sku": "A1"}]';
-- Example query that benefits from the index
SELECT o.id, jt.sku, jt.qty
FROM orders o,
JSON_TABLE(o.line_items, '$[*]' COLUMNS (sku TEXT PATH '$.sku', qty INT PATH '$.qty')) AS jt
WHERE jt.sku = 'A1';Backup Performance Tuning
# Use parallel workers for faster backup
pg_basebackup -D /backups/full -Ft -z -X fetch \
--manifest-checksums=SHA256 \
--checkpoint=fast \
--max-rate=500M # Limit I/O rate to prevent production impactComparison with Alternatives
| Feature | PostgreSQL 17 | MySQL 8.0 | SQL Server 2022 | Oracle 23c |
|---|---|---|---|---|
| Incremental Backup | Native (pg_basebackup) | InnoDB redo log | Native | RMAN incremental |
| JSON_TABLE | Full SQL/JSON standard | JSON_TABLE (limited) | OPENJSON (non-standard) | Full SQL/JSON |
| MERGE with RETURNING | Yes (new in 17) | INSERT ON DUPLICATE KEY | MERGE with OUTPUT | MERGE with RETURNING |
| Backup Manifests | SHA256 checksums | N/A | Checksum verification | RMAN validation |
| SQL/JSON Compliance | High | Partial | Partial | High |
| Cost | Free (open-source) | Free + paid editions | Paid licenses | Paid licenses |
Advanced Patterns and Techniques
LATERAL + JSON_TABLE for Complex Transformations
Combine JSON_TABLE with LATERAL joins for multi-step JSON transformations:
SELECT
o.id,
jt.product_name,
jt.variants.name AS variant_name,
jt.variants.price AS variant_price
FROM orders o,
LATERAL (
SELECT * FROM JSON_TABLE(
o.line_items,
'$[*]' COLUMNS (
product_name TEXT PATH '$.name',
NESTED PATH '$.variants[*]' COLUMNS (
name TEXT PATH '$.name',
price NUMERIC PATH '$.price',
sku TEXT PATH '$.sku'
)
)
)
) AS jt
WHERE jt.variants.price > 20;Backup Rotation Script
Automate your backup rotation with a shell script:
#!/bin/bash
BACKUP_DIR="/backups"
FULL_BACKUP_INTERVAL_DAYS=7
LATEST_FULL=$(ls -td ${BACKUP_DIR}/full_* 2>/dev/null | head -1)
# Check if full backup is older than interval
if [ -z "$LATEST_FULL" ] || \
[ $(find "$LATEST_FULL" -maxdepth 0 -mtime +${FULL_BACKUP_INTERVAL_DAYS}) ]; then
echo "Taking full backup..."
pg_basebackup -D "${BACKUP_DIR}/full_$(date +%Y%m%d)" -Ft -z \
--manifest-checksums=SHA256 -X fetch
else
echo "Taking incremental backup against $(basename $LATEST_FULL)..."
pg_basebackup -D "${BACKUP_DIR}/incr_$(date +%Y%m%d_%H%M)" -Ft -z \
--incremental="${LATEST_FULL}/backup_manifest" -X fetch
fiTesting Strategies
Test your PostgreSQL 17 features with comprehensive integration tests:
-- Test JSON_TABLE extraction
DO $$
DECLARE
result_count INT;
BEGIN
SELECT COUNT(*) INTO result_count
FROM (VALUES ('[{"a":1},{"a":2},{"a":3}]'::jsonb)) AS t(data),
JSON_TABLE(t.data, '$[*]' COLUMNS (a INT PATH '$.a')) AS jt;
ASSERT result_count = 3, 'Expected 3 rows from JSON_TABLE';
RAISE NOTICE 'JSON_TABLE test passed: % rows', result_count;
END;
$$;
-- Test MERGE with RETURNING
CREATE TEMP TABLE target (id INT PRIMARY KEY, val TEXT);
INSERT INTO target VALUES (1, 'old'), (3, 'keep');
WITH result AS (
MERGE INTO target t
USING (VALUES (1, 'new'), (2, 'insert')) AS s(id, val)
ON t.id = s.id
WHEN MATCHED THEN UPDATE SET val = s.val
WHEN NOT MATCHED THEN INSERT VALUES (s.id, s.val)
RETURNING merge_action() AS action, id, val
)
SELECT * FROM result;
-- Expected: (UPDATE, 1, 'new'), (INSERT, 2, 'insert'), existing (3, 'keep') unchangedFuture Outlook
PostgreSQL 17 cements the database's position as the leading open-source relational database. The SQL/JSON features bring it to parity with commercial databases, while the incremental backup support addresses one of the most significant operational pain points for large deployments.
Looking ahead, the PostgreSQL community is working on parallel query improvements, logical replication enhancements for partitioned tables, and native columnar storage for analytical workloads. The trend toward making PostgreSQL a universal database—handling both transactional and analytical workloads—continues to accelerate.
Conclusion
PostgreSQL 17 delivers transformative features that impact both developers and operations teams. The incremental backup support alone justifies an upgrade for teams managing large databases, while the SQL/JSON improvements dramatically simplify working with JSON data in SQL.
Key takeaways:
- Incremental backups reduce backup time and storage by orders of magnitude for large databases
- JSON_TABLE bridges the gap between JSON documents and relational queries with optimizer-aware processing
- MERGE with RETURNING enables complex upsert patterns with full audit capability
- Query optimizer improvements deliver automatic performance gains without application changes
- Test your workloads against PostgreSQL 17 in staging before upgrading production
- Plan your backup rotation strategy around the full + incremental backup chain
- The SQL/JSON compliance makes PostgreSQL viable for workloads that previously required commercial databases