PostgreSQL 17: New Features and Improvements

Introduction

PostgreSQL 17, released in September 2024, continues the tradition of major releases that push the boundaries of what an open-source relational database can do. Every major PostgreSQL release brings a mix of performance improvements, new SQL features, and operational enhancements that make it easier to run PostgreSQL at scale. PostgreSQL 17 is no exception—it delivers some of the most requested features from the community, including incremental backup support, the long-awaited JSON_TABLE function, significant improvements to the MERGE command, and a host of query optimizer enhancements that make complex queries faster without any application changes.

For teams running PostgreSQL in production, every release represents an opportunity to reduce operational overhead, improve query performance, and adopt new SQL patterns that simplify application code. PostgreSQL 17 is particularly significant because it bridges the gap between PostgreSQL and commercial databases by implementing SQL/JSON standard features that application developers have been requesting for years.

This guide covers the most impactful features in PostgreSQL 17, explains how they work internally, provides practical examples for adopting them, and discusses the migration considerations for teams planning an upgrade.

Understanding PostgreSQL 17: Core Concepts

Incremental Backup with pg_basebackup

Before PostgreSQL 17, taking a backup of a large PostgreSQL cluster meant either running pg_basebackup for a full physical backup (which copies every data file regardless of whether it changed) or using pg_dump for a logical backup (which is slower and doesn't preserve the physical layout). For a 5TB database, a full pg_basebackup takes hours and transfers 5TB of data every time—even if only 10GB changed since the last backup.

PostgreSQL 17 introduces incremental backup support through pg_basebackup --incremental. This feature uses the WAL (Write-Ahead Log) to track which data blocks have changed since a previous backup. When you run an incremental backup, PostgreSQL only copies the blocks that have been modified, dramatically reducing backup time and storage requirements.

The mechanism works through a new concept called a "backup manifest." When you take a full backup, PostgreSQL generates a manifest file that records the checksum and LSN (Log Sequence Number) of every block in the database. When you later take an incremental backup against that full backup, PostgreSQL compares each block's current LSN with the manifest. Blocks that haven't changed since the full backup are skipped. The result is a delta backup that contains only the changed blocks, which can be applied to the full backup to restore the database to the incremental backup's point in time.

JSON_TABLE: Bridging SQL and JSON

The JSON_TABLE function is arguably the most requested SQL feature in recent PostgreSQL history. It allows you to extract data from a JSON document and present it as a virtual relational table that you can join, filter, and aggregate with standard SQL. This bridges the gap between document-oriented data storage and relational querying.

Before JSON_TABLE, working with JSON arrays in PostgreSQL required awkward combinations of json_array_elements() with lateral joins and multiple ->> operators. The resulting queries were hard to read, difficult to optimize, and didn't compose well with other SQL features.

MERGE Enhancements

The MERGE command, introduced in PostgreSQL 15, allows you to perform INSERT, UPDATE, or DELETE operations in a single statement based on a join condition. PostgreSQL 17 adds several enhancements including support for RETURNING clauses (which was a major gap in the original implementation), the ability to reference the target table in subqueries within the MERGE statement, and improved error handling with better constraint violation messages.

Query Optimizer Improvements

PostgreSQL 17 includes several optimizer improvements that make complex queries faster without application changes. Notable improvements include better estimation for queries with multiple range conditions, improved handling of IN lists with many values, more efficient plans for queries that combine GROUP BY with ORDER BY and LIMIT, and enhanced statistics for JSON and array columns.

Architecture and Design Patterns

Incremental Backup Architecture

The incremental backup architecture introduces a three-tier backup strategy:

Full backups serve as the base. You take a full backup periodically (weekly or monthly) using pg_basebackup. This backup includes the complete data directory and a backup manifest.

Incremental backups are taken more frequently (daily or hourly) against the most recent full backup. Each incremental backup only contains the changed blocks, making it fast to take and small to store.

WAL archiving captures the write-ahead log continuously, providing point-in-time recovery capability between backup points.

To restore from this backup chain, you use pg_combinebackup to merge the full backup with subsequent incremental backups, then replay WAL archives to reach the desired point in time.

SQL/JSON Processing Architecture

PostgreSQL 17's SQL/JSON implementation follows the ISO/IEC 9075-2 standard, which defines a set of SQL functions and clauses for working with JSON data. The key architectural decision is that JSON operations can participate in the query optimizer's planning process. When you write a JSON_TABLE expression, the optimizer can push predicates down into the JSON processing, avoiding the materialization of unnecessary rows.

This is a significant improvement over the pre-17 approach where json_array_elements() would always materialize the entire array before any filtering could happen. With JSON_TABLE, the optimizer understands the relationship between the JSON source and the virtual columns, enabling more efficient execution plans.

Step-by-Step Implementation

Setting Up Incremental Backups

First, configure WAL archiving (required for incremental backups):

-- postgresql.conf
wal_level = replica
archive_mode = on
archive_command = 'cp %p /archive/%f'

Take a full backup with manifest:

pg_basebackup -D /backups/full -Ft -z --manifest-checksums=SHA256 \
  --manifest-force-encode \
  -X fetch

Take an incremental backup against the full backup:

pg_basebackup -D /backups/incr1 -Ft -z \
  --incremental=/backups/full/backup_manifest \
  -X fetch

Combine backups for restoration:

pg_combinebackup /backups/full /backups/incr1 -o /backups/combined

Using JSON_TABLE

Extract structured data from JSON documents into a relational result set:

-- Sample data: orders with JSON line items
CREATE TABLE orders (
  id SERIAL PRIMARY KEY,
  customer_id INT NOT NULL,
  order_date DATE NOT NULL,
  line_items JSONB NOT NULL
);
 
INSERT INTO orders (customer_id, order_date, line_items) VALUES
(1, '2024-09-01', '[{"sku":"A1","qty":2,"price":29.99},{"sku":"B2","qty":1,"price":49.99}]'),
(2, '2024-09-02', '[{"sku":"A1","qty":5,"price":29.99},{"sku":"C3","qty":3,"price":19.99}]');
 
-- Use JSON_TABLE to extract line items into a virtual table
SELECT
  o.id AS order_id,
  o.customer_id,
  jt.sku,
  jt.quantity,
  jt.unit_price,
  jt.quantity * jt.unit_price AS line_total
FROM orders o,
JSON_TABLE(
  o.line_items,
  '$[*]' COLUMNS (
    sku TEXT PATH '$.sku',
    quantity INT PATH '$.qty',
    unit_price NUMERIC PATH '$.price'
  )
) AS jt
WHERE jt.quantity > 1;
 
-- Result:
-- order_id | customer_id | sku | quantity | unit_price | line_total
-- ---------+-------------+-----+----------+------------+-----------
--        1 |           1 | A1  |        2 |      29.99 |     59.98
--        2 |           1 | A1  |        5 |      29.99 |    149.95
--        2 |           2 | C3  |        3 |      19.99 |     59.97

JSON_TABLE with Nested JSON and Error Handling

Handle nested JSON objects and provide defaults for missing fields:

SELECT
  o.id,
  jt.customer_name,
  jt.shipping_city,
  jt.shipping_country
FROM orders o,
JSON_TABLE(
  o.metadata,
  '$' COLUMNS (
    customer_name TEXT PATH '$.customer.name' DEFAULT 'Unknown' ON EMPTY,
    NESTED PATH '$.shipping_address[*]' COLUMNS (
      city TEXT PATH '$.city',
      country TEXT PATH '$.country' DEFAULT 'US' ON EMPTY,
      zip TEXT PATH '$.zip'
    )
  )
  ERROR ON ERROR
) AS jt;

Enhanced MERGE with RETURNING

PostgreSQL 17's MERGE now supports RETURNING, which was the most notable gap in the original implementation:

-- Upsert inventory with full audit trail
MERGE INTO inventory AS target
USING (
  VALUES
    ('SKU-A1', 100, CURRENT_DATE),
    ('SKU-B2', 50, CURRENT_DATE),
    ('SKU-C3', 200, CURRENT_DATE)
) AS source(sku, quantity, last_updated)
ON target.sku = source.sku
WHEN MATCHED AND target.quantity + source.quantity > 1000 THEN
  DELETE
WHEN MATCHED THEN
  UPDATE SET
    quantity = target.quantity + source.quantity,
    last_updated = source.last_updated
WHEN NOT MATCHED THEN
  INSERT (sku, quantity, last_updated)
  VALUES (source.sku, source.quantity, source.last_updated)
RETURNING
  merge_action() AS action,
  sku,
  quantity,
  last_updated;
 
-- Result:
-- action  | sku    | quantity | last_updated
-- --------+--------+----------+-------------
-- INSERT  | SKU-A1 |      100 | 2024-09-10
-- UPDATE  | SKU-B2 |      275 | 2024-09-10
-- INSERT  | SKU-C3 |      200 | 2024-09-10

Aggregating JSON Data with JSON_OBJECTAGG and JSON_ARRAYAGG

PostgreSQL 17 enhances JSON aggregate functions with improved performance and additional options:

-- Build a JSON summary of orders per customer
SELECT
  c.name,
  JSON_OBJECTAGG(
    KEY o.order_date
    VALUE JSON_OBJECT(
      'order_id' VALUE o.id,
      'total' VALUE o.total,
      'item_count' VALUE o.item_count
    )
    ABSENT ON NULL
  ) AS orders_by_date
FROM customers c
JOIN orders o ON o.customer_id = c.id
GROUP BY c.name;

Real-World Use Cases and Case Studies

Use Case 1: ETL Pipeline Simplification

A data engineering team processes millions of JSON events daily from an event streaming platform. Previously, they used a combination of json_array_elements() with lateral joins and multiple ->> extractions to parse event payloads. The queries were 50+ lines long and took minutes to execute on large batches. After upgrading to PostgreSQL 17 and adopting JSON_TABLE, the same queries are 15 lines long, run 3x faster due to optimizer pushdown, and are significantly easier for new team members to understand.

Use Case 2: Backup Cost Reduction for Large Databases

A SaaS company running a 2TB PostgreSQL database was spending 4 hours on each nightly backup and storing 2TB of backup data per day (14TB per week). After adopting incremental backups, nightly backups complete in 15-30 minutes (only 50-200GB of changed blocks), and weekly storage requirements dropped from 14TB to approximately 3TB (one full backup plus six incrementals).

Use Case 3: Complex Upsert Workflows

A financial application needs to process daily transaction reconciliation files that contain a mix of new transactions, updates to existing ones, and corrections that should trigger deletions. Before PostgreSQL 17's enhanced MERGE, this required separate INSERT, UPDATE, and DELETE statements wrapped in a transaction, with complex logic to determine which operation to apply. The MERGE statement with RETURNING consolidates this into a single, readable statement that also returns a log of what changed for the audit trail.

Best Practices for Production

Test incremental backup restoration before relying on it: Always verify your backup chain by restoring to a test instance. Run pg_combinebackup and replay WAL to ensure the restored database is consistent and complete.
Use JSON_TABLE for complex JSON extraction: Replace convoluted json_array_elements() + lateral join patterns with JSON_TABLE. It produces cleaner code and enables optimizer optimizations that the old approach cannot.
Set appropriate manifest checksums for backups: Use SHA256 checksums in backup manifests for strong integrity verification. This adds minimal overhead to backup time but catches corruption early.
Index JSON paths used in JSON_TABLE predicates: Create GIN indexes on JSONB columns or expression indexes on frequently queried JSON paths to speed up JSON_TABLE operations.
Monitor MERGE performance with EXPLAIN ANALYZE: The MERGE command can generate complex execution plans. Use EXPLAIN ANALYZE to verify that the optimizer is using efficient join strategies and index scans.
Plan your backup retention strategy: With incremental backups, you need to manage the chain of full + incremental backups. Keep the last full backup and all subsequent incrementals until the next full backup is verified.
Upgrade test environments first: Test your application queries against PostgreSQL 17 in a staging environment before production upgrade. The optimizer changes may produce different execution plans for existing queries.
Use the new merge_action() function for auditing: In MERGE statements with RETURNING, call merge_action() to identify whether each returned row was inserted, updated, or deleted. This is invaluable for audit logging and change data capture.

Common Pitfalls and Solutions

Pitfall	Impact	Solution
Losing incremental backup chain	Cannot restore to latest point	Always keep the full backup manifest until the next full backup; use `pg_verifybackup` to validate chains
JSON_TABLE on unindexed JSONB columns	Slow full table scans	Create GIN indexes on JSONB columns or expression indexes on extracted paths
MERGE without proper unique constraints	Duplicate rows on concurrent inserts	Ensure the ON clause references columns with a unique constraint
Upgrading without testing optimizer changes	Slower queries after upgrade	Run EXPLAIN ANALYZE on critical queries in staging before production upgrade
Using JSON_TABLE when a normalized schema is better	Performance degradation at scale	Use JSON_TABLE for flexible/exploratory data; normalize when the schema is stable
Not archiving WAL before enabling incremental backup	Backup failures	Configure `archive_mode = on` and verify WAL archiving works before taking incremental backups

Performance Optimization

Optimizing JSON_TABLE Queries

-- Create a GIN index on the JSONB column for faster filtering
CREATE INDEX idx_orders_line_items ON orders USING GIN (line_items);
 
-- Use a partial index for frequently queried JSON paths
CREATE INDEX idx_orders_sku_a1 ON orders
USING GIN ((line_items -> 'sku'))
WHERE line_items @> '[{"sku": "A1"}]';
 
-- Example query that benefits from the index
SELECT o.id, jt.sku, jt.qty
FROM orders o,
JSON_TABLE(o.line_items, '$[*]' COLUMNS (sku TEXT PATH '$.sku', qty INT PATH '$.qty')) AS jt
WHERE jt.sku = 'A1';

Backup Performance Tuning

# Use parallel workers for faster backup
pg_basebackup -D /backups/full -Ft -z -X fetch \
  --manifest-checksums=SHA256 \
  --checkpoint=fast \
  --max-rate=500M  # Limit I/O rate to prevent production impact

Comparison with Alternatives

Feature	PostgreSQL 17	MySQL 8.0	SQL Server 2022	Oracle 23c
Incremental Backup	Native (pg_basebackup)	InnoDB redo log	Native	RMAN incremental
JSON_TABLE	Full SQL/JSON standard	JSON_TABLE (limited)	OPENJSON (non-standard)	Full SQL/JSON
MERGE with RETURNING	Yes (new in 17)	INSERT ON DUPLICATE KEY	MERGE with OUTPUT	MERGE with RETURNING
Backup Manifests	SHA256 checksums	N/A	Checksum verification	RMAN validation
SQL/JSON Compliance	High	Partial	Partial	High
Cost	Free (open-source)	Free + paid editions	Paid licenses	Paid licenses

Advanced Patterns and Techniques

LATERAL + JSON_TABLE for Complex Transformations

Combine JSON_TABLE with LATERAL joins for multi-step JSON transformations:

SELECT
  o.id,
  jt.product_name,
  jt.variants.name AS variant_name,
  jt.variants.price AS variant_price
FROM orders o,
LATERAL (
  SELECT * FROM JSON_TABLE(
    o.line_items,
    '$[*]' COLUMNS (
      product_name TEXT PATH '$.name',
      NESTED PATH '$.variants[*]' COLUMNS (
        name TEXT PATH '$.name',
        price NUMERIC PATH '$.price',
        sku TEXT PATH '$.sku'
      )
    )
  )
) AS jt
WHERE jt.variants.price > 20;

Backup Rotation Script

Automate your backup rotation with a shell script:

#!/bin/bash
BACKUP_DIR="/backups"
FULL_BACKUP_INTERVAL_DAYS=7
LATEST_FULL=$(ls -td ${BACKUP_DIR}/full_* 2>/dev/null | head -1)
 
# Check if full backup is older than interval
if [ -z "$LATEST_FULL" ] || \
   [ $(find "$LATEST_FULL" -maxdepth 0 -mtime +${FULL_BACKUP_INTERVAL_DAYS}) ]; then
  echo "Taking full backup..."
  pg_basebackup -D "${BACKUP_DIR}/full_$(date +%Y%m%d)" -Ft -z \
    --manifest-checksums=SHA256 -X fetch
else
  echo "Taking incremental backup against $(basename $LATEST_FULL)..."
  pg_basebackup -D "${BACKUP_DIR}/incr_$(date +%Y%m%d_%H%M)" -Ft -z \
    --incremental="${LATEST_FULL}/backup_manifest" -X fetch
fi

Testing Strategies

Test your PostgreSQL 17 features with comprehensive integration tests:

-- Test JSON_TABLE extraction
DO $$
DECLARE
  result_count INT;
BEGIN
  SELECT COUNT(*) INTO result_count
  FROM (VALUES ('[{"a":1},{"a":2},{"a":3}]'::jsonb)) AS t(data),
  JSON_TABLE(t.data, '$[*]' COLUMNS (a INT PATH '$.a')) AS jt;
 
  ASSERT result_count = 3, 'Expected 3 rows from JSON_TABLE';
  RAISE NOTICE 'JSON_TABLE test passed: % rows', result_count;
END;
$$;
 
-- Test MERGE with RETURNING
CREATE TEMP TABLE target (id INT PRIMARY KEY, val TEXT);
INSERT INTO target VALUES (1, 'old'), (3, 'keep');
 
WITH result AS (
  MERGE INTO target t
  USING (VALUES (1, 'new'), (2, 'insert')) AS s(id, val)
  ON t.id = s.id
  WHEN MATCHED THEN UPDATE SET val = s.val
  WHEN NOT MATCHED THEN INSERT VALUES (s.id, s.val)
  RETURNING merge_action() AS action, id, val
)
SELECT * FROM result;
-- Expected: (UPDATE, 1, 'new'), (INSERT, 2, 'insert'), existing (3, 'keep') unchanged

Future Outlook

PostgreSQL 17 cements the database's position as the leading open-source relational database. The SQL/JSON features bring it to parity with commercial databases, while the incremental backup support addresses one of the most significant operational pain points for large deployments.

Looking ahead, the PostgreSQL community is working on parallel query improvements, logical replication enhancements for partitioned tables, and native columnar storage for analytical workloads. The trend toward making PostgreSQL a universal database—handling both transactional and analytical workloads—continues to accelerate.

Conclusion

PostgreSQL 17 delivers transformative features that impact both developers and operations teams. The incremental backup support alone justifies an upgrade for teams managing large databases, while the SQL/JSON improvements dramatically simplify working with JSON data in SQL.

Key takeaways:

Incremental backups reduce backup time and storage by orders of magnitude for large databases
JSON_TABLE bridges the gap between JSON documents and relational queries with optimizer-aware processing
MERGE with RETURNING enables complex upsert patterns with full audit capability
Query optimizer improvements deliver automatic performance gains without application changes
Test your workloads against PostgreSQL 17 in staging before upgrading production
Plan your backup rotation strategy around the full + incremental backup chain
The SQL/JSON compliance makes PostgreSQL viable for workloads that previously required commercial databases

Minh Vo

Slaying code & making it lit fr fr 🔥 tagline