MinhVo

Minh Vo

rss feed

Slaying code & making it lit fr fr 🔥 tagline

Hey there 👋 I'm an AI Engineer with 7 years of experience building scalable web and mobile applications. Currently at Neurond AI (May 2025 — present), architecting an Enterprise AI Assistant Platform with multi-tenant RAG on pgvector, multi-provider LLM orchestration, and Azure-native infrastructure. Previously spent 5+ years at SNAPTEC (Sep 2019 — Apr 2025), leading SaaS themes, admin dashboards, and e-commerce platforms — earned the Hero of the Year award in 2021. I specialize in TypeScript, React, Next.js, and AI-Native engineering with Claude Code and Cursor.bio

Back to blogs

Database Performance: Connection Pooling and Query Caching

Optimize database performance: PgBouncer, query result caching, and prepared statements.

DatabasePerformanceConnection PoolingCaching

By MinhVo

Introduction

Database performance is often the bottleneck that limits application scalability. While developers spend significant time optimizing queries and indexing strategies, two of the most impactful optimizations—connection pooling and query caching—are frequently overlooked or improperly configured. Together, these techniques can reduce database load by orders of magnitude while dramatically improving response times.

Connection pooling eliminates the overhead of establishing new database connections for every request, reducing connection setup time from tens of milliseconds to microseconds. Query caching eliminates redundant database round-trips entirely by serving repeated queries from memory. When combined with prepared statements, which reduce query parsing overhead, these three techniques form a comprehensive database performance optimization strategy.

This guide provides practical implementation details for each technique, covering PgBouncer configuration for connection pooling, application-level caching strategies, and prepared statement optimization. You'll learn how to measure the impact of each optimization and how to configure them for maximum benefit in production environments.

Database performance optimization

Understanding Database Performance Layers

The Performance Stack

Database performance optimization operates at multiple layers. At the lowest level, hardware and OS tuning affect I/O throughput and memory allocation. At the database level, query optimization, indexing, and configuration affect how efficiently queries execute. At the application level, connection management, caching, and query patterns determine how effectively the application uses database resources.

Connection pooling and query caching operate at the application level, providing the highest return on investment for most applications. They require minimal changes to application code while delivering significant performance improvements.

Measuring Database Overhead

Before optimizing, measure where time is actually spent. A typical database request involves: connection establishment (10-50ms), query parsing and planning (0.1-5ms), query execution (variable), and result transfer (variable). Connection pooling eliminates the first component, caching eliminates all four for repeated queries, and prepared statements reduce the second.

The 80/20 Rule of Database Performance

In most applications, 80% of database load comes from 20% of queries. Identifying and optimizing these high-frequency queries through caching and prepared statements provides disproportionate performance gains. Monitoring tools like pg_stat_statements reveal which queries consume the most resources.

Performance optimization layers

Architecture and Design Patterns

Multi-Layer Caching Architecture

Production database caching typically involves multiple layers. Application-level caches (Redis, Memcached) store query results for the most frequently accessed data. Database-level caches (shared buffers, OS page cache) cache data pages and query plans. Connection pools cache established connections.

Each layer has different characteristics: application caches offer the fastest access but require explicit invalidation logic. Database caches are transparent but limited by memory. Connection pools reduce overhead but don't eliminate query execution.

Cache-Aside Pattern

The cache-aside pattern is the most common caching strategy. Application code first checks the cache for the requested data. On a cache miss, it queries the database, stores the result in cache, and returns it. On subsequent requests, the data is served directly from cache.

Write-Through and Write-Behind Patterns

For data that's both read and written, write-through caching updates both the cache and database simultaneously, ensuring consistency at the cost of write latency. Write-behind caching queues database writes, improving write performance at the risk of data loss if the cache fails.

Step-by-Step Implementation

PgBouncer Connection Pooling Setup

# pgbouncer.ini - Production configuration
[databases]
mydb = host=primary.db.internal port=5432 dbname=mydb
mydb_ro = host=replica.db.internal port=5432 dbname=mydb
 
[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = scram-sha-256
auth_file = /etc/pgbouncer/userlist.txt
 
# Pool sizing
pool_mode = transaction
default_pool_size = 25
min_pool_size = 5
reserve_pool_size = 5
reserve_pool_timeout = 3
max_client_conn = 2000
 
# Timeouts
server_lifetime = 3600
server_idle_timeout = 300
server_connect_timeout = 15
client_idle_timeout = 0
query_timeout = 30
query_wait_timeout = 120
 
# Monitoring
stats_period = 60
log_connections = 0
log_disconnections = 0
log_pooler_errors = 1
 
# Admin access
admin_users = pgbouncer_admin
stats_users = pgbouncer_stats

Application-Level Redis Caching

import Redis from 'ioredis';
import { Pool } from 'pg';
 
const redis = new Redis({
  host: 'redis.internal',
  port: 6379,
  maxRetriesPerRequest: 3,
  retryDelayOnFailover: 100,
});
 
const pool = new Pool({
  host: 'pgbouncer.internal',
  port: 6432,
  database: 'mydb',
  max: 20,
});
 
// Cache-aside pattern implementation
async function cachedQuery<T>(
  key: string,
  queryFn: () => Promise<T>,
  ttlSeconds: number = 300
): Promise<T> {
  const cached = await redis.get(key);
  if (cached) {
    return JSON.parse(cached);
  }
 
  const result = await queryFn();
  await redis.setex(key, ttlSeconds, JSON.stringify(result));
  return result;
}
 
// Usage
async function getUserById(userId: number) {
  return cachedQuery(
    `user:${userId}`,
    async () => {
      const result = await pool.query(
        'SELECT * FROM users WHERE id = $1',
        [userId]
      );
      return result.rows[0];
    },
    600 // 10 minute TTL
  );
}
 
// Cache invalidation on write
async function updateUser(userId: number, data: Partial<User>) {
  const result = await pool.query(
    'UPDATE users SET name = $1, email = $2 WHERE id = $3 RETURNING *',
    [data.name, data.email, userId]
  );
 
  // Invalidate cache
  await redis.del(`user:${userId}`);
 
  return result.rows[0];
}

Caching implementation workflow

Prepared Statements for Query Optimization

// PostgreSQL prepared statements reduce parsing overhead
import { Pool } from 'pg';
 
const pool = new Pool({
  host: 'pgbouncer.internal',
  port: 6432,
  database: 'mydb',
  max: 20,
  // Note: prepared statements require session pooling or PgBouncer 1.21+
  statement_timeout: 30000,
});
 
// Named prepared statement
async function findUsersByStatus(status: string, limit: number) {
  const result = await pool.query({
    name: 'find-users-by-status',
    text: `
      SELECT id, name, email, created_at
      FROM users
      WHERE status = $1
      ORDER BY created_at DESC
      LIMIT $2
    `,
    values: [status, limit],
  });
  return result.rows;
}
 
// Connection-level prepared statement (requires session pooling)
async function getActiveUserStats() {
  const client = await pool.connect();
  try {
    // Prepare once per connection
    await client.query({
      name: 'user-stats',
      text: `
        SELECT 
          status,
          COUNT(*) as count,
          MIN(created_at) as earliest,
          MAX(created_at) as latest
        FROM users
        WHERE created_at > $1
        GROUP BY status
      `,
    });
 
    // Execute with different parameters
    const result = await client.query({
      name: 'user-stats',
      values: [new Date(Date.now() - 86400000 * 30)], // Last 30 days
    });
 
    return result.rows;
  } finally {
    client.release();
  }
}

Query Result Caching with Invalidation

// Sophisticated cache with tag-based invalidation
class QueryCache {
  private redis: Redis;
  private defaultTTL: number;
 
  constructor(redis: Redis, defaultTTL: number = 300) {
    this.redis = redis;
    this.defaultTTL = defaultTTL;
  }
 
  async get<T>(key: string): Promise<T | null> {
    const data = await this.redis.get(key);
    return data ? JSON.parse(data) : null;
  }
 
  async set(key: string, value: any, ttl?: number): Promise<void> {
    await this.redis.setex(key, ttl || this.defaultTTL, JSON.stringify(value));
  }
 
  // Tag-based invalidation
  async setWithTags(key: string, value: any, tags: string[], ttl?: number): Promise<void> {
    const pipeline = this.redis.pipeline();
    pipeline.setex(key, ttl || this.defaultTTL, JSON.stringify(value));
    
    for (const tag of tags) {
      pipeline.sadd(`tag:${tag}`, key);
    }
    
    await pipeline.exec();
  }
 
  async invalidateByTag(tag: string): Promise<void> {
    const keys = await this.redis.smembers(`tag:${tag}`);
    if (keys.length > 0) {
      const pipeline = this.redis.pipeline();
      for (const key of keys) {
        pipeline.del(key);
      }
      pipeline.del(`tag:${tag}`);
      await pipeline.exec();
    }
  }
}
 
// Usage
const cache = new QueryCache(redis);
 
async function getProductsByCategory(categoryId: number) {
  const cacheKey = `products:category:${categoryId}`;
  
  return cache.get(cacheKey) || (async () => {
    const result = await pool.query(
      'SELECT * FROM products WHERE category_id = $1',
      [categoryId]
    );
    await cache.setWithTags(cacheKey, result.rows, [
      `category:${categoryId}`,
      'products'
    ], 600);
    return result.rows;
  })();
}
 
// Invalidate all product caches when a product is updated
async function updateProduct(productId: number, data: any) {
  await pool.query('UPDATE products SET ...', [/* params */]);
  await cache.invalidateByTag('products');
}

Real-World Use Cases

High-Traffic API Gateway

An API gateway handling 100,000 requests per second reduced its database connections from 5,000 to 100 using PgBouncer with transaction pooling. Combined with Redis caching for frequently accessed configuration data, the gateway's P99 latency dropped from 200ms to 15ms, and database CPU utilization decreased from 90% to 20%.

E-Commerce Product Catalog

An e-commerce platform with 500,000 products implemented a multi-layer caching strategy. Hot products (top 1% by traffic) were cached in application memory with 1-minute TTL. Warm products were cached in Redis with 5-minute TTL. Cold products were served from the database through PgBouncer. This approach reduced database queries by 95% while maintaining data freshness.

Real-Time Analytics Dashboard

An analytics dashboard displaying real-time metrics implemented a hybrid caching strategy. Counters and aggregates were cached in Redis with 10-second TTL for live data. Historical data was cached with 1-hour TTL. The underlying database queries were optimized with prepared statements and materialized views.

Best Practices for Production

1. Monitor Cache Hit Rates

Track cache hit ratio (hits / (hits + misses)) for each cached query. A hit rate below 80% suggests the cache isn't effective—either the TTL is too short, the cache keys are too specific, or the access pattern doesn't benefit from caching.

2. Implement Cache Invalidation Strategy

Cache invalidation is the hardest problem in caching. Use tag-based invalidation for related data, time-based TTL for data that can tolerate staleness, and explicit invalidation for critical data that must be consistent.

3. Set Appropriate TTLs Based on Data Freshness Requirements

Configuration data: 1 hour to 24 hours. User profiles: 5-15 minutes. Transactional data: 30 seconds to 5 minutes. Real-time data: 5-30 seconds. Never cache data that must be immediately consistent without explicit invalidation.

4. Use Connection Pooling for All Database Access

Every application that connects to a database should use connection pooling. Even low-traffic applications benefit from reduced connection overhead and protection against connection storms.

5. Separate Read and Write Pools

Use read replicas with separate connection pools for read-heavy workloads. This prevents read traffic from saturating connections needed for writes and enables horizontal scaling of read capacity.

6. Implement Circuit Breakers for Cache Failures

When the cache becomes unavailable, the application should fall back to direct database queries rather than failing entirely. Circuit breakers prevent cache failures from cascading into application failures.

7. Profile Query Performance Before Caching

Not all queries benefit from caching. Profile your queries to identify those with high execution cost or high frequency—these are the best candidates for caching.

8. Use Prepared Statements for Repeated Queries

Prepared statements reduce query parsing overhead for queries executed many times with different parameters. They're particularly effective for parameterized queries in OLTP workloads.

Common Pitfalls and Solutions

PitfallImpactSolution
No cache invalidationStale data served to usersImplement tag-based or event-driven invalidation
Cache stampede on TTL expirySudden database load spikeUse probabilistic early expiration or background refresh
Too many prepared statementsMemory exhaustion on databaseLimit statement cache size, use LRU eviction
Session pooling with PgBouncerPrepared statements failUse transaction pooling or PgBouncer 1.21+ with protocol-level support
Cache key collisionsWrong data returnedUse structured, unique cache keys with versioning
Caching write-heavy dataPoor cache hit rate, wasted resourcesOnly cache read-heavy data with stable access patterns

Performance Optimization

Monitoring Cache Effectiveness

// Cache metrics collection
interface CacheMetrics {
  hits: number;
  misses: number;
  hitRate: number;
  avgLatencyMs: number;
  keyCount: number;
  memoryUsageMB: number;
}
 
async function collectCacheMetrics(redis: Redis): Promise<CacheMetrics> {
  const info = await redis.info('stats');
  const memory = await redis.info('memory');
  
  const hits = parseInt(info.match(/keyspace_hits:(\d+)/)?.[1] || '0');
  const misses = parseInt(info.match(/keyspace_misses:(\d+)/)?.[1] || '0');
  
  return {
    hits,
    misses,
    hitRate: hits / (hits + misses),
    avgLatencyMs: 0, // Measure with custom instrumentation
    keyCount: await redis.dbsize(),
    memoryUsageMB: parseInt(memory.match(/used_memory:(\d+)/)?.[1] || '0') / 1024 / 1024,
  };
}

Connection Pool Monitoring

-- PgBouncer monitoring queries
SHOW POOLS;  -- View pool status
SHOW STATS;  -- View connection statistics
SHOW CLIENTS; -- View connected clients
SHOW SERVERS; -- View server connections
 
-- Key metrics to monitor:
-- cl_active: Active client connections
-- sv_active: Active server connections  
-- cl_waiting: Clients waiting for connection
-- sv_login: Server connections being established
-- avg_query_time: Average query execution time

Query Plan Caching

-- PostgreSQL shared plan cache
EXPLAIN (ANALYZE, BUFFERS) 
SELECT * FROM users WHERE email = $1;
 
-- Monitor plan cache effectiveness
SELECT 
    query,
    calls,
    total_exec_time,
    mean_exec_time,
    rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;

Comparison with Alternatives

FeatureRedis CacheMemcachedApplication CachePgBouncer
TypeDistributedDistributedIn-processConnection pool
PersistenceOptional (RDB/AOF)NoNoNo
Data StructuresRich (lists, sets, sorted sets)Simple key-valueAnyN/A
Max SizeLimited by memoryLimited by memoryLimited by JVM/process memoryN/A
ClusteringBuilt-inClient-sideN/AExternal
Best ForComplex caching, pub/subSimple key-value cachingUltra-low latencyDatabase connections

Advanced Patterns

Cache Warming

// Pre-populate cache with frequently accessed data
async function warmCache(): Promise<void> {
  console.log('Starting cache warm-up...');
  
  // Warm product categories
  const categories = await pool.query('SELECT * FROM categories WHERE active = true');
  for (const cat of categories.rows) {
    await redis.setex(`category:${cat.id}`, 3600, JSON.stringify(cat));
  }
  
  // Warm popular products
  const popularProducts = await pool.query(`
    SELECT p.* FROM products p
    JOIN product_views pv ON p.id = pv.product_id
    WHERE pv.viewed_at > NOW() - INTERVAL '7 days'
    GROUP BY p.id
    ORDER BY COUNT(*) DESC
    LIMIT 1000
  `);
  
  for (const product of popularProducts.rows) {
    await redis.setex(`product:${product.id}`, 1800, JSON.stringify(product));
  }
  
  console.log(`Warmed ${categories.rows.length} categories and ${popularProducts.rows.length} products`);
}

Cache-Aside with Stale-While-Revalidate

// Serve stale data while refreshing cache in background
async function staleWhileRevalidate<T>(
  key: string,
  fetchFn: () => Promise<T>,
  ttl: number,
  staleTtl: number
): Promise<T> {
  const cached = await redis.get(key);
  
  if (cached) {
    const data = JSON.parse(cached);
    
    // Check if data is stale but still usable
    const age = await redis.ttl(key);
    if (age < 0) {
      // Data expired, fetch fresh
      const fresh = await fetchFn();
      await redis.setex(key, ttl, JSON.stringify(fresh));
      return fresh;
    }
    
    // If TTL is less than stale threshold, refresh in background
    if (age < staleTtl) {
      fetchFn().then(fresh => redis.setex(key, ttl, JSON.stringify(fresh)));
    }
    
    return data;
  }
  
  // Cache miss
  const fresh = await fetchFn();
  await redis.setex(key, ttl, JSON.stringify(fresh));
  return fresh;
}

Testing Strategies

// Test cache behavior
describe('Query Cache', () => {
  let cache: QueryCache;
  let mockRedis: Redis;
 
  beforeEach(() => {
    mockRedis = new MockRedis();
    cache = new QueryCache(mockRedis, 60);
  });
 
  it('should return cached data on hit', async () => {
    await mockRedis.setex('test:key', 60, JSON.stringify({ id: 1 }));
    const result = await cache.get('test:key');
    expect(result).toEqual({ id: 1 });
  });
 
  it('should return null on miss', async () => {
    const result = await cache.get('nonexistent');
    expect(result).toBeNull();
  });
 
  it('should invalidate by tag', async () => {
    await cache.setWithTags('key1', { a: 1 }, ['tag1']);
    await cache.setWithTags('key2', { b: 2 }, ['tag1', 'tag2']);
    await cache.invalidateByTag('tag1');
    
    expect(await cache.get('key1')).toBeNull();
    expect(await cache.get('key2')).toBeNull();
  });
});

Future Outlook

Database performance optimization is evolving toward intelligent, adaptive systems. AI-driven caching that automatically determines optimal TTLs and invalidation strategies based on access patterns is an emerging trend. Cloud-native databases like Aurora and AlloyDB integrate caching at the storage layer, providing transparent performance improvements without application changes.

The convergence of connection pooling and caching is also advancing. Tools like Supabase's Supavisor combine connection multiplexing with query result caching. Edge computing pushes caching closer to users, reducing both database load and network latency.

Conclusion

Connection pooling, query caching, and prepared statements form a comprehensive database performance optimization strategy. PgBouncer eliminates connection overhead, Redis caching eliminates redundant queries, and prepared statements reduce parsing overhead. Together, they can reduce database load by 90% or more while improving response times by orders of magnitude.

Start by implementing connection pooling—it's the lowest-risk, highest-impact optimization. Add query caching for your most frequently accessed data, and use prepared statements for parameterized queries. Monitor cache hit rates, connection utilization, and query performance to continuously optimize your configuration. With these techniques in place, your database will scale to handle significantly more traffic with lower resource consumption.