MinhVo

Minh Vo

rss feed

Slaying code & making it lit fr fr 🔥 tagline

Hey there 👋 I'm an AI Engineer with 7 years of experience building scalable web and mobile applications. Currently at Neurond AI (May 2025 — present), architecting an Enterprise AI Assistant Platform with multi-tenant RAG on pgvector, multi-provider LLM orchestration, and Azure-native infrastructure. Previously spent 5+ years at SNAPTEC (Sep 2019 — Apr 2025), leading SaaS themes, admin dashboards, and e-commerce platforms — earned the Hero of the Year award in 2021. I specialize in TypeScript, React, Next.js, and AI-Native engineering with Claude Code and Cursor.bio

Back to blogs

Database Sharding: Horizontal Scaling Strategies

Implement database sharding: shard keys, routing, cross-shard queries, and rebalancing.

DatabaseShardingScalingArchitecture

By MinhVo

Introduction

When your single database instance hits the wall—CPU at 80%, disk I/O saturated, and queries queuing up—you've exhausted vertical scaling. Database sharding distributes your data across multiple database instances, unlocking horizontal scalability that can grow with your application indefinitely.

But sharding is not a free lunch. It introduces distributed systems complexity: cross-shard queries become expensive joins across network boundaries, transactions spanning multiple shards require distributed coordination, and rebalancing data when adding shards is an operational minefield. Instagram famously runs on sharded PostgreSQL, handling over 1 billion queries per second across thousands of shards. Getting the strategy right from the start saves years of painful migrations.

This guide covers shard key selection, routing strategies, cross-shard query patterns, and the rebalancing techniques that keep your system running as data grows.

Horizontal scaling architecture

Understanding Sharding: Core Concepts

Sharding partitions your dataset into smaller, independent subsets called shards. Each shard is a complete database instance containing a subset of the total data. The critical design decisions are: what to shard on, how to route queries, and how to handle data that spans multiple shards.

Shard Key Selection

The shard key determines how data is distributed across shards. Choosing the wrong key can create hot shards (uneven data distribution) or make common queries expensive (requiring scatter-gather across all shards).

Good shard keys have these properties:

  • High cardinality: Many distinct values ensure even distribution
  • Even distribution: Values should be roughly uniformly distributed
  • Query locality: Most queries should target a single shard
  • Immutability: The shard key value should not change (re-sharding is expensive)
-- Good: tenant_id for multi-tenant SaaS (high cardinality, excellent locality)
-- All queries for a tenant hit one shard
CREATE TABLE orders (
  id UUID PRIMARY KEY,
  tenant_id UUID NOT NULL,
  user_id UUID NOT NULL,
  total DECIMAL(10,2),
  created_at TIMESTAMPTZ DEFAULT now()
) PARTITION BY HASH (tenant_id);
 
-- Create 16 hash partitions (shards)
CREATE TABLE orders_p0 PARTITION OF orders FOR VALUES WITH (MODULUS 16, REMAINDER 0);
CREATE TABLE orders_p1 PARTITION OF orders FOR VALUES WITH (MODULUS 16, REMAINDER 1);
-- ... up to orders_p15
 
-- Bad: created_at as shard key (sequential inserts create hot shard on latest partition)
-- Bad: country_code as shard key (US gets 40% of data, tiny countries get almost nothing)

Hash-Based vs Range-Based Sharding

Hash-based sharding applies a hash function to the shard key, distributing data uniformly across shards. It prevents hot spots but makes range queries expensive since consecutive values may land on different shards.

Range-based sharding assigns contiguous key ranges to each shard. This preserves range locality (great for time-series or alphabetical data) but can create hot spots if access patterns are skewed.

// Hash-based shard routing
import { createHash } from 'crypto';
 
function getShardId(key: string, numShards: number): number {
  const hash = createHash('md5').update(key).digest();
  // Use first 4 bytes for better distribution than modulo
  return hash.readUInt32BE(0) % numShards;
}
 
// Range-based shard routing (e.g., user ID ranges)
function getShardIdByRange(userId: string, ranges: Array<{ shard: number; min: string; max: string }>): number {
  for (const range of ranges) {
    if (userId >= range.min && userId <= range.max) {
      return range.shard;
    }
  }
  throw new Error(`No shard found for userId: ${userId}`);
}
 
// Shard configuration
const SHARD_CONFIG = {
  numShards: 16,
  shards: Array.from({ length: 16 }, (_, i) => ({
    id: i,
    host: `db-shard-${i}.internal`,
    port: 5432,
  })),
};

Sharding strategies comparison

Architecture: Routing and Query Patterns

Application-Level Routing

The most common approach is application-level routing, where the application determines which shard to query based on the shard key in the request context.

import { Pool } from 'pg';
 
class ShardRouter {
  private pools: Map<number, Pool> = new Map();
 
  constructor(config: typeof SHARD_CONFIG) {
    for (const shard of config.shards) {
      this.pools.set(shard.id, new Pool({
        host: shard.host,
        port: shard.port,
        database: 'myapp',
        max: 10,
      }));
    }
  }
 
  getPool(shardKey: string): Pool {
    const shardId = getShardId(shardKey, SHARD_CONFIG.numShards);
    return this.pools.get(shardId)!;
  }
 
  async queryOnShard(shardKey: string, sql: string, params?: any[]) {
    const pool = this.getPool(shardKey);
    return pool.query(sql, params);
  }
 
  // Scatter-gather: query ALL shards and merge results
  async scatterGather(sql: string, params?: any[]): Promise<any[]> {
    const results = await Promise.all(
      Array.from(this.pools.values()).map(pool => pool.query(sql, params))
    );
    return results.flatMap(r => r.rows);
  }
}
 
const router = new ShardRouter(SHARD_CONFIG);
 
// Single-shard query (fast)
const userOrders = await router.queryOnShard(
  userId,
  'SELECT * FROM orders WHERE user_id = $1',
  [userId]
);
 
// Cross-shard query (expensive - avoid when possible)
const allRecentOrders = await router.scatterGather(
  'SELECT * FROM orders WHERE created_at > $1 ORDER BY created_at DESC LIMIT 100',
  [new Date(Date.now() - 86400000)]
);

Proxy-Level Routing

Tools like Vitess, ProxySQL, or Citus handle routing at the database proxy layer, making sharding transparent to the application.

Step-by-Step Implementation

Implementing Consistent Hashing for Elastic Sharding

Simple modulo hashing (hash(key) % numShards) has a catastrophic property: when you add or remove a shard, nearly all keys need to be remapped. Consistent hashing minimizes data movement to approximately 1/N of keys when adding a shard.

import { createHash } from 'crypto';
 
class ConsistentHashRing {
  private ring: Map<number, string> = new Map();
  private sortedKeys: number[] = [];
  private virtualNodes: number;
 
  constructor(nodes: string[], virtualNodes: number = 150) {
    this.virtualNodes = virtualNodes;
    for (const node of nodes) {
      this.addNode(node);
    }
  }
 
  addNode(node: string): void {
    for (let i = 0; i < this.virtualNodes; i++) {
      const key = this.hash(`${node}:vn${i}`);
      this.ring.set(key, node);
      this.sortedKeys.push(key);
    }
    this.sortedKeys.sort((a, b) => a - b);
  }
 
  removeNode(node: string): void {
    for (let i = 0; i < this.virtualNodes; i++) {
      const key = this.hash(`${node}:vn${i}`);
      this.ring.delete(key);
      this.sortedKeys = this.sortedKeys.filter(k => k !== key);
    }
  }
 
  getNode(key: string): string {
    if (this.ring.size === 0) throw new Error('No nodes in ring');
    const hash = this.hash(key);
    // Binary search for the first node clockwise from the key
    let lo = 0, hi = this.sortedKeys.length - 1;
    while (lo <= hi) {
      const mid = (lo + hi) >> 1;
      if (this.sortedKeys[mid] < hash) lo = mid + 1;
      else hi = mid - 1;
    }
    const ringKey = this.sortedKeys[lo % this.sortedKeys.length];
    return this.ring.get(ringKey)!;
  }
 
  private hash(key: string): number {
    return createHash('md5').update(key).digest().readUInt32BE(0);
  }
}
 
// Usage
const ring = new ConsistentHashRing(['shard-0', 'shard-1', 'shard-2', 'shard-3']);
console.log(ring.getNode('user-12345')); // e.g., "shard-2"
 
// Adding a new shard only moves ~25% of keys
ring.addNode('shard-4');

Consistent hashing ring visualization

Real-World Use Cases

Use Case 1: Multi-Tenant SaaS with Tenant-Based Sharding

A B2B SaaS platform with 10,000 tenants shards by tenant_id. Each tenant's data lives on one shard, making all tenant-scoped queries single-shard operations. Large tenants can be moved to dedicated shards.

async function moveTenant(tenantId: string, fromShard: number, toShard: number) {
  const sourcePool = router.getPool(String(tenantId));
  const targetPool = router.getPool(String(tenantId)); // After re-routing
 
  // 1. Lock tenant for writes
  await sourcePool.query(`UPDATE tenants SET migration_lock = true WHERE id = $1`, [tenantId]);
 
  // 2. Copy all tenant data in a consistent snapshot
  const tables = ['orders', 'users', 'products', 'invoices'];
  for (const table of tables) {
    const rows = await sourcePool.query(`SELECT * FROM ${table} WHERE tenant_id = $1`, [tenantId]);
    if (rows.rows.length > 0) {
      const columns = Object.keys(rows.rows[0]).join(', ');
      const placeholders = rows.rows[0].map((_, i) => `$${i + 1}`).join(', ');
      for (const row of rows.rows) {
        await targetPool.query(`INSERT INTO ${table} (${columns}) VALUES (${placeholders})`, Object.values(row));
      }
    }
  }
 
  // 3. Update routing metadata
  await updateShardMap(tenantId, toShard);
 
  // 4. Unlock tenant
  await sourcePool.query(`UPDATE tenants SET migration_lock = false WHERE id = $1`, [tenantId]);
}

Use Case 2: Social Media with User-Based Sharding

A social media platform shards by user_id. User profiles, posts, and activity are all on the same shard. The challenge is the "follow" relationship—when user A follows user B, the follow record must be stored on both users' shards.

// Dual-write pattern for cross-shard relationships
async function followUser(followerId: string, followeeId: string) {
  const followerShard = router.getPool(followerId);
  const followeeShard = router.getPool(followeeId);
 
  if (followerShard === followeeShard) {
    // Same shard: single transaction
    await followerShard.query('BEGIN');
    await followerShard.query(
      'INSERT INTO follows (follower_id, followee_id) VALUES ($1, $2)',
      [followerId, followeeId]
    );
    await followerShard.query(
      'UPDATE users SET following_count = following_count + 1 WHERE id = $1',
      [followerId]
    );
    await followerShard.query(
      'UPDATE users SET follower_count = follower_count + 1 WHERE id = $1',
      [followeeId]
    );
    await followerShard.query('COMMIT');
  } else {
    // Cross-shard: use saga pattern
    await followeeShard.query(
      'INSERT INTO followers (user_id, follower_id) VALUES ($1, $2)',
      [followeeId, followerId]
    );
    await followerShard.query(
      'INSERT INTO following (user_id, followee_id) VALUES ($1, $2)',
      [followerId, followeeId]
    );
  }
}

Use Case 3: E-Commerce with Order-Based Sharding

An e-commerce platform shards orders by order_id using consistent hashing. The challenge is that users need to see all their orders, which may span multiple shards.

// Global secondary index for user orders
async function getUserOrders(userId: string): Promise<any[]> {
  // Step 1: Query the global index (single shard, lightweight)
  const indexResult = await router.queryOnShard(
    userId,
    'SELECT order_id, shard_id FROM user_order_index WHERE user_id = $1 ORDER BY created_at DESC',
    [userId]
  );
 
  // Step 2: Group by shard and batch-fetch
  const byShard = new Map<number, string[]>();
  for (const row of indexResult.rows) {
    const shardId = row.shard_id;
    if (!byShard.has(shardId)) byShard.set(shardId, []);
    byShard.get(shardId)!.push(row.order_id);
  }
 
  // Step 3: Parallel fetch from each shard
  const results = await Promise.all(
    Array.from(byShard.entries()).map(([shardId, orderIds]) =>
      pools.get(shardId)!.query('SELECT * FROM orders WHERE id = ANY($1)', [orderIds])
    )
  );
 
  return results.flatMap(r => r.rows);
}

Best Practices for Production

  1. Choose your shard key carefully—it's nearly impossible to change later: Spend weeks evaluating candidates. The wrong shard key will haunt you for years.
  2. Design for single-shard queries: 95%+ of your queries should hit a single shard. Cross-shard queries should be rare and well-optimized.
  3. Maintain a global index for cross-shard lookups: A lightweight index mapping secondary keys to shard IDs enables efficient cross-shard access without scatter-gather.
  4. Use connection pooling aggressively: Each shard needs its own connection pool. With 16 shards and 20 connections each, that's 320 connections—plan accordingly.
  5. Implement circuit breakers per shard: One slow shard shouldn't cascade failures to the entire application.
  6. Monitor shard balance: Track data volume and query load per shard. Alert when any shard deviates more than 20% from the average.
  7. Plan for rebalancing from day one: Even with consistent hashing, you'll need to move data. Build tooling for live migration before you need it.
  8. Avoid distributed transactions when possible: Use saga patterns, eventual consistency, or application-level compensation instead of 2PC.

Common Pitfalls and Solutions

PitfallImpactSolution
Hot shard from skewed dataOne shard overwhelmed while others idleUse consistent hashing with virtual nodes
Cross-shard joinsNetwork-bound queries, high latencyDenormalize data or use global indexes
Distributed transactionsPerformance bottleneck, partial failuresUse saga pattern or eventual consistency
Shard exhaustionCan't add more shards without downtimeUse consistent hashing from the start
Cascading shard failureOne shard down affects all queriesCircuit breakers and graceful degradation

Performance Optimization

// Connection pool configuration per shard
const poolConfig = {
  max: 20,                    // Max connections per shard
  idleTimeoutMillis: 30000,   // Close idle connections after 30s
  connectionTimeoutMillis: 3000,
  statement_timeout: 10000,   // Kill queries after 10s
};
 
// Query caching for hot single-shard queries
import NodeCache from 'node-cache';
const queryCache = new NodeCache({ stdTTL: 60, maxKeys: 10000 });
 
async function cachedQuery(shardKey: string, sql: string, params: any[]) {
  const cacheKey = `${shardKey}:${sql}:${JSON.stringify(params)}`;
  const cached = queryCache.get(cacheKey);
  if (cached) return cached;
 
  const result = await router.queryOnShard(shardKey, sql, params);
  queryCache.set(cacheKey, result.rows);
  return result.rows;
}

Comparison with Alternatives

FeatureApplication ShardingCitus/VitessNoSQL (Cassandra)NewSQL (CockroachDB)
SQL supportFullFullLimitedFull
Cross-shard queriesManualTransparentLimitedTransparent
RebalancingManualAutomatedAutomatedAutomated
Transaction supportPer-shardCross-shardLimitedFull ACID
Operational complexityHighMediumLowLow

Testing Strategies

describe('Sharding Tests', () => {
  test('data distributes evenly across shards', async () => {
    const shardCounts = new Map<number, number>();
    for (let i = 0; i < 10000; i++) {
      const shardId = getShardId(`tenant-${i}`, 16);
      shardCounts.set(shardId, (shardCounts.get(shardId) || 0) + 1);
    }
    // Each shard should have ~625 ± 15% (750)
    for (const [shard, count] of shardCounts) {
      expect(count).toBeGreaterThan(500);
      expect(count).toBeLessThan(750);
    }
  });
 
  test('consistent hashing minimizes key movement on shard addition', async () => {
    const ring = new ConsistentHashRing(['s0', 's1', 's2', 's3']);
    const before = new Map<string, string>();
    for (let i = 0; i < 10000; i++) {
      before.set(`key-${i}`, ring.getNode(`key-${i}`));
    }
 
    ring.addNode('s4');
    let moved = 0;
    for (let i = 0; i < 10000; i++) {
      if (before.get(`key-${i}`) !== ring.getNode(`key-${i}`)) moved++;
    }
    // Should move ~20% (1/5), not 80%+ like modulo hashing
    expect(moved / 10000).toBeLessThan(0.3);
  });
});

Future Outlook

The trend is toward transparent sharding—systems like CockroachDB, TiDB, and YugabyteDB automatically shard data without application awareness. PostgreSQL's declarative partitioning continues to improve, with better partition pruning and parallel query execution. Vitess has become the standard for MySQL sharding at scale, used by YouTube, Slack, and HubSpot.

Database Performance Tuning

Database performance tuning is an iterative process that requires understanding your query patterns, data distribution, and access frequencies. Small improvements in query performance can have outsized impacts on application responsiveness, especially for frequently executed queries.

Query Plan Analysis

Always analyze query execution plans before optimizing. In PostgreSQL, use EXPLAIN ANALYZE to see both the planned and actual execution statistics:

EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT u.name, COUNT(o.id) as order_count
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
WHERE u.created_at > '2024-01-01'
GROUP BY u.id, u.name
ORDER BY order_count DESC
LIMIT 20;

Look for sequential scans on large tables (indicating missing indexes), high cost estimates for nested loops (suggesting join strategy issues), and excessive buffer reads (indicating insufficient memory allocation or poor data locality).

Index Optimization Strategy

Creating the right indexes requires understanding your query patterns. Use pg_stat_user_tables and pg_stat_user_indexes to identify tables with high sequential scan ratios and unused indexes:

-- Find tables with high sequential scan ratio
SELECT schemaname, relname, seq_scan, idx_scan,
       CASE WHEN seq_scan + idx_scan > 0
            THEN round(100.0 * idx_scan / (seq_scan + idx_scan), 1)
            ELSE 0 END as idx_usage_pct
FROM pg_stat_user_tables
WHERE seq_scan + idx_scan > 100
ORDER BY idx_usage_pct ASC;
 
-- Find unused indexes consuming disk space
SELECT indexrelname, idx_scan, pg_size_pretty(pg_relation_size(indexrelid))
FROM pg_stat_user_indexes
WHERE idx_scan < 10 AND indexrelname NOT LIKE '%pkey%'
ORDER BY pg_relation_size(indexrelid) DESC;

Connection Pooling Best Practices

Proper connection pooling prevents connection exhaustion under load. Configure your pool size based on your CPU core count and query characteristics:

Optimal pool size = (CPU cores * 2) + effective_spindle_count

For modern SSD-based systems, this typically means a pool size of 2 * CPU_cores + 1. Setting the pool too large wastes memory and increases context switching overhead, while setting it too small causes query queuing under load.

Community Resources and Further Learning

The technology landscape evolves rapidly, making continuous learning essential for maintaining expertise. Building a systematic approach to staying current with developments in your technology stack ensures you can leverage new features and avoid deprecated patterns.

Curated Learning Pathways

Rather than consuming content randomly, create structured learning pathways aligned with your current projects and career goals. Start with official documentation and specification documents, which provide the most accurate and comprehensive information. Follow this with hands-on tutorials and workshops that reinforce concepts through practical application.

Technical blogs from framework maintainers and core team members often provide deeper insights into design decisions and upcoming features. Subscribe to the official blogs of your primary frameworks and libraries to stay ahead of breaking changes and deprecation timelines.

Contributing to Open Source

Contributing to open-source projects in your technology stack provides unparalleled learning opportunities. Start with documentation improvements and bug reports, then progress to fixing small issues tagged as "good first issue" in your favorite projects. This direct engagement with maintainers and the codebase accelerates your understanding far beyond what passive learning can achieve.

# Setting up for contribution
git clone https://github.com/project/repository.git
cd repository
git checkout -b fix/issue-description
 
# Run the project's contribution setup
npm run setup:dev
npm run test  # Ensure tests pass before making changes
 
# Make your changes, then run the full test suite
npm run test:full
npm run lint
npm run build
 
# Submit your contribution
git add -A
git commit -m "fix: description of the fix
 
Closes #1234"
git push origin fix/issue-description

Building a Technical Knowledge Base

Maintain a personal knowledge base that captures insights, solutions, and patterns you discover during your work. Tools like Obsidian, Notion, or even a simple Markdown repository can serve as an external memory that grows more valuable over time.

Organize your notes by topic rather than chronologically, and include code examples, links to relevant documentation, and explanations of why certain approaches work better than others. When you encounter a particularly insightful article or conference talk, write a summary that captures the key takeaways and how they apply to your current projects.

Follow key conferences and their published talks to stay informed about emerging patterns and best practices. Many conferences publish recorded talks on YouTube within weeks of the event, making world-class technical content freely accessible.

Join relevant Discord servers, Slack communities, and forums where practitioners discuss real-world challenges and solutions. These communities provide early warning about emerging issues and access to collective wisdom that isn't available through formal documentation.

Mentorship and Knowledge Sharing

Teaching others is one of the most effective ways to deepen your own understanding. Consider writing technical blog posts, giving talks at local meetups, or mentoring junior developers. The process of explaining concepts to others forces you to organize your knowledge and identify gaps in your understanding.

Pair programming sessions with colleagues of different experience levels create mutual learning opportunities. Senior developers gain fresh perspectives on problems they've solved the same way for years, while junior developers benefit from exposure to production-grade thinking and decision-making processes.

Conclusion

Database sharding unlocks horizontal scalability but demands careful architectural planning. Key takeaways:

  1. Shard key selection is the most consequential decision—get it wrong and everything downstream suffers
  2. Design for single-shard queries; cross-shard operations should be the exception, not the rule
  3. Use consistent hashing to make future shard additions non-disruptive
  4. Build operational tooling for migration, monitoring, and rebalancing before you need it
  5. Consider transparent sharding solutions (Citus, CockroachDB) to reduce application complexity

Start by modeling your data access patterns. If 95%+ of queries can be scoped to a single shard key, sharding is a strong choice. Otherwise, consider read replicas, caching, or a transparent sharding database.