Introduction to Apache Cassandra for High Availability

Introduction

Apache Cassandra is a distributed NoSQL database designed for high availability, massive scalability, and fault tolerance. Originally developed at Facebook for their inbox search feature, Cassandra has become the database of choice for companies that need to handle massive amounts of data across multiple data centers with zero downtime. Companies like Netflix, Apple, Instagram, and Uber rely on Cassandra to manage petabytes of data across thousands of nodes.

Unlike traditional relational databases, which prioritize ACID transactions and typically route writes through a single primary node, Cassandra uses a masterless, peer-to-peer architecture in which every node plays the same role. There is no single point of failure: if a node goes down, the remaining replicas keep serving requests. Data is automatically replicated across multiple nodes and data centers, so your application can remain available during hardware failures or network partitions.

Cassandra's data model is based on a partitioned row store with tunable consistency. This means you can choose your own trade-off between consistency and availability for each operation, making it suitable for a wide range of use cases from real-time analytics to IoT data storage. In this comprehensive guide, we'll explore Cassandra's architecture, data modeling techniques, replication strategies, and production best practices.

Understanding Cassandra: Core Concepts

Architecture Overview

Cassandra's architecture is fundamentally different from traditional databases. It uses a masterless ring topology where every node is equal and can handle read and write requests. This design eliminates single points of failure and allows for linear scalability—add more nodes to increase capacity.

Key architectural components include:

Node: A single Cassandra instance running on a machine
Data Center: A collection of related nodes, typically in the same geographic region
Cluster: The complete set of nodes across all data centers
Commit Log: Write-ahead log for crash recovery
Memtable: In-memory data structure for writes
SSTable: Sorted String Table, immutable on-disk storage

# Check cluster status with nodetool
nodetool status
 
# Output:
# Datacenter: dc1
# ===============
# Status=Up/Down
# |/ State=Normal/Leaving/Joining/Moving
# --  Address       Load       Tokens  Owns   Host ID   Rack
# UN  10.0.0.1     1.2 GB     256     33.3%  abc123    rack1
# UN  10.0.0.2     1.1 GB     256     33.3%  def456    rack1
# UN  10.0.0.3     1.3 GB     256     33.4%  ghi789    rack1
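
To make the roles of the commit log, memtable, and SSTable more concrete, here is a toy in-memory model of the write path. It is only a sketch: `ToyNode`, `flushThreshold`, and the array-based "disk" are hypothetical names for illustration, not Cassandra internals.

```typescript
// Toy model of the write path: commit log -> memtable -> SSTable.
// All names are hypothetical; real Cassandra is far more involved.
type Row = { key: string; value: string };

class ToyNode {
  commitLog: Row[] = [];                 // append-only log for crash recovery
  memtable = new Map<string, string>();  // in-memory writes
  sstables: Row[][] = [];                // immutable, sorted "on-disk" files
  flushThreshold: number;

  constructor(flushThreshold: number) {
    this.flushThreshold = flushThreshold;
  }

  write(key: string, value: string): void {
    this.commitLog.push({ key, value }); // 1. append to the commit log
    this.memtable.set(key, value);       // 2. apply to the memtable
    if (this.memtable.size >= this.flushThreshold) this.flush();
  }

  // 3. Flush: persist the memtable as a sorted, immutable SSTable
  flush(): void {
    const rows = Array.from(this.memtable.entries())
      .map(([key, value]) => ({ key, value }))
      .sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
    this.sstables.push(rows);
    this.memtable.clear();
    this.commitLog = [];                 // flushed data no longer needs the log
  }

  // Reads check the memtable first, then SSTables from newest to oldest
  read(key: string): string | undefined {
    if (this.memtable.has(key)) return this.memtable.get(key);
    for (let i = this.sstables.length - 1; i >= 0; i--) {
      const hit = this.sstables[i].find(r => r.key === key);
      if (hit) return hit.value;
    }
    return undefined;
  }
}
```

Because SSTables are never modified in place, a newer value for the same key simply shadows the older one until compaction merges the files.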

Data Model

Cassandra's data model is organized around tables with a primary key that determines data distribution and ordering. The primary key consists of a partition key (determines which node stores the data) and optional clustering columns (determine sort order within the partition).

-- Cassandra Query Language (CQL) example
CREATE TABLE users (
  user_id UUID,
  email TEXT,
  name TEXT,
  created_at TIMESTAMP,
  PRIMARY KEY (user_id)
);
 
-- Composite primary key with clustering
CREATE TABLE user_posts (
  user_id UUID,
  post_id TIMEUUID,
  title TEXT,
  content TEXT,
  created_at TIMESTAMP,
  PRIMARY KEY (user_id, post_id)
) WITH CLUSTERING ORDER BY (post_id DESC);
 
-- Multi-column partition key
CREATE TABLE sensor_data (
  sensor_id TEXT,
  date TEXT,
  event_time TIMESTAMP,
  value DOUBLE,
  PRIMARY KEY ((sensor_id, date), event_time)
) WITH CLUSTERING ORDER BY (event_time DESC);
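
One way to picture how the primary key shapes storage is as a map from partition key to a list of rows kept sorted by the clustering column. The sketch below models the sensor_data table in memory; `PartitionedTable` and its methods are hypothetical names used only for illustration, and numbers stand in for timestamps.

```typescript
// In-memory picture of PRIMARY KEY ((sensor_id, date), event_time)
// with CLUSTERING ORDER BY (event_time DESC). Illustrative only.
interface Reading { sensorId: string; date: string; eventTime: number; value: number }

class PartitionedTable {
  private partitions = new Map<string, Reading[]>();

  // The composite partition key decides which partition a row lands in
  private static partitionKey(sensorId: string, date: string): string {
    return `${sensorId}|${date}`;
  }

  insert(r: Reading): void {
    const key = PartitionedTable.partitionKey(r.sensorId, r.date);
    const rows = this.partitions.get(key) ?? [];
    // Same partition and same clustering value means the write is an upsert
    const kept = rows.filter(x => x.eventTime !== r.eventTime);
    kept.push(r);
    kept.sort((a, b) => b.eventTime - a.eventTime); // newest first (DESC)
    this.partitions.set(key, kept);
  }

  // A query that names the full partition key reads a single partition
  select(sensorId: string, date: string, limit: number): Reading[] {
    const key = PartitionedTable.partitionKey(sensorId, date);
    return (this.partitions.get(key) ?? []).slice(0, limit);
  }
}
```

The descending sort is why a "latest N readings" query can stop after the first N rows of the partition.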

Consistency Levels

Cassandra offers tunable consistency, allowing you to choose the right balance between availability and consistency for each operation:

Level	Description	Use Case
ONE	Acknowledged by 1 replica	High throughput, eventual consistency OK
TWO	Acknowledged by 2 replicas	Balance of speed and consistency
QUORUM	Majority of replicas	Strong consistency for most operations
ALL	All replicas	Strongest consistency, lowest availability
LOCAL_QUORUM	Majority in local DC	Multi-DC with local consistency
EACH_QUORUM	Majority in each DC	Strong multi-DC consistency

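-- CQL has no USING CONSISTENCY clause; consistency is set per request by the
-- driver, or per session in cqlsh with the CONSISTENCY command shown here.
CONSISTENCY QUORUM;
INSERT INTO users (user_id, email, name)
VALUES (uuid(), 'alice@example.com', 'Alice');
 
-- Read with LOCAL_QUORUM for multi-DC (? is a bind marker for a prepared statement)
CONSISTENCY LOCAL_QUORUM;
SELECT * FROM users WHERE user_id = ?;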
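
The arithmetic behind these levels is small enough to write down. The helpers below are a minimal sketch with hypothetical names (`quorum`, `overlaps`, `tolerableFailures`); they compute how many replicas a quorum needs and whether a read level and a write level are guaranteed to share at least one replica.

```typescript
// Replicas that must respond for QUORUM at a given replication factor (RF)
function quorum(rf: number): number {
  return Math.floor(rf / 2) + 1;
}

// A read is guaranteed to touch a replica that saw the latest acknowledged
// write when the read and write replica counts satisfy R + W > RF.
function overlaps(readReplicas: number, writeReplicas: number, rf: number): boolean {
  return readReplicas + writeReplicas > rf;
}

// How many replicas can be down while the level can still be satisfied
function tolerableFailures(requiredReplicas: number, rf: number): number {
  return rf - requiredReplicas;
}
```

With a replication factor of 3, QUORUM needs 2 replicas, so QUORUM reads combined with QUORUM writes overlap and still tolerate one replica being down, while ONE plus ONE does not overlap.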

Architecture and Design Patterns

Replication Strategies

Cassandra offers two replication strategies:

SimpleStrategy — Places replicas on the next nodes clockwise in the ring. Use only for single data center deployments.

NetworkTopologyStrategy — Places replicas with awareness of rack and data center topology. Required for multi-data center deployments.

-- Create keyspace with NetworkTopologyStrategy
CREATE KEYSPACE myapp 
WITH replication = {
  'class': 'NetworkTopologyStrategy',
  'dc1': 3,
  'dc2': 3
};
 
-- SimpleStrategy for development
CREATE KEYSPACE myapp_dev 
WITH replication = {
  'class': 'SimpleStrategy',
  'replication_factor': 3
};
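
The "next nodes clockwise" placement that SimpleStrategy uses can be sketched with a toy token ring. This is an illustration under stated assumptions: `toyToken` is a simple stand-in hash (Cassandra's default partitioner uses Murmur3), the ring has one token per node rather than virtual nodes, and `RingNode` and `replicasFor` are hypothetical names.

```typescript
// Toy token ring with one token per node. Illustrative only.
interface RingNode { name: string; token: number }

// Stand-in hash mapping a partition key to a token in [0, 1000)
function toyToken(partitionKey: string): number {
  let h = 0;
  for (let i = 0; i < partitionKey.length; i++) {
    h = (h * 31 + partitionKey.charCodeAt(i)) % 1000;
  }
  return h;
}

// SimpleStrategy-style placement: the first replica is the node that owns the
// token, and the remaining replicas are the next nodes clockwise on the ring.
function replicasFor(token: number, ring: RingNode[], rf: number): string[] {
  const sorted = ring.slice().sort((a, b) => a.token - b.token);
  // Owner = first node whose token is >= the key's token, wrapping around
  let start = sorted.findIndex(n => n.token >= token);
  if (start === -1) start = 0;
  const count = Math.min(rf, sorted.length);
  const result: string[] = [];
  for (let i = 0; i < count; i++) {
    result.push(sorted[(start + i) % sorted.length].name);
  }
  return result;
}
```

NetworkTopologyStrategy walks the ring in a similar way but also takes racks and data centers into account, which this sketch does not model.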

Data Modeling Methodology

Cassandra data modeling is query-driven—you design tables based on your read patterns, not your entity relationships. This is fundamentally different from relational database design.

Step 1: Identify Queries

Q1: Get user by user_id
Q2: Get recent posts by user_id
Q3: Get posts by tag
Q4: Get user activity by date range

Step 2: Design Tables for Each Query

-- Q1: Get user by user_id
CREATE TABLE users_by_id (
  user_id UUID PRIMARY KEY,
  email TEXT,
  name TEXT,
  bio TEXT
);
 
-- Q2: Get recent posts by user_id
CREATE TABLE posts_by_user (
  user_id UUID,
  post_id TIMEUUID,
  title TEXT,
  content TEXT,
  tags SET<TEXT>,
  created_at TIMESTAMP,
  PRIMARY KEY (user_id, post_id)
) WITH CLUSTERING ORDER BY (post_id DESC);
 
-- Q3: Get posts by tag (denormalized)
CREATE TABLE posts_by_tag (
  tag TEXT,
  post_id TIMEUUID,
  user_id UUID,
  title TEXT,
  created_at TIMESTAMP,
  PRIMARY KEY (tag, post_id)
) WITH CLUSTERING ORDER BY (post_id DESC);
 
-- Q4: Get user activity by date
CREATE TABLE activity_by_date (
  user_id UUID,
  activity_date DATE,
  activity_id TIMEUUID,
  activity_type TEXT,
  details TEXT,
  PRIMARY KEY ((user_id, activity_date), activity_id)
) WITH CLUSTERING ORDER BY (activity_id DESC);

Denormalization Pattern

In Cassandra, denormalization is not just acceptable—it's the recommended approach. Since reads should hit a single partition for optimal performance, you often store the same data in multiple tables optimized for different queries.

-- Materialized view (auto-maintained denormalization)
-- Note: materialized views are experimental and disabled by default since Cassandra 4.0
CREATE MATERIALIZED VIEW users_by_email AS
  SELECT * FROM users_by_id
  WHERE email IS NOT NULL AND user_id IS NOT NULL
  PRIMARY KEY (email, user_id);

Step-by-Step Implementation

Setting Up a Cassandra Cluster

# Docker Compose for local multi-node cluster
# docker-compose.yml
# CASSANDRA_DC and CASSANDRA_RACK only take effect when the snitch is
# GossipingPropertyFileSnitch, so it is set explicitly on every node.
version: '3.8'
services:
  cassandra-1:
    image: cassandra:4.1
    container_name: cassandra-1
    environment:
      - CASSANDRA_CLUSTER_NAME=MyCluster
      - CASSANDRA_SEEDS=cassandra-1,cassandra-2
      - CASSANDRA_ENDPOINT_SNITCH=GossipingPropertyFileSnitch
      - CASSANDRA_DC=dc1
      - CASSANDRA_RACK=rack1
    ports:
      - "9042:9042"
    volumes:
      - cassandra1_data:/var/lib/cassandra
 
  cassandra-2:
    image: cassandra:4.1
    container_name: cassandra-2
    environment:
      - CASSANDRA_CLUSTER_NAME=MyCluster
      - CASSANDRA_SEEDS=cassandra-1,cassandra-2
      - CASSANDRA_ENDPOINT_SNITCH=GossipingPropertyFileSnitch
      - CASSANDRA_DC=dc1
      - CASSANDRA_RACK=rack2
    depends_on:
      - cassandra-1
 
  cassandra-3:
    image: cassandra:4.1
    container_name: cassandra-3
    environment:
      - CASSANDRA_CLUSTER_NAME=MyCluster
      - CASSANDRA_SEEDS=cassandra-1,cassandra-2
      - CASSANDRA_ENDPOINT_SNITCH=GossipingPropertyFileSnitch
      - CASSANDRA_DC=dc1
      - CASSANDRA_RACK=rack1
    depends_on:
      - cassandra-2
 
volumes:
  cassandra1_data:

Node.js Application with Cassandra

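// Using DataStax Node.js driver
import { Client, policies, types } from 'cassandra-driver';
 
// Enums and UUID helpers live under `types` in cassandra-driver
const { distance, Uuid, TimeUuid } = types;
const consistency = types.consistencies;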
 
// Connection configuration
const client = new Client({
  contactPoints: ['cassandra-1:9042', 'cassandra-2:9042', 'cassandra-3:9042'],
  localDataCenter: 'dc1',
  keyspace: 'myapp',
  pooling: {
    coreConnectionsPerHost: {
      [distance.local]: 2,
      [distance.remote]: 1,
    },
  },
  policies: {
    retry: new policies.retry.RetryPolicy(),
    loadBalancing: new policies.loadBalancing.DCAwareRoundRobinPolicy('dc1'),
  },
});
 
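// Initialize schema
async function initializeSchema() {
  // `client` is bound to the myapp keyspace and cannot connect until that
  // keyspace exists, so create it through a short-lived client without one.
  const admin = new Client({
    contactPoints: ['cassandra-1:9042'],
    localDataCenter: 'dc1',
  });
  await admin.execute(`
    CREATE KEYSPACE IF NOT EXISTS myapp 
    WITH replication = {
      'class': 'NetworkTopologyStrategy',
      'dc1': 3
    }
  `);
  await admin.shutdown();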
 
  await client.execute(`
    CREATE TABLE IF NOT EXISTS myapp.users (
      user_id UUID PRIMARY KEY,
      email TEXT,
      name TEXT,
      created_at TIMESTAMP
    )
  `);
 
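  await client.execute(`
    CREATE TABLE IF NOT EXISTS myapp.posts_by_user (
      user_id UUID,
      post_id TIMEUUID,
      title TEXT,
      content TEXT,
      tags SET<TEXT>,
      created_at TIMESTAMP,
      PRIMARY KEY (user_id, post_id)
    ) WITH CLUSTERING ORDER BY (post_id DESC)
  `);
 
  // PostRepository.create also writes to posts_by_tag, so create it here too
  await client.execute(`
    CREATE TABLE IF NOT EXISTS myapp.posts_by_tag (
      tag TEXT,
      post_id TIMEUUID,
      user_id UUID,
      title TEXT,
      created_at TIMESTAMP,
      PRIMARY KEY (tag, post_id)
    ) WITH CLUSTERING ORDER BY (post_id DESC)
  `);
}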
 
// Repository pattern for data access
class UserRepository {
  async findById(userId: string): Promise<User | null> {
    const result = await client.execute(
      'SELECT * FROM users WHERE user_id = ?',
      [userId],
      { prepare: true, consistency: consistency.quorum }
    );
    return result.rows[0] || null;
  }
 
  async create(user: CreateUserDto): Promise<string> {
    const userId = Uuid.random();
    await client.execute(
      'INSERT INTO users (user_id, email, name, created_at) VALUES (?, ?, ?, ?)',
      [userId, user.email, user.name, new Date()],
      { prepare: true, consistency: consistency.localQuorum }
    );
    return userId.toString();
  }
}
 
class PostRepository {
  async findByUser(userId: string, limit: number = 20): Promise<Post[]> {
    const result = await client.execute(
      'SELECT * FROM posts_by_user WHERE user_id = ? LIMIT ?',
      [userId, limit],
      { prepare: true, fetchSize: limit }
    );
    return result.rows;
  }
 
  async create(userId: string, post: CreatePostDto): Promise<void> {
    const postId = TimeUuid.now();
    const queries = [
      {
        query: 'INSERT INTO posts_by_user (user_id, post_id, title, content, tags, created_at) VALUES (?, ?, ?, ?, ?, ?)',
        params: [userId, postId, post.title, post.content, post.tags, new Date()],
      },
    ];
 
    // Denormalization: also write to posts_by_tag table
    for (const tag of post.tags) {
      queries.push({
        query: 'INSERT INTO posts_by_tag (tag, post_id, user_id, title, created_at) VALUES (?, ?, ?, ?, ?)',
        params: [tag, postId, userId, post.title, new Date()],
      });
    }
 
    await client.batch(queries, { prepare: true, consistency: consistency.localQuorum });
  }
}

Connection Pooling and Retry Logic

// Production-ready Cassandra client with retry and circuit breaker
import { Client, policies, types } from 'cassandra-driver';
import CircuitBreaker from 'opossum';
 
const retryPolicy = new policies.retry.RetryPolicy();
 
const client = new Client({
  contactPoints: process.env.CASSANDRA_HOSTS?.split(',') || ['localhost'],
  localDataCenter: process.env.CASSANDRA_DC || 'dc1',
  keyspace: process.env.CASSANDRA_KEYSPACE || 'myapp',
  queryOptions: {
    consistency: types.consistencies.one, // ONE by default
    // SERIAL; a literal 2 here would mean TWO, not SERIAL
    serialConsistency: types.consistencies.serial,
  },
  policies: {
    retry: retryPolicy, // without this the policy created above is never used
  },
  pooling: {
    coreConnectionsPerHost: {
      [types.distance.local]: 3,
      [types.distance.remote]: 1,
    },
  },
  socketOptions: {
    connectTimeout: 5000,
    readTimeout: 12000,
  },
});
 
// Circuit breaker for Cassandra operations
const cassandraBreaker = new CircuitBreaker(
  async (query: string, params: any[], consistencyLevel?: number) => {
    // Fall back to the client's default consistency when none is given
    const options = consistencyLevel === undefined
      ? { prepare: true }
      : { prepare: true, consistency: consistencyLevel };
    return client.execute(query, params, options);
  },
  {
    timeout: 5000,
    errorThresholdPercentage: 50,
    resetTimeout: 30000,
  }
);
 
cassandraBreaker.on('open', () => console.warn('Cassandra circuit breaker OPEN'));
cassandraBreaker.on('halfOpen', () => console.info('Cassandra circuit breaker HALF_OPEN'));
cassandraBreaker.on('close', () => console.info('Cassandra circuit breaker CLOSED'));
 
export async function executeQuery(query: string, params: any[] = [], consistencyLevel?: number) {
  try {
    // Pass the requested consistency through instead of silently dropping it
    return await cassandraBreaker.fire(query, params, consistencyLevel);
  } catch (error) {
    console.error('Cassandra query failed:', error);
    throw error;
  }
}

Real-World Use Cases and Case Studies

Use Case 1: IoT Time-Series Data Storage

Cassandra excels at storing and querying time-series data from IoT devices:

-- Time-series table design for sensor data
CREATE TABLE sensor_readings (
  sensor_id TEXT,
  date TEXT,           -- Partition by day for manageable partition sizes
  reading_time TIMESTAMP,
  temperature DOUBLE,
  humidity DOUBLE,
  battery_level INT,
  PRIMARY KEY ((sensor_id, date), reading_time)
) WITH CLUSTERING ORDER BY (reading_time DESC)
  AND default_time_to_live = 7776000  -- 90 days TTL
  AND compaction = {
    'class': 'TimeWindowCompactionStrategy',
    'compaction_window_size': 1,
    'compaction_window_unit': 'DAYS'
  };
 
-- Query: Get today's readings for a sensor
SELECT * FROM sensor_readings 
WHERE sensor_id = 'sensor-001' AND date = '2024-01-15'
LIMIT 100;
 
-- Query: Get readings from last 6 hours
SELECT * FROM sensor_readings 
WHERE sensor_id = 'sensor-001' AND date = '2024-01-15'
AND reading_time > '2024-01-15T12:00:00';

// Batch insert for high-throughput ingestion
async function ingestSensorData(readings: SensorReading[]) {
  const queries = readings.map(reading => ({
    query: `INSERT INTO sensor_readings 
            (sensor_id, date, reading_time, temperature, humidity, battery_level) 
            VALUES (?, ?, ?, ?, ?, ?)`,
    params: [
      reading.sensorId,
      reading.date,
      reading.timestamp,
      reading.temperature,
      reading.humidity,
      reading.batteryLevel,
    ],
  }));
 
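  // Execute in small unlogged batches. There is no universal recommended batch
  // size; batches work best when every row targets the same (sensor_id, date) partition.
  const batchSize = 50;
  for (let i = 0; i < queries.length; i += batchSize) {
    const batch = queries.slice(i, i + batchSize);
    await client.batch(batch, { prepare: true, logged: false });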
  }
}
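
Because the table is partitioned by day, a time-range query first has to work out which `date` buckets it touches; a "last 6 hours" window that crosses midnight spans two partitions. The helpers below are a minimal sketch, assuming buckets are UTC calendar days formatted as `YYYY-MM-DD`; `dayBucket` and `bucketsForRange` are hypothetical names.

```typescript
// Day bucket (UTC) used as the `date` part of the partition key
function dayBucket(t: Date): string {
  return t.toISOString().slice(0, 10); // "YYYY-MM-DD"
}

// All day buckets touched by the range [from, to], oldest first.
// A range that crosses midnight needs one query per bucket.
function bucketsForRange(from: Date, to: Date): string[] {
  const buckets: string[] = [];
  // Start at midnight UTC of the first day
  const cursor = new Date(
    Date.UTC(from.getUTCFullYear(), from.getUTCMonth(), from.getUTCDate())
  );
  while (cursor.getTime() <= to.getTime()) {
    buckets.push(dayBucket(cursor));
    cursor.setUTCDate(cursor.getUTCDate() + 1);
  }
  return buckets;
}
```

A caller would run the range query once per returned bucket and merge the results on the client side.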

Use Case 2: User Activity Feed

-- Activity feed with efficient pagination
CREATE TABLE user_activity_feed (
  user_id UUID,
  activity_id TIMEUUID,
  actor_id UUID,
  actor_name TEXT,
  activity_type TEXT,  -- 'like', 'comment', 'follow', 'share'
  target_id UUID,
  target_type TEXT,    -- 'post', 'comment', 'user'
  metadata TEXT,       -- JSON blob for flexible data
  created_at TIMESTAMP,
  PRIMARY KEY (user_id, activity_id)
) WITH CLUSTERING ORDER BY (activity_id DESC);
 
-- Cursor-based pagination
SELECT * FROM user_activity_feed
WHERE user_id = ? AND activity_id < ?
LIMIT 20;
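
The cursor logic can be simulated without a cluster. The sketch below mimics `WHERE user_id = ? AND activity_id < ? LIMIT ?` over a single partition that is already sorted newest-first; plain numbers stand in for TIMEUUIDs, and `fetchPage`, `Activity`, and `Page` are hypothetical names.

```typescript
// Simulated cursor pagination over one partition sorted newest-first.
interface Activity { activityId: number; type: string }
interface Page { rows: Activity[]; nextCursor: number | null }

function fetchPage(partition: Activity[], cursor: number | null, limit: number): Page {
  // The first page has no cursor, so the activity_id condition is omitted
  const eligible = cursor === null
    ? partition
    : partition.filter(a => a.activityId < cursor);
  const rows = eligible.slice(0, limit);
  // The last row's id becomes the cursor for the next page; null means done
  const hasMore = rows.length > 0 && eligible.length > rows.length;
  const nextCursor = hasMore ? rows[rows.length - 1].activityId : null;
  return { rows, nextCursor };
}
```

The client keeps passing `nextCursor` back in until it is `null`, so no page ever rescans rows that were already returned.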

Use Case 3: E-Commerce Product Catalog

-- Product catalog with multiple access patterns
CREATE TABLE products_by_category (
  category_id UUID,
  product_id TIMEUUID,
  name TEXT,
  price DECIMAL,
  brand TEXT,
  rating FLOAT,
  in_stock BOOLEAN,
  attributes MAP<TEXT, TEXT>,
  PRIMARY KEY (category_id, product_id)
) WITH CLUSTERING ORDER BY (product_id DESC);
 
-- Price range queries (using a SASI index)
-- Note: SASI is experimental and disabled by default since Cassandra 4.0.
-- The analyzer option is omitted because analyzers apply to text columns.
CREATE CUSTOM INDEX ON products_by_category (price) 
USING 'org.apache.cassandra.index.sasi.SASIIndex'
WITH OPTIONS = {
  'mode': 'SPARSE'
};

Best Practices for Production

Keep partition size under 100MB: Large partitions cause performance issues. Use composite partition keys or bucketing (e.g., by date) to manage partition size.
Avoid ALLOW FILTERING: It can force a scan across many or all partitions and performs poorly. Design your tables to support your queries directly.
Use prepared statements: Always prepare your CQL statements for better performance and protection against CQL injection.
Set appropriate TTLs: Use time-to-live for data that should expire automatically. This reduces storage costs and improves compaction efficiency.
Monitor tombstones: Deleted data creates tombstones that affect read performance. Design deletion patterns carefully and use TTL where possible.
Use LOCAL_QUORUM for multi-DC: This ensures consistency within the local data center while allowing the remote DC to lag slightly.
Batch only for denormalization: Cassandra batches are not a performance tool. Use logged batches to keep denormalized tables in sync atomically, and keep them small.
Size your clusters appropriately: Plan for 70% capacity to allow for repairs, compaction, and unexpected load spikes.
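
The 100MB guideline can be checked with back-of-the-envelope arithmetic before a table is created. The helpers below are a rough sketch: they treat 100MB as 100 × 1024 × 1024 bytes and ignore storage overhead and compression, and `estimatePartitionBytes` and `bucketsNeeded` are hypothetical names.

```typescript
const MAX_PARTITION_BYTES = 100 * 1024 * 1024; // the 100MB guideline

// Rough partition size: rows per partition times average row size
function estimatePartitionBytes(rowsPerPartition: number, avgRowBytes: number): number {
  return rowsPerPartition * avgRowBytes;
}

// Smallest number of buckets that brings each partition under the limit,
// assuming rows are spread evenly across buckets
function bucketsNeeded(totalBytes: number, limit: number = MAX_PARTITION_BYTES): number {
  return Math.max(1, Math.ceil(totalBytes / limit));
}
```

For example, a sensor that writes one 200-byte row per second produces about 17MB per day, which fits in one daily partition, whereas a full year in a single partition would be roughly 6.3GB and would need at least 61 buckets.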

Common Pitfalls and Solutions

Pitfall	Impact	Solution
Hot partitions	Uneven load, slow reads	Use composite keys or bucketing to distribute data evenly
Large partitions	OOM errors, slow compaction	Keep partitions under 100MB by partitioning data by time
Unbounded queries	Full partition scans	Always set LIMIT on queries
Ignoring tombstones	Read performance degradation	Use TTL, avoid frequent deletes, schedule regular repairs
Wrong consistency level	Data inconsistency or availability issues	Match consistency to your requirements (QUORUM for most cases)
Single data center	No disaster recovery	Deploy across multiple DCs with NetworkTopologyStrategy

Performance Optimization

-- Compaction strategies by use case
 
-- Read-heavy: LeveledCompactionStrategy (not the default; SizeTieredCompactionStrategy is)
CREATE TABLE read_heavy_table (...)
WITH compaction = {'class': 'LeveledCompactionStrategy'};
 
-- Time-series: TimeWindowCompactionStrategy
CREATE TABLE timeseries_table (...)
WITH compaction = {
  'class': 'TimeWindowCompactionStrategy',
  'compaction_window_size': 1,
  'compaction_window_unit': 'DAYS'
};
 
-- Write-heavy with TTL: SizeTieredCompactionStrategy
CREATE TABLE write_heavy_table (...)
WITH compaction = {'class': 'SizeTieredCompactionStrategy'}
AND default_time_to_live = 86400;

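// Connection tuning for high throughput
import { Client, types } from 'cassandra-driver';
 
const { distance } = types;
 
const client = new Client({
  contactPoints: ['node1', 'node2', 'node3'],
  localDataCenter: 'dc1', // required by driver 4.x; assumed to match the cluster's DC name
  pooling: {
    coreConnectionsPerHost: {
      [distance.local]: 6,  // Increase for high throughput
      [distance.remote]: 2,
    },
    // The Node.js driver has no maxConnectionsPerHost option; cap in-flight
    // requests per connection instead
    maxRequestsPerConnection: 2048,
  },
  socketOptions: {
    keepAlive: true,
    keepAliveDelay: 30000,
    tcpNoDelay: true,
    readTimeout: 12000, // per-host read timeout in milliseconds
  },
  queryOptions: {
    consistency: types.consistencies.localQuorum,
  },
});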

Comparison with Alternatives

Feature	Cassandra	MongoDB	DynamoDB	ScyllaDB
Architecture	Masterless ring	Replica set	Managed service	Masterless ring
Multi-DC	Native	Manual setup	Global tables	Native
Consistency	Tunable (per-query)	Eventual/Strong	Eventual/Strong	Tunable
Query Language	CQL	MQL	API	CQL
Throughput	Very high	High	High (provisioned)	Very high
Latency	Low (p99)	Low (p99)	Low (p99)	Very low (p99)
Cost	Self-managed	Self-managed/AWS	AWS pricing	Self-managed
Best For	Time-series, IoT, high-write	Documents, flexible schema	Serverless, AWS-native	High-performance Cassandra

Advanced Patterns and Techniques

// Lightweight transactions (LWT) for conditional updates
const result = await client.execute(
  `INSERT INTO users (user_id, email, name) 
   VALUES (?, ?, ?) 
   IF NOT EXISTS`,
  [userId, email, name],
  { prepare: true }
);
 
// IF NOT EXISTS checks the primary key (user_id), not the email column
if (result.rows[0]['[applied]']) {
  console.log('User created successfully');
} else {
  console.log('A user with this user_id already exists');
}
 
// Counter tables for analytics
await client.execute(`
  CREATE TABLE page_views (
    page_id UUID,
    view_date DATE,
    view_count COUNTER,
    PRIMARY KEY (page_id, view_date)
  )
`);
 
await client.execute(
  `UPDATE page_views SET view_count = view_count + 1 
   WHERE page_id = ? AND view_date = ?`,
  [pageId, currentDate],
  { prepare: true }
);

Testing Strategies

import { Client } from 'cassandra-driver';
 
// Test with real Cassandra using Testcontainers
import { GenericContainer, StartedTestContainer, Wait } from 'testcontainers';
 
let cassandra: StartedTestContainer;
let client: Client;
 
beforeAll(async () => {
  cassandra = await new GenericContainer('cassandra:4.1')
    .withExposedPorts(9042)
    // CASSANDRA_DC only takes effect with GossipingPropertyFileSnitch
    .withEnvironment({
      CASSANDRA_DC: 'dc1',
      CASSANDRA_ENDPOINT_SNITCH: 'GossipingPropertyFileSnitch',
    })
    .withWaitStrategy(Wait.forLogMessage('Created default superuser'))
    .start();
  
  client = new Client({
    contactPoints: [`${cassandra.getHost()}:${cassandra.getMappedPort(9042)}`],
    localDataCenter: 'dc1',
  });
  
  await client.connect();
  await initializeTestSchema();
}, 120000);
 
afterAll(async () => {
  await client.shutdown();
  await cassandra.stop();
});
 
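// These tests assume a UserRepository variant that receives the client through
// its constructor; the repository shown earlier uses a module-level client.
describe('UserRepository', () => {
  it('should create and retrieve a user', async () => {
    const repo = new UserRepository(client);
    const userId = await repo.create({ email: 'test@test.com', name: 'Test User' });
    const user = await repo.findById(userId);
    
    expect(user).toBeDefined();
    expect(user.email).toBe('test@test.com');
  });
 
  // Assumes create() runs a lightweight transaction against an email-keyed table
  // and throws when the email is taken; the create() shown earlier does not.
  it('should enforce unique email with LWT', async () => {
    const repo = new UserRepository(client);
    await repo.create({ email: 'unique@test.com', name: 'User 1' });
    
    await expect(repo.create({ email: 'unique@test.com', name: 'User 2' }))
      .rejects.toThrow('already exists');
  });
});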

Future Outlook

Cassandra continues to evolve, with Cassandra 5.0 bringing features like Storage Attached Indexes (SAI), dynamic data masking, and improved streaming. ScyllaDB, a C++ reimplementation, aims for drop-in CQL compatibility and positions itself as a higher-performance alternative.

The Cassandra ecosystem is expanding with tools like K8ssandra (Kubernetes operator), Apache Cassandra Reaper for repairs, and Stargate as an API gateway. Cloud-managed options like Astra DB (DataStax) and Amazon Keyspaces simplify operations.

Conclusion

Apache Cassandra is an excellent choice for applications requiring high availability, massive scalability, and fault tolerance. The key takeaways are:

Query-driven data modeling — Design tables based on your read patterns, not entity relationships
Denormalization is normal — Store data in multiple tables optimized for different queries
Tunable consistency — Choose the right consistency level for each operation
Masterless architecture — Every node is equal, eliminating single points of failure
Multi-data center support — Native replication across geographically distributed clusters

Start with a small cluster in Docker, practice CQL data modeling, and experiment with different consistency levels before deploying to production. Understanding the data modeling methodology is the most critical skill for Cassandra success.

Minh Vo
