Event-Driven Architecture: Patterns and Implementation

Design event-driven systems: event sourcing, CQRS, message queues, and saga patterns.

Tags: Event-Driven, Architecture, CQRS, Microservices

By MinhVo

Introduction

Event-Driven Architecture (EDA) changes how the components of a distributed system communicate. Instead of services making direct synchronous calls to each other, components produce and consume events: immutable records of facts that have occurred. This shift brings resilience, scalability, and loose coupling that are hard to achieve with tightly coupled request-response designs. This guide covers the core patterns of event-driven architecture, including event sourcing, CQRS, message queues, and the saga pattern, along with TypeScript implementations that illustrate each one.

The rise of microservices and cloud-native applications has made EDA more relevant than ever. As systems grow in complexity, the tight coupling between services becomes a bottleneck for development velocity and system reliability. Event-driven architectures decouple producers from consumers, allowing teams to build, deploy, and scale services independently. Whether you are building a real-time analytics pipeline, an e-commerce platform, or a financial trading system, understanding EDA patterns is essential for modern software engineering.

[Figure: Event-Driven Architecture Overview]

Understanding Event-Driven Architecture: Core Concepts

At its core, event-driven architecture revolves around three fundamental concepts: events, event producers, and event consumers. An event is an immutable fact that represents something that happened in the system—OrderPlaced, PaymentProcessed, or UserRegistered. Unlike commands, which request an action to be performed, events are notifications that an action has already occurred. This distinction is critical because it determines how tightly coupled your system components are.

The event flow in an EDA system follows a predictable pattern. A producer emits an event to a message broker or event store. The broker persists the event and makes it available to consumers. Consumers subscribe to event types they care about and process them asynchronously. The producer never needs to know who consumes its events, and consumers never need to know who produced them. This decoupling is the primary architectural benefit of EDA.

There are two fundamental messaging patterns: publish-subscribe and point-to-point. In publish-subscribe, multiple consumers can subscribe to the same event type, and each receives a copy. This is ideal for broadcasting state changes to multiple interested parties. In point-to-point, each event is delivered to exactly one consumer from a group, which is useful for distributing work across a pool of workers. Brokers such as Apache Kafka (through consumer groups) and RabbitMQ (through exchanges and queues) support both patterns, while cloud services such as Amazon EventBridge focus on publish-subscribe routing.
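To make the difference concrete, here is a minimal in-memory sketch of both delivery patterns. `MiniBroker` and its method names are hypothetical and exist only for this illustration; a real broker adds persistence, acknowledgements, and buffering.

```typescript
// Minimal in-memory broker illustrating both delivery patterns.
// MiniBroker and Handler are hypothetical names used only in this sketch.
type Handler = (payload: string) => void;

class MiniBroker {
  private topicSubscribers = new Map<string, Handler[]>();
  private queueWorkers = new Map<string, Handler[]>();
  private nextWorker = new Map<string, number>();

  // Publish-subscribe: every subscriber of the topic receives its own copy.
  subscribe(topic: string, handler: Handler): void {
    const handlers = this.topicSubscribers.get(topic) ?? [];
    handlers.push(handler);
    this.topicSubscribers.set(topic, handlers);
  }

  publish(topic: string, payload: string): void {
    for (const handler of this.topicSubscribers.get(topic) ?? []) {
      handler(payload);
    }
  }

  // Point-to-point: workers compete, and each message goes to exactly one of them.
  addWorker(queue: string, handler: Handler): void {
    const workers = this.queueWorkers.get(queue) ?? [];
    workers.push(handler);
    this.queueWorkers.set(queue, workers);
  }

  send(queue: string, payload: string): void {
    const workers = this.queueWorkers.get(queue) ?? [];
    if (workers.length === 0) return; // a real broker would buffer the message instead
    const index = (this.nextWorker.get(queue) ?? 0) % workers.length;
    this.nextWorker.set(queue, index + 1);
    workers[index](payload); // round-robin dispatch
  }
}

const broker = new MiniBroker();
const received: string[] = [];

broker.subscribe("OrderPlaced", (p) => received.push(`email:${p}`));
broker.subscribe("OrderPlaced", (p) => received.push(`analytics:${p}`));
broker.publish("OrderPlaced", "order-1");

broker.addWorker("resize-image", (p) => received.push(`worker-a:${p}`));
broker.addWorker("resize-image", (p) => received.push(`worker-b:${p}`));
broker.send("resize-image", "img-1");
broker.send("resize-image", "img-2");
```

After this runs, both subscribers have seen `order-1`, while `img-1` and `img-2` were each handled by a single worker.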

Event sourcing takes the event concept further by making events the primary source of truth. Instead of storing the current state of an entity, you store the complete sequence of events that led to the current state. This provides a full audit trail, enables temporal queries ("what was the state at time T?"), and allows you to rebuild state by replaying events. Event sourcing is often paired with CQRS, which separates the write model (where events are produced) from the read model (where events are projected into queryable views).

[Figure: Message Broker Architecture]

Architecture and Design Patterns

Event Sourcing Pattern

Event sourcing stores every state change as an event in an append-only log. The current state of an entity is derived by replaying all events for that entity. This pattern provides a complete history of changes and enables powerful capabilities like event replay, temporal queries, and debugging by examining the event stream.
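As a small illustration of replay and temporal queries, the sketch below derives a balance by folding over an event log. The account domain, the `AccountEvent` shape, and the integer `at` timestamps are assumptions made for the example.

```typescript
// Hypothetical account events; names and fields are assumptions for illustration.
type AccountEvent =
  | { type: "Deposited"; amount: number; at: number }
  | { type: "Withdrawn"; amount: number; at: number };

// Pure fold step: next state from the current state and one event.
function applyAccountEvent(balance: number, event: AccountEvent): number {
  switch (event.type) {
    case "Deposited":
      return balance + event.amount;
    case "Withdrawn":
      return balance - event.amount;
  }
}

// Current state = fold over the full log.
function currentBalance(log: readonly AccountEvent[]): number {
  return log.reduce(applyAccountEvent, 0);
}

// Temporal query: fold only the events recorded at or before `time`.
function balanceAt(log: readonly AccountEvent[], time: number): number {
  return log.filter((e) => e.at <= time).reduce(applyAccountEvent, 0);
}

const accountLog: AccountEvent[] = [
  { type: "Deposited", amount: 100, at: 1 },
  { type: "Withdrawn", amount: 30, at: 2 },
  { type: "Deposited", amount: 50, at: 3 },
];

console.log(currentBalance(accountLog)); // 120
console.log(balanceAt(accountLog, 2));   // 70
```

Nothing is ever updated in place: the log only grows, and any past state is a replay away.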

CQRS Pattern

Command Query Responsibility Segregation separates the write side (commands that produce events) from the read side (queries that consume projected data). The write model optimizes for business logic and consistency, while the read model optimizes for query performance with denormalized views. This separation allows each side to be scaled independently.

Saga Pattern

Sagas coordinate distributed transactions across multiple services without using two-phase commit. A saga is a sequence of local transactions, each of which publishes an event that triggers the next step. If any step fails, compensating transactions undo the work of previous steps. There are two coordination approaches: choreography (events trigger the next service directly) and orchestration (a central coordinator manages the workflow).
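The choreography variant can be sketched with a tiny synchronous event bus, where each service only reacts to events and publishes its own. `SagaBus`, the event names, and the in-memory "services" are hypothetical; real services would run in separate processes and communicate through a broker.

```typescript
// Choreography sketch: no central coordinator; each service reacts to events.
// SagaBus, the event names, and the in-memory services are hypothetical.
type SagaEvent = { type: string; orderId: string };
type Listener = (event: SagaEvent) => void;

class SagaBus {
  private listeners = new Map<string, Listener[]>();
  readonly log: string[] = [];

  on(type: string, listener: Listener): void {
    const list = this.listeners.get(type) ?? [];
    list.push(listener);
    this.listeners.set(type, list);
  }

  emit(event: SagaEvent): void {
    this.log.push(event.type);
    for (const listener of this.listeners.get(event.type) ?? []) {
      listener(event);
    }
  }
}

function wireOrderSaga(bus: SagaBus, paymentSucceeds: boolean): Set<string> {
  const reservedStock = new Set<string>();

  // Inventory service: local transaction, then publish the outcome.
  bus.on("OrderCreated", (e) => {
    reservedStock.add(e.orderId);
    bus.emit({ type: "StockReserved", orderId: e.orderId });
  });

  // Payment service: reacts to the inventory event.
  bus.on("StockReserved", (e) => {
    const type = paymentSucceeds ? "PaymentCompleted" : "PaymentFailed";
    bus.emit({ type, orderId: e.orderId });
  });

  // Compensating transaction: inventory releases the stock when payment fails.
  bus.on("PaymentFailed", (e) => {
    reservedStock.delete(e.orderId);
    bus.emit({ type: "StockReleased", orderId: e.orderId });
  });

  return reservedStock;
}

const failingBus = new SagaBus();
const stock = wireOrderSaga(failingBus, false);
failingBus.emit({ type: "OrderCreated", orderId: "order-1" });

console.log(failingBus.log);
// ["OrderCreated", "StockReserved", "PaymentFailed", "StockReleased"]
```

The trade-off is visible even at this size: no service knows the whole workflow, which keeps coupling low but makes the overall flow harder to see than in an orchestrated saga.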

Transactional Outbox Pattern

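When a service needs to update its database and publish an event atomically, the transactional outbox pattern ensures that the state change and the event are recorded together or not at all. The service writes the event to an outbox table in the same database transaction as the business data change. A separate process polls the outbox, publishes pending events to the message broker, and then marks them as published. Because that process can crash between publishing and marking, delivery is at-least-once, so consumers still need to be idempotent.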
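Here is a rough sketch of the idea using an in-memory stand-in for the database. `FakeDatabase`, `OutboxRow`, `placeOrder`, and `relayOutbox` are hypothetical names, and the copy-and-swap "transaction" only imitates the all-or-nothing behaviour a real SQL transaction would provide.

```typescript
// In-memory stand-in for a database; a real system would use one SQL transaction.
// OutboxRow, FakeDatabase, placeOrder, and relayOutbox are hypothetical names.
interface OutboxRow {
  id: number;
  type: string;
  payload: string;
  published: boolean;
}

type OrderTable = Map<string, { status: string }>;

class FakeDatabase {
  orders: OrderTable = new Map();
  outbox: OutboxRow[] = [];

  // Runs `work` against copies and swaps them in only if it does not throw,
  // which imitates the all-or-nothing behaviour of a transaction.
  transaction(work: (orders: OrderTable, outbox: OutboxRow[]) => void): void {
    const ordersCopy: OrderTable = new Map(this.orders);
    const outboxCopy = this.outbox.map((row) => ({ ...row }));
    work(ordersCopy, outboxCopy);
    this.orders = ordersCopy;
    this.outbox = outboxCopy;
  }
}

function placeOrder(db: FakeDatabase, orderId: string): void {
  db.transaction((orders, outbox) => {
    orders.set(orderId, { status: "created" });
    outbox.push({
      id: outbox.length + 1,
      type: "OrderCreated",
      payload: JSON.stringify({ orderId }),
      published: false,
    });
  });
}

// Relay: publish pending rows in insertion order, then mark them as published.
// If publish throws, the row stays pending and is retried on the next poll.
function relayOutbox(db: FakeDatabase, publish: (row: OutboxRow) => void): number {
  let relayed = 0;
  for (const row of db.outbox) {
    if (row.published) continue;
    publish(row);
    row.published = true;
    relayed++;
  }
  return relayed;
}

const db = new FakeDatabase();
placeOrder(db, "order-1");

const sent: string[] = [];
const firstPoll = relayOutbox(db, (row) => sent.push(row.type));  // relays 1 row
const secondPoll = relayOutbox(db, (row) => sent.push(row.type)); // relays 0 rows
```

The business code never talks to the broker directly; it only writes rows, and the relay is the single place where publishing happens.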

Event Streaming Pattern

Rather than processing events one at a time, event streaming treats the event log as a continuous stream that can be filtered, transformed, and aggregated in real-time. Platforms like Apache Kafka Streams and Apache Flink enable complex event processing, windowed aggregations, and stream joins.
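To give a feel for windowed aggregation without a streaming platform, the sketch below counts page views per tumbling window using a plain generator. The `PageView` shape and the 60-second window are assumptions, and the code expects events in timestamp order, whereas engines like Flink also deal with late and out-of-order data.

```typescript
// Tumbling-window count over a stream, written as a generator so events are
// processed one at a time. PageView and the 60-second window are assumptions.
interface PageView {
  page: string;
  timestampMs: number;
}

interface WindowCount {
  windowStartMs: number;
  counts: Record<string, number>;
}

// Assumes events arrive in timestamp order; real engines also handle late data.
function* tumblingCounts(stream: Iterable<PageView>, windowMs: number): Generator<WindowCount> {
  let current: WindowCount | null = null;
  for (const view of stream) {
    const windowStartMs = Math.floor(view.timestampMs / windowMs) * windowMs;
    if (current && current.windowStartMs !== windowStartMs) {
      yield current; // the previous window is complete
      current = null;
    }
    if (!current) current = { windowStartMs, counts: {} };
    current.counts[view.page] = (current.counts[view.page] ?? 0) + 1;
  }
  if (current) yield current; // flush the last open window
}

const views: PageView[] = [
  { page: "/home", timestampMs: 1000 },
  { page: "/cart", timestampMs: 20000 },
  { page: "/home", timestampMs: 59000 },
  { page: "/home", timestampMs: 61000 },
];

const windows = [...tumblingCounts(views, 60000)];
// Two windows: [0, 60000) with /home=2 and /cart=1, then [60000, 120000) with /home=1
```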

Step-by-Step Implementation

Let us build a complete event-driven order processing system using TypeScript. We will implement event sourcing with an in-memory event store, CQRS with separate write and read models, and a saga for coordinating distributed transactions.

First, let us define the core event store that persists events:

interface DomainEvent {
  eventId: string;
  aggregateId: string;
  type: string;
  data: Record<string, unknown>;
  metadata: {
    correlationId: string;
    causationId?: string;
    timestamp: Date;
    version: number;
  };
}
 
class EventStore {
  private events: DomainEvent[] = [];
  private subscriptions: Map<string, ((event: DomainEvent) => Promise<void>)[]> = new Map();
 
  async append(event: DomainEvent): Promise<void> {
    // Optimistic concurrency: check version
    const existing = this.events.filter(e => e.aggregateId === event.aggregateId);
    const lastVersion = existing.length > 0
      ? existing[existing.length - 1].metadata.version
      : 0;
    if (event.metadata.version !== lastVersion + 1) {
      throw new Error(
        `Concurrency conflict: expected version ${lastVersion + 1}, got ${event.metadata.version}`
      );
    }
    this.events.push(event);
    await this.publish(event);
  }
 
  async getEvents(aggregateId: string): Promise<DomainEvent[]> {
    return this.events.filter(e => e.aggregateId === aggregateId);
  }
 
  subscribe(eventType: string, handler: (event: DomainEvent) => Promise<void>): void {
    const handlers = this.subscriptions.get(eventType) || [];
    handlers.push(handler);
    this.subscriptions.set(eventType, handlers);
  }
 
  private async publish(event: DomainEvent): Promise<void> {
    const handlers = this.subscriptions.get(event.type) || [];
    await Promise.allSettled(handlers.map(h => h(event)));
  }
}

Next, implement an aggregate root with event sourcing:

abstract class AggregateRoot {
  protected id: string = '';
  private version: number = 0;
  private uncommittedEvents: DomainEvent[] = [];
 
  protected applyEvent(event: Omit<DomainEvent, 'metadata'>): void {
    const domainEvent: DomainEvent = {
      ...event,
      metadata: {
        correlationId: crypto.randomUUID(),
        timestamp: new Date(),
        version: this.version + 1,
      },
    };
    this.apply(domainEvent);
    this.uncommittedEvents.push(domainEvent);
  }
 
  loadFromHistory(events: DomainEvent[]): void {
    events.forEach(event => this.apply(event));
  }
 
  private apply(event: DomainEvent): void {
    this.when(event);
    this.version = event.metadata.version;
  }
 
  protected abstract when(event: DomainEvent): void;
 
  getUncommittedEvents(): DomainEvent[] {
    return [...this.uncommittedEvents];
  }
 
  clearUncommittedEvents(): void {
    this.uncommittedEvents = [];
  }
 
  getVersion(): number {
    return this.version;
  }
}

Now let us create the Order aggregate that uses these foundations:

type OrderStatus = 'created' | 'confirmed' | 'shipped' | 'cancelled';
 
class Order extends AggregateRoot {
  private status: OrderStatus = 'created';
  private items: Array<{ productId: string; quantity: number; price: number }> = [];
  private totalAmount: number = 0;
 
  static create(orderId: string, items: Array<{ productId: string; quantity: number; price: number }>): Order {
    const order = new Order();
    order.id = orderId;
    const totalAmount = items.reduce((sum, item) => sum + item.quantity * item.price, 0);
    order.applyEvent({
      eventId: crypto.randomUUID(),
      aggregateId: orderId,
      type: 'OrderCreated',
      data: { items, totalAmount },
    });
    return order;
  }
 
  confirm(): void {
    if (this.status !== 'created') {
      throw new Error(`Cannot confirm order in ${this.status} status`);
    }
    this.applyEvent({
      eventId: crypto.randomUUID(),
      aggregateId: this.id,
      type: 'OrderConfirmed',
      data: { confirmedAt: new Date().toISOString() },
    });
  }
 
  cancel(reason: string): void {
    if (this.status === 'shipped') {
      throw new Error('Cannot cancel shipped order');
    }
    this.applyEvent({
      eventId: crypto.randomUUID(),
      aggregateId: this.id,
      type: 'OrderCancelled',
      data: { reason, cancelledAt: new Date().toISOString() },
    });
  }
 
  protected when(event: DomainEvent): void {
    switch (event.type) {
      case 'OrderCreated':
        this.items = event.data.items as typeof this.items;
        this.totalAmount = event.data.totalAmount as number;
        this.status = 'created';
        break;
      case 'OrderConfirmed':
        this.status = 'confirmed';
        break;
      case 'OrderCancelled':
        this.status = 'cancelled';
        break;
    }
  }
}

Now implement the CQRS read model with projections:

interface OrderReadModel {
  orderId: string;
  status: OrderStatus;
  items: Array<{ productId: string; quantity: number; price: number }>;
  totalAmount: number;
  createdAt: string;
  confirmedAt?: string;
  cancelledAt?: string;
}
 
class OrderProjection {
  private readModels: Map<string, OrderReadModel> = new Map();
 
  async handle(event: DomainEvent): Promise<void> {
    switch (event.type) {
      case 'OrderCreated': {
        this.readModels.set(event.aggregateId, {
          orderId: event.aggregateId,
          status: 'created',
          items: event.data.items as OrderReadModel['items'],
          totalAmount: event.data.totalAmount as number,
          createdAt: event.metadata.timestamp.toISOString(),
        });
        break;
      }
      case 'OrderConfirmed': {
        const model = this.readModels.get(event.aggregateId);
        if (model) {
          model.status = 'confirmed';
          model.confirmedAt = event.data.confirmedAt as string;
        }
        break;
      }
      case 'OrderCancelled': {
        const model = this.readModels.get(event.aggregateId);
        if (model) {
          model.status = 'cancelled';
          model.cancelledAt = event.data.cancelledAt as string;
        }
        break;
      }
    }
  }
 
  getOrder(orderId: string): OrderReadModel | undefined {
    return this.readModels.get(orderId);
  }
 
  getAllOrders(): OrderReadModel[] {
    return Array.from(this.readModels.values());
  }
}

Finally, let us implement a saga orchestrator for the order fulfillment process:

interface SagaStep<T> {
  name: string;
  execute: (context: T) => Promise<void>;
  compensate: (context: T) => Promise<void>;
}
 
class SagaOrchestrator<T> {
  private steps: SagaStep<T>[] = [];
  private completedSteps: string[] = [];
 
  addStep(step: SagaStep<T>): this {
    this.steps.push(step);
    return this;
  }
 
  async execute(context: T): Promise<{ success: boolean; error?: Error }> {
    this.completedSteps = [];
    try {
      for (const step of this.steps) {
        await step.execute(context);
        this.completedSteps.push(step.name);
      }
      return { success: true };
    } catch (error) {
      await this.compensate(context);
      return { success: false, error: error as Error };
    }
  }
 
  private async compensate(context: T): Promise<void> {
    for (const stepName of [...this.completedSteps].reverse()) {
      const step = this.steps.find(s => s.name === stepName);
      if (step) {
        try {
          await step.compensate(context);
        } catch (compensationError) {
          console.error(`Compensation failed for step ${stepName}:`, compensationError);
        }
      }
    }
  }
}

Real-World Use Cases and Case Studies

Use Case 1: E-Commerce Order Processing

In an e-commerce platform, placing an order triggers a cascade of events: inventory reservation, payment processing, fraud detection, shipping label generation, and customer notification. With EDA, the order service emits an OrderCreated event, and each downstream service reacts independently. If the payment service is temporarily unavailable, the event waits in the queue and is processed when the service recovers. This prevents cascading failures and ensures no orders are lost.

Use Case 2: Real-Time Analytics Pipeline

A social media platform processes billions of user interactions daily—likes, comments, shares, and page views. Each interaction is an event that flows through Apache Kafka to multiple consumers: a real-time trending algorithm, a personalized feed generator, and a data warehouse for analytics. The event-driven approach allows each consumer to process events at its own pace without affecting others.

Use Case 3: Financial Trading System

Stock exchanges use event sourcing to maintain a complete, auditable record of every trade, order modification, and cancellation. The event log is the authoritative source of truth, and the current state of any order book can be reconstructed by replaying events. Regulators can audit the system by examining the event stream, and the system can recover from failures by replaying events from a checkpoint.

Use Case 4: IoT Device Management

A fleet of IoT sensors generates millions of telemetry readings per second. Each reading is an event that triggers threshold alerts, feeds machine learning models for predictive maintenance, and updates real-time dashboards. The event-driven architecture handles the massive throughput by distributing events across partitions, allowing horizontal scaling of consumers.

Best Practices for Production

  1. Design idempotent consumers: Network failures can cause duplicate event delivery. Every consumer must handle the same event multiple times without producing incorrect results. Track processed event IDs in a deduplication store and skip events that have already been processed.

  2. Use schema evolution with backward compatibility: Define event schemas using Avro, Protobuf, or JSON Schema, and enforce backward compatibility rules. Adding new fields with defaults is safe; removing or renaming fields breaks consumers. Use a schema registry to manage versions.

  3. Implement the transactional outbox pattern: Never publish events directly from business logic. Instead, write events to an outbox table in the same database transaction as the state change, then use a separate publisher process to relay events to the message broker.

  4. Set up dead letter queues: When a consumer cannot process an event after multiple retries, route the event to a dead letter queue instead of blocking the entire stream. Monitor dead letter queues and alert on new entries for investigation.

  5. Use correlation IDs for distributed tracing: Attach a correlation ID to the initial event and propagate it through all downstream events. This enables end-to-end tracing of complex workflows across multiple services, making debugging significantly easier.

  6. Implement snapshots for event-sourced aggregates: When an aggregate accumulates thousands of events, loading it requires replaying the entire stream. Create snapshots at regular intervals (e.g., every 100 events) to reduce load time. Store snapshots alongside events and replay only events after the last snapshot.

  7. Monitor consumer lag: Track the difference between the latest event offset and each consumer's current offset. Increasing lag indicates a consumer falling behind, which may require scaling up consumer instances or optimizing processing logic.

  8. Plan for event ordering: Events within a single aggregate must be processed in order, but events across aggregates can be processed in parallel. Use partition keys based on aggregate ID to ensure ordering where needed while maximizing parallelism.
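Practices 1 and 4 often end up in the same piece of code, so here is a combined sketch: a consumer that skips duplicates, retries a bounded number of times, and parks poison events in a dead letter queue. `ReliableConsumer`, `IncomingEvent`, and the default of three attempts are hypothetical, and the in-memory `Set` stands in for a durable deduplication store.

```typescript
// Idempotent consumer with bounded retries and a dead letter queue.
// IncomingEvent, ReliableConsumer, and maxAttempts are hypothetical names.
interface IncomingEvent {
  eventId: string;
  type: string;
  payload: unknown;
}

type ConsumeResult = "processed" | "duplicate" | "dead-lettered";

class ReliableConsumer {
  private processedIds = new Set<string>(); // stands in for a durable dedup store
  readonly deadLetters: Array<{ event: IncomingEvent; reason: string }> = [];
  private handler: (event: IncomingEvent) => void;
  private maxAttempts: number;

  constructor(handler: (event: IncomingEvent) => void, maxAttempts = 3) {
    this.handler = handler;
    this.maxAttempts = maxAttempts;
  }

  consume(event: IncomingEvent): ConsumeResult {
    if (this.processedIds.has(event.eventId)) return "duplicate";

    let lastError = "unknown error";
    for (let attempt = 1; attempt <= this.maxAttempts; attempt++) {
      try {
        this.handler(event);
        this.processedIds.add(event.eventId); // record only after success
        return "processed";
      } catch (error) {
        lastError = error instanceof Error ? error.message : String(error);
      }
    }
    // Retries exhausted: park the event so the rest of the stream keeps flowing.
    this.deadLetters.push({ event, reason: lastError });
    return "dead-lettered";
  }
}

let charged = 0;
const consumer = new ReliableConsumer((event) => {
  if (event.type === "Corrupt") throw new Error("cannot parse payload");
  charged += 1;
});

const first = consumer.consume({ eventId: "e1", type: "PaymentRequested", payload: {} });
const redelivered = consumer.consume({ eventId: "e1", type: "PaymentRequested", payload: {} });
const poison = consumer.consume({ eventId: "e2", type: "Corrupt", payload: null });
// first = "processed", redelivered = "duplicate", poison = "dead-lettered", charged = 1
```

In a real service the handler's side effect and the deduplication record should be committed in the same transaction; otherwise a crash between the two can still cause a repeat.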

Common Pitfalls and Solutions

| Pitfall | Impact | Solution |
| --- | --- | --- |
| Treating events as commands | Tight coupling between producer and consumer logic | Events should be facts about the past, not instructions for the future |
| Missing idempotency in consumers | Duplicate processing causes incorrect state | Track processed event IDs; implement deduplication |
| Event schema changes without versioning | Breaking changes cause consumer failures | Use a schema registry with backward compatibility enforcement |
| No dead letter queue handling | Poison events block the entire consumer group | Implement DLQ with monitoring and alerting |
| Unbounded event streams | Aggregate loading becomes slower over time | Implement snapshotting at regular intervals |
| Tight coupling through event data | Events contain too much or too little data for consumers | Design events with consumer needs in mind; use event-carried state transfer |
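The snapshotting remedy for unbounded event streams can be sketched in a few lines. The counter-like aggregate, the `Snapshot` shape, and the `SNAPSHOT_EVERY` constant are hypothetical; a real event store would also fetch only the events after the snapshot version instead of filtering them in memory.

```typescript
// Snapshot sketch for a counter-like aggregate. CounterEvent, Snapshot, and
// SNAPSHOT_EVERY are hypothetical names chosen for this illustration.
interface CounterEvent {
  version: number; // 1-based position in the aggregate's stream
  delta: number;
}

interface Snapshot {
  version: number; // version of the last event folded into `total`
  total: number;
}

const SNAPSHOT_EVERY = 100;

// Load state from the latest snapshot (if any) plus the events after it.
function loadTotal(
  events: readonly CounterEvent[],
  snapshot?: Snapshot,
): { total: number; replayed: number } {
  let total = snapshot?.total ?? 0;
  let replayed = 0;
  for (const event of events) {
    if (snapshot && event.version <= snapshot.version) continue; // already in the snapshot
    total += event.delta;
    replayed++;
  }
  return { total, replayed };
}

// Decide whether to write a new snapshot after appending an event.
function maybeSnapshot(lastVersion: number, total: number): Snapshot | undefined {
  return lastVersion % SNAPSHOT_EVERY === 0 ? { version: lastVersion, total } : undefined;
}

const stream: CounterEvent[] = Array.from({ length: 250 }, (_, i) => ({ version: i + 1, delta: 1 }));
const snapshotAt200 = maybeSnapshot(200, 200);

const fromScratch = loadTotal(stream);                 // replays 250 events
const fromSnapshot = loadTotal(stream, snapshotAt200); // replays 50 events
```

Both paths arrive at the same total of 250; the snapshot only changes how many events have to be replayed to get there.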

Performance Optimization

Optimizing an event-driven system involves tuning at multiple levels: event publishing throughput, broker configuration, consumer parallelism, and read model materialization.

// Batch event publishing for high throughput
class BatchEventPublisher {
  private buffer: DomainEvent[] = [];
  private flushInterval: NodeJS.Timeout | null = null;
 
  constructor(
    private store: EventStore,
    private batchSize: number = 100,
    private intervalMs: number = 1000
  ) {}
 
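  publish(event: DomainEvent): void {
    this.buffer.push(event);
    if (this.buffer.length >= this.batchSize) {
      this.flushInBackground();
    } else if (!this.flushInterval) {
      this.flushInterval = setTimeout(() => this.flushInBackground(), this.intervalMs);
    }
  }
 
  // flush() is async, so report failures instead of leaving an unhandled rejection
  private flushInBackground(): void {
    this.flush().catch(error => console.error('Batch flush failed:', error));
  }
 
  private async flush(): Promise<void> {
    if (this.flushInterval) {
      clearTimeout(this.flushInterval);
      this.flushInterval = null;
    }
    const batch = this.buffer.splice(0);
    // Append sequentially so events for the same aggregate reach the store in
    // version order, even when append is truly asynchronous
    for (const event of batch) {
      await this.store.append(event);
    }
  }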
}

Key performance strategies include partitioning events by aggregate ID for parallel processing, using compacted topics for read model snapshots, implementing backpressure mechanisms to prevent consumer overload, and batching database writes for projection updates.
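Partitioning by aggregate ID boils down to a stable hash of the key. The sketch below uses an FNV-1a style hash over UTF-16 code units and four partitions; both are illustrative choices rather than what any particular broker does internally.

```typescript
// Stable hash-based partition assignment. The FNV-1a style hash and the
// partition count are illustrative choices, not any specific broker's algorithm.
function fnv1a(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep the value an unsigned 32-bit integer
  }
  return hash;
}

function partitionFor(aggregateId: string, partitionCount: number): number {
  return fnv1a(aggregateId) % partitionCount;
}

// Group events so that each partition can be drained by its own worker.
// Events of one aggregate always land in the same partition, in arrival order.
function groupByPartition<T extends { aggregateId: string }>(
  events: T[],
  partitionCount: number,
): T[][] {
  const partitions: T[][] = Array.from({ length: partitionCount }, () => []);
  for (const event of events) {
    partitions[partitionFor(event.aggregateId, partitionCount)].push(event);
  }
  return partitions;
}

const incoming = [
  { aggregateId: "order-1", seq: 1 },
  { aggregateId: "order-2", seq: 1 },
  { aggregateId: "order-1", seq: 2 },
  { aggregateId: "order-3", seq: 1 },
  { aggregateId: "order-1", seq: 3 },
];
const partitioned = groupByPartition(incoming, 4);
```

Ordering is preserved per aggregate, while different aggregates can be processed in parallel by different workers.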

Comparison with Alternatives

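| Feature | Event-Driven | Request-Response | Shared Database |
| --- | --- | --- | --- |
| Coupling | Loose | Tight | Very Tight |
| Scalability | High | Medium | Low |
| Resilience | High | Low | Medium |
| Consistency | Eventual | Strong | Strong |
| Complexity | High | Low | Medium |
| Audit Trail | Built-in | Manual | Manual |
| Debugging | Harder | Easier | Easier |
| Team Independence | High | Low | Low |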

Event-driven architecture is not always the right choice. For simple CRUD applications with low complexity, request-response patterns are simpler and sufficient. EDA shines when you need loose coupling, scalability, resilience, and a complete audit trail. The additional complexity is justified when multiple services need to react to the same events or when the system must handle high throughput with fault tolerance.

Advanced Patterns

Event Upcasting

When event schemas evolve, old events stored in the event store may have outdated structures. Event upcasting transforms old event versions into the current version during replay. This allows you to evolve your domain model without migrating historical events.

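// metadata.version is the aggregate's sequence number, not the schema version,
// so the schema version is carried separately. schemaVersion is an assumed
// field for this sketch and defaults to 1 when absent.
type VersionedEvent = DomainEvent & { schemaVersion?: number };
 
class EventUpcaster {
  private upcasters: Map<string, (data: Record<string, unknown>) => Record<string, unknown>> = new Map();
 
  register(eventType: string, fromVersion: number, transform: (data: Record<string, unknown>) => Record<string, unknown>): void {
    this.upcasters.set(`${eventType}_v${fromVersion}`, transform);
  }
 
  upcast(event: VersionedEvent): VersionedEvent {
    let current = event;
    // Chain upcasters (v1 -> v2 -> v3 ...) until none is registered for the current version
    for (;;) {
      const version = current.schemaVersion ?? 1;
      const upcaster = this.upcasters.get(`${current.type}_v${version}`);
      if (!upcaster) return current;
      current = { ...current, data: upcaster(current.data), schemaVersion: version + 1 };
    }
  }
}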

Event Replay and Projections Rebuilding

One of the most powerful capabilities of event sourcing is the ability to rebuild projections from scratch by replaying all events. This is invaluable when you need to fix a bug in your projection logic or add a new read model for existing data.
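A rebuild is little more than a loop over the log with an empty view as the starting point. In the sketch below, `StoredEvent`, `rebuildProjection`, and both projectors are hypothetical and deliberately simpler than the `OrderProjection` class shown earlier.

```typescript
// Rebuilding read models by replaying the full log through a projector function.
// StoredEvent, OrderSummary, and rebuildProjection are hypothetical names.
interface StoredEvent {
  aggregateId: string;
  type: "OrderCreated" | "OrderConfirmed" | "OrderCancelled";
  totalAmount?: number;
}

interface OrderSummary {
  status: string;
  totalAmount: number;
}

type Projector<V> = (view: Map<string, V>, event: StoredEvent) => void;

// Always start from an empty view so a corrected projector yields a clean result.
function rebuildProjection<V>(log: readonly StoredEvent[], project: Projector<V>): Map<string, V> {
  const view = new Map<string, V>();
  for (const event of log) project(view, event);
  return view;
}

const projectOrderSummary: Projector<OrderSummary> = (view, event) => {
  if (event.type === "OrderCreated") {
    view.set(event.aggregateId, { status: "created", totalAmount: event.totalAmount ?? 0 });
    return;
  }
  const existing = view.get(event.aggregateId);
  if (!existing) return; // event for an unknown order: ignored in this sketch
  existing.status = event.type === "OrderConfirmed" ? "confirmed" : "cancelled";
};

// A read model added later, built from the same historical events.
const projectCancellations: Projector<boolean> = (view, event) => {
  if (event.type === "OrderCancelled") view.set(event.aggregateId, true);
};

const history: StoredEvent[] = [
  { aggregateId: "o1", type: "OrderCreated", totalAmount: 50 },
  { aggregateId: "o1", type: "OrderConfirmed" },
  { aggregateId: "o2", type: "OrderCreated", totalAmount: 20 },
  { aggregateId: "o2", type: "OrderCancelled" },
];

const summaries = rebuildProjection(history, projectOrderSummary);
const cancellations = rebuildProjection(history, projectCancellations);
```

Because the log is the source of truth, the second read model needs no data migration: it is derived from events that were recorded before it existed.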

Testing Strategies

Testing event-driven systems requires a different approach than testing traditional architectures. Focus on testing that aggregates produce the correct events for given commands, and that event handlers produce the correct side effects.

// Testing event-sourced aggregates
describe('Order Aggregate', () => {
  it('should emit OrderCreated when creating an order', () => {
    const order = Order.create('order-1', [
      { productId: 'prod-1', quantity: 2, price: 25.00 },
    ]);
 
    const events = order.getUncommittedEvents();
    expect(events).toHaveLength(1);
    expect(events[0].type).toBe('OrderCreated');
    expect(events[0].data.totalAmount).toBe(50.00);
  });
 
  it('should replay events correctly', () => {
    const events: DomainEvent[] = [
      { eventId: '1', aggregateId: 'order-1', type: 'OrderCreated', data: { items: [], totalAmount: 0 }, metadata: { correlationId: 'c1', timestamp: new Date(), version: 1 } },
      { eventId: '2', aggregateId: 'order-1', type: 'OrderConfirmed', data: { confirmedAt: new Date().toISOString() }, metadata: { correlationId: 'c1', timestamp: new Date(), version: 2 } },
    ];
 
    const order = new Order();
    order.loadFromHistory(events);
 
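    // The replayed order is 'confirmed', so confirming it again must be rejected
    expect(order.getVersion()).toBe(2);
    expect(() => order.confirm()).toThrow('Cannot confirm order in confirmed status');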
  });
});

Future Outlook

Event-driven architecture continues to evolve with the adoption of serverless event processing, event mesh topologies, and standardized event formats like CloudEvents. The CloudEvents specification provides a vendor-neutral format for event data, improving interoperability between different systems and platforms.
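For a sense of what the format looks like, here is a sketch that wraps a domain event in a CloudEvents 1.0 JSON envelope. `id`, `source`, `specversion`, and `type` are the required context attributes in the specification; the `toCloudEvent` helper, the `/services/orders` source, and the `com.example.orders` type prefix are made up for this example.

```typescript
// Wrapping a domain event in a CloudEvents 1.0 envelope (JSON format).
// toCloudEvent, the source URI, and the type prefix are hypothetical examples.
interface CloudEvent<T> {
  specversion: "1.0";
  id: string;
  source: string;
  type: string;
  time?: string; // RFC 3339 timestamp
  subject?: string;
  datacontenttype?: string;
  data?: T;
}

interface DomainEventInput<T> {
  eventId: string;
  aggregateId: string;
  type: string;
  occurredAt: Date;
  data: T;
}

function toCloudEvent<T>(input: DomainEventInput<T>): CloudEvent<T> {
  return {
    specversion: "1.0",
    id: input.eventId,
    source: "/services/orders", // hypothetical producer identifier
    type: `com.example.orders.${input.type}`,
    time: input.occurredAt.toISOString(),
    subject: input.aggregateId,
    datacontenttype: "application/json",
    data: input.data,
  };
}

const envelope = toCloudEvent({
  eventId: "evt-1",
  aggregateId: "order-1",
  type: "OrderCreated",
  occurredAt: new Date("2024-01-01T00:00:00.000Z"),
  data: { totalAmount: 50 },
});

console.log(JSON.stringify(envelope, null, 2));
```

Consumers that understand the envelope can route on `type` and `source` without parsing the payload, which is where most of the interoperability benefit comes from.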

The integration of event streaming with machine learning is an exciting frontier. Real-time event streams can feed online learning models, trigger inference pipelines, and enable adaptive systems that respond to patterns in the event stream. As edge computing grows, event-driven patterns will extend to processing events at the network edge before they reach central systems.

Conclusion

Event-driven architecture provides a powerful foundation for building scalable, resilient, and loosely coupled systems. The patterns we explored—event sourcing, CQRS, sagas, and the transactional outbox—address real challenges in distributed systems and have been proven at scale by companies like Netflix, Uber, and LinkedIn.

Key takeaways: (1) Event sourcing provides a complete audit trail and enables temporal queries; (2) CQRS separates read and write concerns for independent optimization; (3) Saga patterns coordinate distributed transactions without tight coupling; (4) The transactional outbox ensures atomicity between state changes and event publishing; (5) Idempotent consumers and dead letter queues are essential for production reliability.

Start small by introducing events for cross-service communication, then progressively adopt event sourcing and CQRS as your system complexity demands it. The investment in event-driven patterns pays dividends in system resilience, team autonomy, and the ability to evolve your architecture over time.