Temporal.io: Durable Workflow Orchestration

Build durable workflows: activities, retries, sagas, and workflow-as-code patterns.

Tags: Temporal, Workflows, Orchestration, Backend

By MinhVo

Introduction

Modern distributed systems face a fundamental challenge: coordinating long-running business processes across multiple services while handling failures gracefully. A payment flow might involve checking inventory, charging a credit card, updating order status, sending a confirmation email, and notifying a warehouse—each step calling a different service. If the email service goes down after the payment succeeds, what happens? If the warehouse notification fails, do you refund the payment? These questions lead developers into the treacherous world of distributed transactions, compensating actions, and idempotency concerns.

Temporal.io solves this problem by providing durable workflow execution. Instead of orchestrating services with fragile message queues and hand-rolled retry logic, you write workflows as regular code: functions that call other functions. Temporal ensures that these workflows execute to completion, even if servers crash, networks partition, or processes restart. If a workflow is interrupted at step three of five, Temporal resumes it from step three when the system recovers. No workflow state is lost and no step is skipped. Completed steps are not re-run, although a step that was in flight when the failure happened may be retried, which is why activities should be idempotent.

The core innovation of Temporal is "workflow as code." Unlike traditional workflow engines that use XML, YAML, or visual designers to define workflows, Temporal lets you write workflows in Go, Java, TypeScript, Python, or PHP. This means you get the full power of your programming language—loops, conditionals, error handling, type checking, and testing frameworks—combined with the durability guarantees of a workflow engine.

This guide covers everything from basic workflow concepts to advanced patterns like sagas, child workflows, and production deployment. We will explore the architecture that makes Temporal unique, walk through real-world implementations, and discuss the trade-offs that come with durable execution.

[Figure: Temporal.io architecture]

Understanding Temporal: Core Concepts

Workflows and Activities

Temporal separates two types of code: workflows and activities. Workflows are deterministic functions that define the business logic—the sequence of steps, the branching conditions, and the error handling. Activities are non-deterministic functions that perform side effects—calling APIs, reading databases, sending emails, or accessing external systems.

This separation is critical. Temporal replays workflow code to recover state after failures. If workflow code is non-deterministic (e.g., it calls a random number generator or checks the current time), replay produces different results, breaking the system. By moving all side effects into activities, workflows remain deterministic while activities can do anything.

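// workflows.ts - Deterministic workflow code
import { proxyActivities, condition, defineSignal, setHandler } from '@temporalio/workflow';
import type * as activities from './activities';

export type OrderResult =
  | { status: 'completed'; orderId: string }
  | { status: 'failed'; reason: string };

// getOrder and refundPayment are assumed to be exported from ./activities
// alongside the activities shown below (their bodies are not shown here).
const { getOrder, chargePayment, refundPayment, sendEmail, updateInventory, notifyWarehouse } =
  proxyActivities<typeof activities>({
    startToCloseTimeout: '5 minutes',
    retry: {
      maximumAttempts: 3,
      initialInterval: '1 second',
      backoffCoefficient: 2,
    },
  });

// Sent by a reviewer once the fraud check passes
export const approveFraudCheck = defineSignal('approveFraudCheck');

export async function processOrder(orderId: string): Promise<OrderResult> {
  let fraudCheckComplete = false;
  setHandler(approveFraudCheck, () => { fraudCheckComplete = true; });

  // Reading the order is a side effect, so it runs as an activity
  const order = await getOrder(orderId);

  // Step 1: Check and reserve inventory
  const reserved = await updateInventory(order.items, 'reserve');
  if (!reserved) {
    return { status: 'failed', reason: 'out_of_stock' };
  }

  // Set once the charge succeeds, so the catch block knows whether to refund
  let paymentId: string | undefined;
  try {
    // Step 2: Charge payment
    const payment = await chargePayment(order.customerId, order.total);
    paymentId = payment.id;

    // Step 3: Wait for fraud check (async human review).
    // condition() resolves to false if the timeout elapses first.
    const approved = await condition(() => fraudCheckComplete, '24 hours');

    if (!approved) {
      // Compensate: refund payment and release inventory
      await refundPayment(paymentId);
      paymentId = undefined; // already refunded
      await updateInventory(order.items, 'release');
      return { status: 'failed', reason: 'fraud_rejected' };
    }

    // Step 4: Send confirmation email
    await sendEmail(order.customerId, 'order_confirmed', { orderId });

    // Step 5: Notify warehouse for shipping
    await notifyWarehouse(order);

    return { status: 'completed', orderId };
  } catch (error) {
    // Compensate on any failure: refund if charged, then release inventory
    if (paymentId) {
      await refundPayment(paymentId);
    }
    await updateInventory(order.items, 'release');
    throw error;
  }
}

// activities.ts - Non-deterministic side effects
import { Context } from '@temporalio/activity';
// emailService and inventoryService stand in for application clients;
// './services' is a hypothetical module.
import { emailService, inventoryService } from './services';

export interface Payment {
  id: string;
}

export interface OrderItem {
  productId: string;
  quantity: number;
}

export async function chargePayment(
  customerId: string,
  amount: number
): Promise<Payment> {
  // Workflow ID plus activity ID stays the same across retries of this
  // activity, so a retry cannot create a second charge.
  const { workflowExecution, activityId } = Context.current().info;

  const response = await fetch('https://api.stripe.com/v1/charges', {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${process.env.STRIPE_KEY}`,
      'Content-Type': 'application/x-www-form-urlencoded',
      'Idempotency-Key': `${workflowExecution.workflowId}-${activityId}`,
    },
    // Stripe expects a form-encoded body and an integer amount in the
    // smallest currency unit.
    body: new URLSearchParams({
      customer: customerId,
      amount: String(Math.round(amount * 100)),
      currency: 'usd', // assumption: the original example omits the currency
    }),
  });

  if (!response.ok) {
    throw new Error(`Payment failed: ${response.statusText}`);
  }

  return response.json();
}

export async function sendEmail(
  userId: string,
  template: string,
  data: Record<string, unknown>
): Promise<void> {
  await emailService.send({ userId, template, data });
}

export async function updateInventory(
  items: OrderItem[],
  action: 'reserve' | 'release'
): Promise<boolean> {
  for (const item of items) {
    await inventoryService.update(item.productId, action, item.quantity);
  }
  return true;
}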

Workflow Execution Model

Temporal workflows execute through a replay-based model. When a workflow starts, Temporal records every decision (activity call, timer, signal) in an event history. If the workflow worker crashes, a new worker picks up the workflow and replays the event history to reconstruct the workflow state. The workflow code runs again, but instead of executing activities, it reads the results from the history.

This means workflow code must be deterministic: the same input must produce the same decisions when replayed. Non-deterministic operations (random numbers, current time, UUIDs) must be performed through Temporal APIs that record the result in the history.
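
To make the replay model concrete, here is a toy simulation in plain TypeScript. It is not Temporal's implementation and uses no Temporal APIs; `runWithHistory`, `HistoryEvent`, and `orderFlow` are hypothetical names chosen for this sketch. It shows only the core idea: each activity call is matched against the recorded history by position, recorded results are returned without re-running the side effect, and a mismatch is reported as non-determinism.

```typescript
// Toy model of replay; not Temporal's real implementation.
type HistoryEvent = { activity: string; result: unknown };
type ActivityFn = () => unknown;

// The "workflow" must route every side effect through callActivity.
type Workflow = (callActivity: (name: string, fn: ActivityFn) => unknown) => unknown;

function runWithHistory(workflow: Workflow, recorded: HistoryEvent[]): unknown {
  let position = 0; // index of the next history event to match

  const callActivity = (name: string, fn: ActivityFn): unknown => {
    if (position < recorded.length) {
      // Replaying: the recorded event must match what the code asks for now.
      const event = recorded[position];
      if (event.activity !== name) {
        throw new Error(`Non-deterministic replay: expected ${event.activity}, got ${name}`);
      }
      position++;
      return event.result; // reuse the stored result, skip the side effect
    }
    // First execution of this step: run it and record the outcome.
    const result = fn();
    recorded.push({ activity: name, result });
    position++;
    return result;
  };

  return workflow(callActivity);
}

// Example: the first run fails at the email step, the second run resumes there.
let chargeCount = 0;
let emailServiceUp = false;

const orderFlow: Workflow = (call) => {
  call('reserveInventory', () => true);
  const paymentId = call('chargePayment', () => {
    chargeCount++;
    return 'pay_1';
  });
  call('sendEmail', () => {
    if (!emailServiceUp) throw new Error('email service down');
    return 'sent';
  });
  return paymentId;
};

const eventHistory: HistoryEvent[] = [];
try {
  runWithHistory(orderFlow, eventHistory); // throws at sendEmail; two events recorded
} catch {
  emailServiceUp = true; // the dependency recovers
}
// Replays the two recorded steps, then executes sendEmail for real.
const replayResult = runWithHistory(orderFlow, eventHistory);
```

After the second run, `chargeCount` is still 1: the charge was read back from the history rather than executed again. A workflow whose first call used a different activity name would hit the mismatch branch, which is the toy equivalent of a replay failure caused by non-deterministic code.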

[Figure: Workflow execution model]

Architecture and Design Patterns

Worker Setup

Workers are processes that execute workflows and activities. They poll Temporal for tasks and execute the corresponding code:

// worker.ts
import { Worker } from '@temporalio/worker';
import * as activities from './activities';
 
async function run() {
  const worker = await Worker.create({
    workflowsPath: require.resolve('./workflows'),
    activities,
    taskQueue: 'order-processing',
    maxConcurrentWorkflowTaskExecutions: 100,
    maxConcurrentActivityTaskExecutions: 50,
  });
  
  await worker.run();
}
 
run().catch((err) => {
  console.error('Worker failed:', err);
  process.exit(1);
});

Starting Workflows

Clients start workflows and interact with running workflows:

// client.ts
import { Connection, Client } from '@temporalio/client';
import { processOrder } from './workflows';

// Build one client and pass it to the helpers below
async function createClient(): Promise<Client> {
  const connection = await Connection.connect(); // localhost:7233 by default
  return new Client({ connection });
}

async function startOrderProcessing(client: Client, orderId: string) {
  const handle = await client.workflow.start(processOrder, {
    args: [orderId],
    taskQueue: 'order-processing',
    workflowId: `order-${orderId}`,
    // Workflow runs for up to 30 days
    workflowExecutionTimeout: '30 days',
  });

  console.log(`Started workflow: ${handle.workflowId}`);

  // Wait for the result
  const result = await handle.result();
  console.log('Order result:', result);

  return result;
}

// Query a running workflow (assumes it registers a 'getStatus' query handler)
async function getOrderStatus(client: Client, orderId: string) {
  const handle = client.workflow.getHandle(`order-${orderId}`);
  return handle.query('getStatus');
}

// Signal a running workflow
async function approveOrder(client: Client, orderId: string) {
  const handle = client.workflow.getHandle(`order-${orderId}`);
  await handle.signal('approveFraudCheck');
}

Saga Pattern

The saga pattern handles distributed transactions by defining compensating actions for each step:

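// workflows/saga.ts
import { proxyActivities } from '@temporalio/workflow';
// '../activities/transfer' is an assumed module exporting the four activities
import type * as activities from '../activities/transfer';
// Saga is an application-level helper assumed to live in ./sagaHelper;
// it is not imported from the Temporal TypeScript SDK.
import { Saga } from './sagaHelper';

const { debitAccount, creditAccount, recordTransfer, deleteTransfer } =
  proxyActivities<typeof activities>({ startToCloseTimeout: '1 minute' });

type TransferResult =
  | { status: 'completed'; transferId: string }
  | { status: 'failed'; reason: string };

export async function transferMoney(
  fromAccount: string,
  toAccount: string,
  amount: number
): Promise<TransferResult> {
  const saga = new Saga();

  try {
    // Step 1: Debit source account
    const debitTx = await debitAccount(fromAccount, amount);
    saga.addCompensation(() => creditAccount(fromAccount, amount, debitTx.id));

    // Step 2: Credit destination account
    const creditTx = await creditAccount(toAccount, amount);
    saga.addCompensation(() => debitAccount(toAccount, amount, creditTx.id));

    // Step 3: Record the transfer
    const record = await recordTransfer(fromAccount, toAccount, amount);
    saga.addCompensation(() => deleteTransfer(record.id));

    return { status: 'completed', transferId: record.id };
  } catch (error) {
    // Execute all compensations in reverse order
    await saga.compensate();
    // The caught value is typed unknown, so narrow it before reading message
    const reason = error instanceof Error ? error.message : String(error);
    return { status: 'failed', reason };
  }
}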
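
The `Saga` object used above is not something I can point to in the TypeScript SDK (the Java SDK ships a class with that name), so the example presumably relies on a small helper of its own. A minimal sketch of such a helper, under that assumption, might look like this. The method names `addCompensation` and `compensate` mirror the calls in the example; the error-collecting behavior is my own design choice, not something the original specifies.

```typescript
// Minimal saga helper (illustrative; not part of @temporalio/workflow).
type Compensation = () => Promise<unknown>;

class Saga {
  private readonly compensations: Compensation[] = [];

  // Register an undo action for a step that has already succeeded.
  addCompensation(compensation: Compensation): void {
    this.compensations.push(compensation);
  }

  // Run undo actions newest-first. A failing compensation does not stop the
  // remaining ones; failures are collected and returned to the caller.
  async compensate(): Promise<unknown[]> {
    const errors: unknown[] = [];
    for (const compensation of [...this.compensations].reverse()) {
      try {
        await compensation();
      } catch (err) {
        errors.push(err);
      }
    }
    this.compensations.length = 0; // a second call must not undo twice
    return errors;
  }
}
```

The helper only stores closures and awaits them in a fixed order, so it stays deterministic when used inside workflow code, provided the compensations themselves are activity calls. Returning the collected errors lets the workflow decide whether a failed compensation should escalate to manual review.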

Step-by-Step Implementation

Setting Up a Temporal Project

# Install dependencies
npm init -y
npm install @temporalio/client @temporalio/worker @temporalio/workflow @temporalio/activity
npm install --save-dev typescript @types/node
 
# Initialize TypeScript
npx tsc --init

Defining a Complete Workflow

Build a user onboarding workflow that handles multi-step processes with retries:

// workflows/onboarding.ts
import {
  proxyActivities,
  condition,
  setHandler,
  defineQuery,
  defineSignal,
  log,
} from '@temporalio/workflow';
import type * as activities from '../activities/onboarding';

const {
  createUserAccount,
  sendVerificationEmail,
  setupDefaultWorkspace,
  assignDefaultRole,
  sendWelcomeEmail,
  notifyAdmin,
} = proxyActivities<typeof activities>({
  startToCloseTimeout: '10 minutes',
  retry: {
    maximumAttempts: 3,
    initialInterval: '1 second',
    backoffCoefficient: 2,
  },
});

interface OnboardingStatus {
  step: string;
  completed: string[];
  failed?: string;
}

type OnboardingResult =
  | { status: 'completed'; userId: string; workspaceId: string }
  | { status: 'cancelled' | 'timed_out' };

const statusQuery = defineQuery<OnboardingStatus>('getStatus');
const cancelSignal = defineSignal('cancel');
// Sent by the application when the user clicks the verification link
const emailVerifiedSignal = defineSignal('emailVerified');

export async function onboardUser(
  userId: string,
  email: string,
  plan: string
): Promise<OnboardingResult> {
  const status: OnboardingStatus = {
    step: 'starting',
    completed: [],
  };

  // Expose query handler for external status checks
  setHandler(statusQuery, () => status);

  // Expose signal handler for cancellation
  let cancelled = false;
  setHandler(cancelSignal, () => { cancelled = true; });

  // Flipped by the emailVerified signal; read by condition() in step 3
  let emailVerified = false;
  setHandler(emailVerifiedSignal, () => { emailVerified = true; });
  
  // Step 1: Create user account
  status.step = 'creating_account';
  const account = await createUserAccount(userId, email, plan);
  status.completed.push('account');
  
  if (cancelled) {
    log.info('Onboarding cancelled after account creation');
    return { status: 'cancelled' };
  }
  
  // Step 2: Send verification email
  status.step = 'sending_verification';
  await sendVerificationEmail(userId, email);
  status.completed.push('verification');
  
  // Step 3: Wait for email verification (up to 7 days)
  status.step = 'waiting_verification';
  const verified = await condition(
    () => emailVerified,
    '7 days',
  );
  
  if (!verified) {
    status.step = 'timed_out';
    log.warn('Email verification timed out', { userId });
    return { status: 'timed_out' };
  }
  
  // Step 4: Set up workspace
  status.step = 'setting_up_workspace';
  const workspace = await setupDefaultWorkspace(userId, plan);
  status.completed.push('workspace');
  
  // Step 5: Assign role
  status.step = 'assigning_role';
  await assignDefaultRole(userId, plan);
  status.completed.push('role');
  
  // Step 6: Send welcome email
  status.step = 'sending_welcome';
  await sendWelcomeEmail(userId, workspace.id);
  status.completed.push('welcome');
  
  // Step 7: Notify admin
  status.step = 'notifying_admin';
  await notifyAdmin(userId, email, plan);
  status.completed.push('admin_notification');
  
  status.step = 'completed';
  log.info('User onboarding completed', { userId });
  
  return { 
    status: 'completed', 
    userId, 
    workspaceId: workspace.id 
  };
}

Child Workflows

Break complex workflows into reusable child workflows:

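// workflows/parent.ts
import { executeChild, proxyActivities } from '@temporalio/workflow';
// '../activities/batch' is an assumed module exporting this notifyAdmin variant
import type * as activities from '../activities/batch';
import { processOrder } from './order'; // assumed location of processOrder

const { notifyAdmin } = proxyActivities<typeof activities>({
  startToCloseTimeout: '1 minute',
});

export async function processBatchOrders(orderIds: string[]) {
  // Process orders in parallel using child workflows
  const results = await Promise.allSettled(
    orderIds.map((orderId) =>
      executeChild(processOrder, {
        args: [orderId],
        taskQueue: 'order-processing',
        workflowId: `order-${orderId}`,
      })
    )
  );

  // The type guard lets TypeScript see the reason field on rejected results
  const failed = results.filter(
    (r): r is PromiseRejectedResult => r.status === 'rejected'
  );
  const succeededCount = results.length - failed.length;

  if (failed.length > 0) {
    // Handle partial failure
    await notifyAdmin({
      type: 'batch_partial_failure',
      total: orderIds.length,
      failed: failed.length,
      // Activity arguments must be serializable, so pass messages, not Error objects
      errors: failed.map((f) => String(f.reason)),
    });
  }

  return {
    total: orderIds.length,
    succeeded: succeededCount,
    failed: failed.length,
  };
}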

[Figure: Temporal implementation]

Real-World Use Cases and Case Studies

Use Case 1: Payment Processing

Payment processing workflows handle the complex lifecycle of a payment: authorization, capture, settlement, and refund. Temporal records each completed step in the workflow history so it is not repeated after a crash, and keeps retrying a step while the payment gateway is temporarily unavailable. Because a retried activity can run more than once, payment calls should carry idempotency keys. The workflow retries failed steps with exponential backoff and escalates to manual review if retries are exhausted.
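
As a worked example of exponential backoff, the helper below computes the wait before each retry from an initial interval, a backoff coefficient, and a cap. The formula (initial interval times coefficient to the power of the retry index, capped at a maximum) reflects my understanding of how Temporal's retry policy behaves; `retryDelaysMs` and `BackoffPolicy` are hypothetical names, not SDK exports, and the 100-second cap is an assumed value for illustration.

```typescript
// Hypothetical helper: delays in milliseconds before each retry.
interface BackoffPolicy {
  initialIntervalMs: number;
  backoffCoefficient: number;
  maximumIntervalMs: number;
  maximumAttempts: number; // total attempts, including the first one
}

function retryDelaysMs(policy: BackoffPolicy): number[] {
  const delays: number[] = [];
  // maximumAttempts of N allows N - 1 retries after the first attempt.
  for (let retry = 1; retry < policy.maximumAttempts; retry++) {
    const raw = policy.initialIntervalMs * Math.pow(policy.backoffCoefficient, retry - 1);
    delays.push(Math.min(raw, policy.maximumIntervalMs));
  }
  return delays;
}

// The policy used in this article's examples: 3 attempts, 1 second, coefficient 2.
const articleDelays = retryDelaysMs({
  initialIntervalMs: 1000,
  backoffCoefficient: 2,
  maximumIntervalMs: 100000, // assumed cap
  maximumAttempts: 3,
});
// articleDelays is [1000, 2000]
```

With three attempts the activity gives up about three seconds after the first failure, which is short for a payment gateway outage. Raising `maximumAttempts` or the initial interval is the usual lever when a dependency is expected to be down for minutes rather than seconds.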

Use Case 2: Order Fulfillment

E-commerce order fulfillment involves multiple services: inventory, payment, shipping, and notifications. Temporal coordinates these services as a single workflow, handling partial failures with compensating actions. If the shipping service fails after payment succeeds, the workflow retries shipping before considering a refund.

Use Case 3: Data Pipeline Orchestration

ETL pipelines with multiple stages benefit from Temporal's durability. Each stage (extract, transform, load) is an activity with retry logic. If a stage fails, Temporal retries from that stage, not from the beginning. This is more efficient than restarting the entire pipeline.

Use Case 4: User Onboarding

Multi-step onboarding flows (account creation, email verification, workspace setup, role assignment) are natural workflows. Temporal handles the asynchronous nature of email verification (waiting hours or days for the user to click a link) while maintaining the overall flow state.

Best Practices for Production

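  1. Keep workflows deterministic: Workflow code must not depend on wall-clock time, randomness, or I/O that can differ between replays. The TypeScript SDK runs workflows in a sandbox where Date.now() and Math.random() are replaced with deterministic versions, and @temporalio/workflow exports uuid4() for IDs; other SDKs require explicit workflow APIs instead. Third-party libraries that read the system clock or a crypto random source still belong in activities. Use Temporal's sleep() for time-based logic.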

  2. Set appropriate timeouts: Configure startToCloseTimeout for activities based on expected execution time. Use scheduleToCloseTimeout to limit the total time including retries and scheduling delays.

  3. Use idempotent activities: Activities should be idempotent because Temporal may retry them. Use idempotency keys or database constraints to prevent duplicate side effects.

  4. Version your workflows: When you change workflow logic, use Temporal's patching API to maintain backward compatibility with running workflows. This prevents replay failures.

  5. Monitor workflow execution: Use Temporal's Web UI or Prometheus metrics to track workflow execution times, failure rates, and activity latencies. Set alerts for workflows that run longer than expected.

  6. Use task queues for isolation: Separate different types of workflows onto different task queues. This prevents a surge of one workflow type from starving workers of another type.

  7. Test workflows thoroughly: Use Temporal's test framework to run workflows in a test environment. Test failure scenarios by mocking activities to throw errors.

  8. Limit workflow history size: Long-running workflows with many activities can accumulate large histories. Use continueAsNew to start a new workflow execution with a fresh history.
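
Practice 3 is easiest to see with a small model. The sketch below uses an in-memory `Map` as a stand-in for a database table with a unique constraint on the idempotency key; `ChargeLedger` and its methods are hypothetical names for this illustration. A retried call with the same key returns the stored record instead of charging again.

```typescript
// In-memory stand-in for a table with a unique idempotency-key column.
interface ChargeRecord {
  chargeId: string;
  amountCents: number;
}

class ChargeLedger {
  private readonly byKey = new Map<string, ChargeRecord>();
  private nextId = 1;

  // Returns the existing record when the key was already used, so a retry
  // after a timeout or worker crash has no additional side effect.
  charge(idempotencyKey: string, amountCents: number): ChargeRecord {
    const existing = this.byKey.get(idempotencyKey);
    if (existing) {
      return existing;
    }
    const record: ChargeRecord = { chargeId: `ch_${this.nextId++}`, amountCents };
    this.byKey.set(idempotencyKey, record);
    return record;
  }

  // Sum of all distinct charges, used to check that retries did not add money.
  totalChargedCents(): number {
    let total = 0;
    this.byKey.forEach((record) => {
      total += record.amountCents;
    });
    return total;
  }
}
```

Inside an activity, a natural key is the workflow ID combined with the activity ID, since both stay the same across retries of that activity.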

Common Pitfalls and Solutions

| Pitfall | Impact | Solution |
| --- | --- | --- |
| Non-deterministic code in workflows | Replay failures, data corruption | Use Temporal APIs for time/random, move side effects to activities |
| Missing activity timeouts | Workflows stuck forever | Always set startToCloseTimeout |
| Non-idempotent activities | Duplicate side effects on retry | Use idempotency keys or database constraints |
| Not versioning workflow changes | Replay failures for running workflows | Use Temporal's patching API |
| Overly large workflow histories | Performance degradation | Use continueAsNew for long-running workflows |
| Wrong task queue configuration | Workflows not picked up by workers | Ensure worker and workflow use the same task queue |
| Not handling activity failures | Unhandled exceptions crash workflows | Wrap activity calls in try/catch with compensation |

Performance Optimization

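import { continueAsNew, executeChild, proxyActivities, sleep } from '@temporalio/workflow';
// './activities' is assumed to export checkHealth and alertOnCall
import type * as activities from './activities';
import { processOrder } from './workflows';

const { checkHealth, alertOnCall } = proxyActivities<typeof activities>({
  startToCloseTimeout: '1 minute',
});

// Use batching for bulk operations
export async function processBulkOrders(orderIds: string[]) {
  // Process in batches of 10 to avoid overwhelming downstream services
  const batchSize = 10;
  const results: OrderResult[] = [];

  for (let i = 0; i < orderIds.length; i += batchSize) {
    const batch = orderIds.slice(i, i + batchSize);
    const batchResults = await Promise.all(
      // Calling processOrder() directly would run it inline in this workflow;
      // executeChild starts a separate child workflow execution per order.
      batch.map((id) =>
        executeChild(processOrder, { args: [id], workflowId: `order-${id}` })
      )
    );
    results.push(...batchResults);
  }

  return results;
}

// Use continueAsNew for long-running workflows
export async function monitoringWorkflow(serviceId: string): Promise<void> {
  for (let i = 0; i < 1000; i++) {
    const health = await checkHealth(serviceId);
    if (health.status === 'unhealthy') {
      await alertOnCall(serviceId, health);
    }
    await sleep('5 minutes');
  }

  // Continue as new to prevent history growth
  await continueAsNew<typeof monitoringWorkflow>(serviceId);
}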

Comparison with Alternatives

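| Feature | Temporal | AWS Step Functions | Apache Airflow | Cadence | AWS SWF |
| --- | --- | --- | --- | --- | --- |
| Language | Go, Java, TS, Python, PHP | JSON/YAML | Python | Go, Java | Java |
| Workflow Definition | Code | State Machine | DAG | Code | Code |
| Durability | Full | Full | Partial | Full | Full |
| Replay-Based | Yes | No | No | Yes | Yes |
| Testing | Unit tests | Limited | Limited | Unit tests | Limited |
| Self-Hosted | Yes | No (AWS only) | Yes | Yes | No (AWS only) |
| Scalability | High | High | Medium | High | High |
| Community | Large | Large | Large | Small | Legacy |
| Versioning | Built-in | Manual | Manual | Built-in | Manual |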

Advanced Patterns and Techniques

Workflow Signals and Queries

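import { defineQuery, defineSignal, proxyActivities, setHandler } from '@temporalio/workflow';
// './activities' is assumed to export processTask
import type * as activities from './activities';

const { processTask } = proxyActivities<typeof activities>({
  startToCloseTimeout: '5 minutes',
});

// Input and UpdateData are application types (not shown)
interface Status {
  phase: string;
  progress: number;
  total: number;
  currentTask?: string;
  lastUpdate?: UpdateData;
}

// Define signals and queries
const updateSignal = defineSignal<[UpdateData]>('update');
const statusQuery = defineQuery<Status>('status');
const cancelSignal = defineSignal('cancel');

export async function longRunningWorkflow(input: Input) {
  const status: Status = { phase: 'running', progress: 0, total: input.tasks.length };
  let cancelled = false;

  setHandler(updateSignal, (data) => {
    status.lastUpdate = data;
  });

  setHandler(statusQuery, () => status);
  setHandler(cancelSignal, () => { cancelled = true; });

  for (const task of input.tasks) {
    // Signals are delivered between awaits, so check before each task
    if (cancelled) break;

    status.currentTask = task.id;
    await processTask(task);
    status.progress++;
  }

  status.phase = cancelled ? 'cancelled' : 'completed';
  return { completed: !cancelled, processed: status.progress };
}

// Client-side interaction
const handle = await client.workflow.start(longRunningWorkflow, { ... });

// Query status
const status = await handle.query(statusQuery);
console.log(`Progress: ${status.progress}/${status.total}`);

// Send signal
await handle.signal(updateSignal, { field: 'value' });

// Cancel
await handle.signal(cancelSignal);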

Activity Heartbeating for Long Activities

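// activities/longRunning.ts
import { Context } from '@temporalio/activity';

// loadDataset and processRecord are application functions (not shown).
// Heartbeats only detect a stalled activity if heartbeatTimeout is set in the
// activity options on the workflow side.
export async function processLargeDataset(datasetId: string) {
  const dataset = await loadDataset(datasetId);
  const total = dataset.records.length;

  // On a retry, resume after the last record reported in a heartbeat.
  // Records after that point may be processed again, so processRecord
  // should be idempotent.
  const last = Context.current().info.heartbeatDetails as
    | { progress: number }
    | undefined;
  const start = last ? last.progress + 1 : 0;

  for (let i = start; i < total; i++) {
    await processRecord(dataset.records[i]);

    // Send heartbeat every 100 records
    if (i % 100 === 0) {
      Context.current().heartbeat({
        progress: i,
        total,
        percentage: Math.round((i / total) * 100),
      });
    }
  }

  return { processed: total };
}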

Testing Strategies

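// tests/workflows.test.ts
import { TestWorkflowEnvironment } from '@temporalio/testing';
import { Worker } from '@temporalio/worker';
import { WorkflowFailedError } from '@temporalio/client';
import { ApplicationFailure } from '@temporalio/common';
import { processOrder } from '../workflows';

// Mocked activities shared by both tests. getOrder and refundPayment are
// included because processOrder calls them.
const baseActivities = {
  getOrder: async () => ({ customerId: 'cus_1', total: 10, items: [] }),
  chargePayment: async () => ({ id: 'pay_123' }),
  refundPayment: async () => {},
  sendEmail: async () => {},
  updateInventory: async () => true,
  notifyWarehouse: async () => {},
};

describe('processOrder', () => {
  let testEnv: TestWorkflowEnvironment;

  beforeAll(async () => {
    testEnv = await TestWorkflowEnvironment.createLocal();
  });

  afterAll(async () => {
    await testEnv.teardown();
  });

  it('processes a valid order successfully', async () => {
    const { client, nativeConnection } = testEnv;
    const worker = await Worker.create({
      connection: nativeConnection,
      taskQueue: 'test',
      workflowsPath: require.resolve('../workflows'),
      activities: baseActivities,
    });

    await worker.runUntil(async () => {
      const handle = await client.workflow.start(processOrder, {
        taskQueue: 'test',
        workflowId: 'test-order-1',
        args: ['order-123'],
      });

      // The workflow waits for fraud approval, so send the signal
      await handle.signal('approveFraudCheck');

      const result = await handle.result();
      expect(result.status).toBe('completed');
    });
  });

  it('handles payment failure with compensation', async () => {
    const { client, nativeConnection } = testEnv;
    const inventoryCalls: string[] = [];
    const worker = await Worker.create({
      connection: nativeConnection,
      taskQueue: 'test',
      workflowsPath: require.resolve('../workflows'),
      activities: {
        ...baseActivities,
        // Non-retryable so the test does not wait for the retry backoff
        chargePayment: async () => {
          throw ApplicationFailure.nonRetryable('Card declined');
        },
        updateInventory: async (_items: unknown, action: string) => {
          inventoryCalls.push(action);
          return true;
        },
      },
    });

    await worker.runUntil(async () => {
      // processOrder rethrows after compensating, so the execution fails
      await expect(
        client.workflow.execute(processOrder, {
          taskQueue: 'test',
          workflowId: 'test-order-2',
          args: ['order-456'],
        })
      ).rejects.toThrow(WorkflowFailedError);

      // Inventory was reserved, then released by the compensation
      expect(inventoryCalls).toEqual(['reserve', 'release']);
    });
  });
});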

Future Outlook

Temporal continues to grow as a leading durable workflow platform. The company has raised more than $100M in venture funding and is expanding its cloud offering with improved monitoring, debugging, and deployment tools.

The TypeScript SDK is maturing rapidly, with improved type inference, better developer ergonomics, and integration with popular frameworks like NestJS and Express. The community is building reusable workflow patterns and shared activity libraries.

The broader trend toward microservices and distributed systems creates a growing need for workflow orchestration. As applications become more distributed, the complexity of coordinating services increases, and Temporal's durable execution model becomes increasingly valuable.

Conclusion

Temporal.io transforms how you build distributed systems:

  1. Workflow as code eliminates fragile orchestration: Instead of managing state machines, message queues, and retry logic, you write regular functions. Temporal handles durability, retries, and failure recovery automatically.

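  2. The replay-based execution model provides strong guarantees: Workflow logic behaves as if it ran exactly once, even across server crashes and network partitions, because completed steps are read back from history instead of being re-executed. Activities are still at-least-once, so idempotency remains necessary at the activity level.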

  3. The activity model cleanly separates concerns: Deterministic workflow logic stays in workflows, while non-deterministic side effects live in activities. This separation makes code testable and maintainable.

  4. Built-in patterns for complex scenarios: Child workflows, signals, queries, and versioning are first-class features, and sagas take only a few lines of ordinary workflow code. You do not need to build a workflow engine to get them.

  5. The developer experience is excellent: Write workflows in your preferred language, test them with standard test frameworks, and debug them with the Temporal Web UI. The learning curve is low because the programming model is familiar.

  6. It scales from simple to complex: A basic workflow with two activities works the same as a complex workflow with hundreds of activities, child workflows, and human-in-the-loop steps. The API does not change—only the workflow code grows.

If you are building distributed systems with long-running processes, Temporal is the most robust solution available. The durability guarantees it provides eliminate an entire class of failure modes, and the programming model makes complex workflows as simple to write and maintain as regular functions.