WebAssembly (Wasm): Running Native Code in the Browser

A comprehensive guide to WebAssembly: running C, C++, and Rust code in the browser at near-native speed.

Tags: WebAssembly, Performance, Browser, Systems

By MinhVo

Introduction

WebAssembly (Wasm) is a binary instruction format designed as a portable compilation target for programming languages like C, C++, Rust, and Go. It enables code written in these languages to run in web browsers at near-native speed, unlocking use cases that were previously impossible or impractical in the browser: video editing, 3D gaming, scientific simulation, cryptography, and complex data processing. Since its standardization by the W3C in 2019, WebAssembly has become one of the most important platform technologies on the web, supported by all major browsers and used by companies like Google, Microsoft, Adobe, Figma, and Unity.

The fundamental promise of WebAssembly is performance. JavaScript is fast (modern JIT compilers like V8 can optimize hot code paths to near-native speeds), but its dynamic nature imposes inherent overhead. Runtime type checks, garbage collection pauses, and the lack of control over memory layout mean that CPU-intensive workloads in JavaScript often run 2-10x slower than equivalent native code. WebAssembly avoids much of this overhead by providing a statically typed, linear-memory execution environment whose instructions map closely to machine instructions. A WebAssembly module running in Chrome's V8 engine typically executes at approximately 80-95% of the speed of equivalent native code, depending on the workload.

Beyond performance, WebAssembly provides a secure sandbox execution model, deterministic execution semantics, and language interoperability. Code runs in a memory-safe sandbox that cannot access the host system directly. Every WebAssembly module has its own linear memory that cannot read or write outside its bounds. These properties make WebAssembly suitable for running untrusted code securely, which is why it's used in blockchain smart contracts (NEAR, Cosmos), plugin systems (Extism), and server-side edge computing (Cloudflare Workers, Fastly Compute).

Understanding WebAssembly Architecture

The Binary Format

WebAssembly defines a virtual instruction set architecture (ISA) that is designed to be fast to decode, validate, and execute. The binary format (.wasm files) is compact, often cited as 30-50% smaller than equivalent JavaScript, although real-world size depends heavily on the toolchain and on how much runtime code is bundled into the module. It uses a stack-based execution model where operations push and pop values from an operand stack.

A WebAssembly module consists of several sections, including:

  1. Type Section: Declares function signatures (parameter types and return types)
  2. Function Section: Maps function indices to type indices
  3. Memory Section: Declares linear memories with initial and optional maximum sizes
  4. Global Section: Declares mutable and immutable global variables
  5. Export Section: Declares functions, memories, tables, and globals accessible from JavaScript
  6. Import Section: Declares dependencies on JavaScript functions or other modules
  7. Code Section: Contains the actual function bodies as sequences of instructions
  8. Data Section: Contains initial memory contents (static data, strings, constants)
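
To make the section layout concrete, here is a minimal sketch that walks the section headers of a .wasm binary. The `listSections` and `readU32Leb` helpers and the hand-assembled `demoModule` bytes are illustrative assumptions, not part of any library. The section IDs follow the core specification, which also defines table, start, element, and custom sections beyond those listed above.

```typescript
// Section IDs defined by the WebAssembly core specification.
const SECTION_NAMES = new Map<number, string>([
  [0, 'custom'], [1, 'type'], [2, 'import'], [3, 'function'], [4, 'table'],
  [5, 'memory'], [6, 'global'], [7, 'export'], [8, 'start'], [9, 'element'],
  [10, 'code'], [11, 'data'], [12, 'data count'],
]);

// Decode an unsigned LEB128 integer, the variable-length encoding Wasm uses for sizes.
function readU32Leb(bytes: Uint8Array, offset: number): { value: number; next: number } {
  let value = 0;
  let shift = 0;
  let pos = offset;
  for (;;) {
    if (pos >= bytes.length) throw new Error('Unexpected end of input');
    const byte = bytes[pos++];
    value |= (byte & 0x7f) << shift;
    if ((byte & 0x80) === 0) break;
    shift += 7;
  }
  return { value: value >>> 0, next: pos };
}

// Return the name and payload size of each section, in file order.
function listSections(bytes: Uint8Array): { name: string; size: number }[] {
  const magic = [0x00, 0x61, 0x73, 0x6d]; // "\0asm"
  if (!magic.every((b, i) => bytes[i] === b)) throw new Error('Not a Wasm binary');
  const sections: { name: string; size: number }[] = [];
  let pos = 8; // skip the 4-byte magic and 4-byte version
  while (pos < bytes.length) {
    const id = bytes[pos];
    const { value: size, next } = readU32Leb(bytes, pos + 1);
    sections.push({ name: SECTION_NAMES.get(id) ?? `unknown(${id})`, size });
    pos = next + size; // jump over the payload to the next section header
  }
  return sections;
}

// Hand-assembled module: imports env.log(i32) and exports run(), which calls log(42).
const demoModule = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,                               // magic + version
  0x01, 0x08, 0x02, 0x60, 0x01, 0x7f, 0x00, 0x60, 0x00, 0x00,                   // type section
  0x02, 0x0b, 0x01, 0x03, 0x65, 0x6e, 0x76, 0x03, 0x6c, 0x6f, 0x67, 0x00, 0x00, // import section
  0x03, 0x02, 0x01, 0x01,                                                       // function section
  0x07, 0x07, 0x01, 0x03, 0x72, 0x75, 0x6e, 0x00, 0x01,                         // export section
  0x0a, 0x08, 0x01, 0x06, 0x00, 0x41, 0x2a, 0x10, 0x00, 0x0b,                   // code section
]);

console.log(listSections(demoModule).map((s) => `${s.name}:${s.size}`).join(' '));
// type:8 import:11 function:2 export:7 code:8
```

Each section header is a one-byte ID followed by a LEB128 payload size, which is why a tool can skip sections it does not understand without decoding them.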

Linear Memory

WebAssembly uses a flat, contiguous memory model called linear memory. This is an ArrayBuffer (or SharedArrayBuffer for threads) that the WebAssembly code can read and write using explicit load and store instructions. Unlike JavaScript's managed heap, linear memory gives developers direct control over memory layout, enabling efficient data structures like arrays, structs, and strings.

// Access WebAssembly memory from JavaScript
const memory = new WebAssembly.Memory({ initial: 256, maximum: 65536 });
// initial: 256 pages × 64KB = 16MB
// maximum: 65536 pages × 64KB = 4GB

// Read and write memory directly
const view = new Uint8Array(memory.buffer);
view[0] = 42;

// Use different typed views for different data types.
// All views alias the same bytes, so each write at index 0 below
// overwrites (part of) what the previous view stored there.
const int32View = new Int32Array(memory.buffer);
int32View[0] = 1000000;

const float64View = new Float64Array(memory.buffer);
float64View[0] = 3.14159;
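
The page arithmetic and the relationship between views and the underlying buffer can be checked with a small sketch. The sizes here are arbitrary choices for illustration. The behavior shown (grow() returning the previous size in pages and detaching the old ArrayBuffer) is what the JavaScript API specifies for non-shared memories.

```typescript
const PAGE_SIZE = 64 * 1024; // one WebAssembly page is 64 KiB

const mem = new WebAssembly.Memory({ initial: 1, maximum: 4 });
const initialPages = mem.buffer.byteLength / PAGE_SIZE; // 1

// Typed views alias the same bytes: a 32-bit write is visible byte by byte.
const words = new Uint32Array(mem.buffer);
const bytesBefore = new Uint8Array(mem.buffer);
words[0] = 0x11223344;
// On little-endian hosts (which matches Wasm's little-endian memory) this is 0x44.
const lowestByte = bytesBefore[0];

// grow() returns the previous size in pages and detaches the old buffer,
// so views created earlier become zero-length and must be re-created.
const previousPages = mem.grow(1);
const bytesAfter = new Uint8Array(mem.buffer);

console.log(initialPages, previousPages, bytesBefore.byteLength, bytesAfter.byteLength);
// 1 1 0 131072
```

This is why glue code usually re-creates its typed views on each access instead of caching them across calls that might allocate.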

The Execution Model

WebAssembly uses a stack-based virtual machine. Operations pop their operands from the stack, perform a computation, and push the result back onto the stack. This design was chosen because stack machines are simpler to implement and validate than register machines, and the stack-based code is more compact.

;; WebAssembly text format (WAT)
;; Computes: (2 + 3) * 4
i32.const 2      ;; push 2 onto stack: [2]
i32.const 3      ;; push 3 onto stack: [2, 3]
i32.add          ;; pop 2 and 3, push 5: [5]
i32.const 4      ;; push 4 onto stack: [5, 4]
i32.mul          ;; pop 5 and 4, push 20: [20]
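
The same trace can be reproduced with a toy interpreter. This is only a sketch of the operand-stack idea, assuming a hypothetical `Instr` type that covers just the three instructions used above; a real engine validates the code and compiles it to machine code rather than interpreting it this way.

```typescript
// A tiny subset of Wasm instructions, modeled as plain objects.
type Instr =
  | { op: 'i32.const'; value: number }
  | { op: 'i32.add' }
  | { op: 'i32.mul' };

// Execute the program and return the final operand stack.
function run(program: Instr[]): number[] {
  const stack: number[] = [];
  const pop = (): number => {
    const value = stack.pop();
    if (value === undefined) throw new Error('Stack underflow');
    return value;
  };

  for (const instr of program) {
    switch (instr.op) {
      case 'i32.const':
        stack.push(instr.value | 0); // keep values in 32-bit integer range
        break;
      case 'i32.add': {
        const b = pop();
        const a = pop();
        stack.push((a + b) | 0); // wraps on overflow like i32.add
        break;
      }
      case 'i32.mul': {
        const b = pop();
        const a = pop();
        stack.push(Math.imul(a, b)); // 32-bit wrapping multiply
        break;
      }
    }
  }
  return stack;
}

// (2 + 3) * 4
const finalStack = run([
  { op: 'i32.const', value: 2 },
  { op: 'i32.const', value: 3 },
  { op: 'i32.add' },
  { op: 'i32.const', value: 4 },
  { op: 'i32.mul' },
]);
console.log(finalStack); // [ 20 ]
```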

The Host Interface

WebAssembly modules cannot interact with the outside world directly. All I/O—DOM manipulation, network requests, file access, console output—must go through imported functions provided by the host environment (JavaScript). This design provides security isolation while enabling rich integration with the web platform.

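// Provide JavaScript functions to a WebAssembly module
// `instance` is assigned after instantiation; the imports only run later, when Wasm calls them
let instance: WebAssembly.Instance;

const importObject = {
  env: {
    log: (value: number) => console.log('Wasm says:', value),
    // Imports are called synchronously and can only hand numbers back to Wasm,
    // so a Promise cannot be returned. Start the request and report the outcome separately.
    fetch: (urlPtr: number, urlLen: number): void => {
      // Read URL string from WebAssembly memory
      const memory = instance.exports.memory as WebAssembly.Memory;
      const urlBytes = new Uint8Array(memory.buffer, urlPtr, urlLen);
      const url = new TextDecoder().decode(urlBytes);
      fetch(url).then((res) => console.log('Fetched', url, res.status));
    },
  },
};

({ instance } = await WebAssembly.instantiateStreaming(
  fetch('module.wasm'),
  importObject
));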
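
To see an import being called end to end without a build step, the following sketch instantiates a tiny hand-assembled module. The byte array is an illustrative assumption (a module that imports env.log and exports run, which calls log(42)). The synchronous WebAssembly.Module constructor is used only because the module is a few dozen bytes; larger modules should go through the streaming APIs.

```typescript
// Hand-assembled module, equivalent to this WAT:
//   (module
//     (import "env" "log" (func $log (param i32)))
//     (func (export "run") i32.const 42 call $log))
const hostDemoBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,                               // magic + version
  0x01, 0x08, 0x02, 0x60, 0x01, 0x7f, 0x00, 0x60, 0x00, 0x00,                   // types: (i32)->(), ()->()
  0x02, 0x0b, 0x01, 0x03, 0x65, 0x6e, 0x76, 0x03, 0x6c, 0x6f, 0x67, 0x00, 0x00, // import env.log
  0x03, 0x02, 0x01, 0x01,                                                       // one function of type 1
  0x07, 0x07, 0x01, 0x03, 0x72, 0x75, 0x6e, 0x00, 0x01,                         // export "run" = func 1
  0x0a, 0x08, 0x01, 0x06, 0x00, 0x41, 0x2a, 0x10, 0x00, 0x0b,                   // body: i32.const 42, call 0
]);

// Record every value the module passes to the host.
const received: number[] = [];
const hostImports = {
  env: {
    log: (value: number): void => {
      received.push(value);
    },
  },
};

const wasmModule = new WebAssembly.Module(hostDemoBytes);
const wasmInstance = new WebAssembly.Instance(wasmModule, hostImports);

// The module has no other way to reach the outside world than the imports we passed in.
(wasmInstance.exports.run as () => void)();
console.log(received); // [ 42 ]
```

Leaving `env.log` out of the import object makes instantiation fail with a LinkError, which is the mechanism behind the isolation described above: a module can only call what the host explicitly provides.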

Architecture and Design Patterns

The Glue Code Pattern

WebAssembly modules need JavaScript "glue code" to handle memory management, string encoding, and DOM interaction. Tools like wasm-bindgen (Rust) and Emscripten (C/C++) auto-generate this glue code, but understanding the pattern is essential for debugging and optimization.

// Manual glue code for a C function that processes strings
class WasmStringHelper {
  constructor(private memory: WebAssembly.Memory,
              private alloc: (size: number) => number,
              private dealloc: (ptr: number, size: number) => void) {}

  // Write a JavaScript string into WebAssembly memory as UTF-8
  writeString(str: string): { ptr: number; len: number } {
    const encoded = new TextEncoder().encode(str);
    const ptr = this.alloc(encoded.length);
    // Create the view after alloc(): growing memory detaches older buffers
    new Uint8Array(this.memory.buffer).set(encoded, ptr);
    return { ptr, len: encoded.length };
  }

  // Read a UTF-8 string from WebAssembly memory
  readString(ptr: number, len: number): string {
    const view = new Uint8Array(this.memory.buffer, ptr, len);
    return new TextDecoder().decode(view);
  }

  // Call a WebAssembly function that takes and returns a string.
  // Assumed convention: the result is a 4-byte little-endian length followed by
  // the UTF-8 bytes, and the caller is responsible for freeing it.
  callWithString(fn: (ptr: number, len: number) => number, input: string): string {
    const { ptr, len } = this.writeString(input);
    try {
      const resultPtr = fn(ptr, len);
      // DataView tolerates unaligned pointers, unlike Int32Array
      const resultLen = new DataView(this.memory.buffer).getInt32(resultPtr, true);
      const result = this.readString(resultPtr + 4, resultLen);
      this.dealloc(resultPtr, resultLen + 4);
      return result;
    } finally {
      this.dealloc(ptr, len); // free the input even if the call throws
    }
  }
}

The Async Compilation Pattern

WebAssembly compilation is expensive (tens to hundreds of milliseconds for large modules). Use streaming compilation to overlap download and compilation:

// Fast: streaming compilation (compiles while downloading)
function loadModuleStreaming(url: string): Promise<WebAssembly.Module> {
  // compileStreaming accepts the fetch() promise directly
  return WebAssembly.compileStreaming(fetch(url));
}

// Cache compiled modules for instantiation. Caching the promise (not the
// resolved module) means concurrent callers share a single compilation.
const moduleCache = new Map<string, Promise<WebAssembly.Module>>();

function getModule(url: string): Promise<WebAssembly.Module> {
  let pending = moduleCache.get(url);
  if (!pending) {
    pending = loadModuleStreaming(url);
    // Drop failed compilations so a later call can retry
    pending.catch(() => moduleCache.delete(url));
    moduleCache.set(url, pending);
  }
  return pending;
}

// Multiple instances from the same module (fast, shared code)
async function createInstance(url: string, imports: WebAssembly.Imports) {
  const module = await getModule(url);
  return WebAssembly.instantiate(module, imports);
}

The Worker Pattern

CPU-intensive WebAssembly work should run in a Web Worker to avoid blocking the main thread:

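// worker.ts
self.onmessage = async (e: MessageEvent<{ wasmUrl: string; inputData: Uint8Array }>) => {
  const { wasmUrl, inputData } = e.data;

  // Assumes the module imports its memory as env.memory and exports process(len),
  // which reads the input at offset 0 and writes its output there as well
  const memory = new WebAssembly.Memory({ initial: 256 });
  const { instance } = await WebAssembly.instantiateStreaming(fetch(wasmUrl), {
    env: { memory },
  });

  // Copy input data into WebAssembly memory
  new Uint8Array(memory.buffer).set(inputData, 0);

  // Run the computation; process() returns the number of output bytes
  const process = instance.exports.process as (len: number) => number;
  const resultLen = process(inputData.length);

  // Re-create the view (memory may have grown) and copy the result out
  const output = new Uint8Array(memory.buffer).slice(0, resultLen);
  // Transfer the copy instead of cloning it a second time
  self.postMessage({ output }, { transfer: [output.buffer] });
};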

Step-by-Step Implementation

Rust + wasm-bindgen Setup

A popular way to write WebAssembly is Rust with wasm-bindgen and wasm-pack:

# Cargo.toml
[package]
name = "image-processor"
version = "0.1.0"
edition = "2021"     # Without this Cargo assumes the 2015 edition

[lib]
crate-type = ["cdylib", "rlib"]

[dependencies]
wasm-bindgen = "0.2"
js-sys = "0.3"
web-sys = { version = "0.3", features = ["console"] }

[profile.release]
opt-level = "z"      # Optimize for size
lto = true           # Link-time optimization
strip = true         # Strip debug symbols

// src/lib.rs
use wasm_bindgen::prelude::*;
 
#[wasm_bindgen]
pub struct ImageBuffer {
    width: usize,
    height: usize,
    data: Vec<u8>,
}
 
#[wasm_bindgen]
impl ImageBuffer {
    #[wasm_bindgen(constructor)]
    pub fn new(width: usize, height: usize) -> ImageBuffer {
        ImageBuffer {
            width,
            height,
            data: vec![0; width * height * 4],
        }
    }
 
    #[wasm_bindgen(getter)]
    pub fn data_ptr(&self) -> *const u8 {
        self.data.as_ptr()
    }
 
    #[wasm_bindgen(getter)]
    pub fn width(&self) -> usize {
        self.width
    }
 
    #[wasm_bindgen(getter)]
    pub fn height(&self) -> usize {
        self.height
    }
 
    pub fn apply_grayscale(&mut self) {
        for pixel in self.data.chunks_exact_mut(4) {
            let gray = (0.299 * pixel[0] as f64 
                      + 0.587 * pixel[1] as f64 
                      + 0.114 * pixel[2] as f64) as u8;
            pixel[0] = gray;
            pixel[1] = gray;
            pixel[2] = gray;
        }
    }
 
    pub fn apply_brightness(&mut self, amount: i16) {
        for pixel in self.data.chunks_exact_mut(4) {
            // Widen to i32 so extreme `amount` values cannot overflow i16;
            // the alpha channel (index 3) is left untouched
            for channel in &mut pixel[..3] {
                *channel = (*channel as i32 + amount as i32).clamp(0, 255) as u8;
            }
        }
    }
 
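    pub fn apply_blur(&mut self, radius: u32) {
        let r = radius as usize;
        let mut output = self.data.clone();

        // saturating_sub avoids usize underflow when the image is smaller than
        // the radius; pixels within `r` of an edge are left unchanged
        for y in r..self.height.saturating_sub(r) {
            for x in r..self.width.saturating_sub(r) {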
                let (mut r_sum, mut g_sum, mut b_sum) = (0u32, 0u32, 0u32);
                let mut count = 0u32;
                
                for dy in 0..=2 * r {
                    for dx in 0..=2 * r {
                        let idx = ((y + dy - r) * self.width + (x + dx - r)) * 4;
                        r_sum += self.data[idx] as u32;
                        g_sum += self.data[idx + 1] as u32;
                        b_sum += self.data[idx + 2] as u32;
                        count += 1;
                    }
                }
                
                let idx = (y * self.width + x) * 4;
                output[idx] = (r_sum / count) as u8;
                output[idx + 1] = (g_sum / count) as u8;
                output[idx + 2] = (b_sum / count) as u8;
            }
        }
        
        self.data = output;
    }
}

Using wasm-pack to Build

# Install wasm-pack
curl https://rustwasm.github.io/wasm-pack/installer/init.sh -sSf | sh
 
# Build for web browsers
wasm-pack build --target web --release
 
# Build for bundlers (webpack, vite)
wasm-pack build --target bundler --release
 
# Build for Node.js
wasm-pack build --target nodejs --release

TypeScript Integration

// image-processor.ts
import init, { ImageBuffer } from './pkg/image_processor';

export class WasmImageProcessor {
  private buffer: ImageBuffer | null = null;
  private memory: WebAssembly.Memory | null = null;

  async init() {
    if (!this.memory) {
      // init() resolves to the module's exports, including its linear memory
      const wasm = await init();
      this.memory = wasm.memory;
    }
  }

  // View over the pixel bytes owned by the Rust ImageBuffer. Re-created on every
  // access because memory growth detaches old views and apply_blur replaces the Vec.
  private pixels(): Uint8ClampedArray {
    const { data_ptr, width, height } = this.buffer!;
    return new Uint8ClampedArray(this.memory!.buffer, data_ptr, width * height * 4);
  }

  loadImage(imageData: ImageData) {
    if (!this.memory) throw new Error('Call init() before loadImage()');
    this.buffer?.free(); // release the previous Rust allocation
    this.buffer = new ImageBuffer(imageData.width, imageData.height);
    // Copy pixel data into WebAssembly memory
    this.pixels().set(imageData.data);
  }

  grayscale(): ImageData | null {
    if (!this.buffer) return null;
    this.buffer.apply_grayscale();
    return this.getResult();
  }

  brightness(amount: number): ImageData | null {
    if (!this.buffer) return null;
    this.buffer.apply_brightness(amount);
    return this.getResult();
  }

  blur(radius: number): ImageData | null {
    if (!this.buffer) return null;
    this.buffer.apply_blur(radius);
    return this.getResult();
  }

  private getResult(): ImageData {
    const { width, height } = this.buffer!;
    // Copy the processed pixels out of WebAssembly memory so the returned
    // ImageData stays valid for Canvas rendering after later Wasm calls
    return new ImageData(this.pixels().slice(), width, height);
  }
}

Real-World Use Cases

Figma: Design Tool in the Browser

Figma uses WebAssembly for its rendering engine, written in C++. The renderer processes vector graphics, performs boolean operations on paths, and renders complex scenes with thousands of layers. WebAssembly provides the performance needed for real-time manipulation of design files that would be impossible in JavaScript alone. The C++ codebase also enables sharing the renderer with a native desktop application.

Google Earth: 3D Terrain Rendering

Google Earth migrated from NaCl (Native Client) to WebAssembly, bringing its 3D globe rendering to all modern browsers. The terrain mesh generation, texture sampling, and camera math run in WebAssembly while WebGL handles the actual GPU rendering. This combination achieves smooth 60fps performance for a globe with billions of terrain points.

Adobe Photoshop: Browser-Based Photo Editing

Adobe's web version of Photoshop uses WebAssembly for image processing filters, layer compositing, and format encoding/decoding. Operations like Gaussian blur, levels adjustment, and content-aware fill are implemented in C++ and compiled to WebAssembly, providing performance comparable to the native desktop application.

1Password: Cryptographic Operations

1Password uses WebAssembly for its zero-knowledge proof system and password derivation functions (PBKDF2, Argon2). These CPU-intensive cryptographic operations benefit from WebAssembly's deterministic execution and performance, while the sandbox provides additional security isolation for sensitive operations.

Best Practices

  1. Minimize JS-Wasm boundary crossings — Each call from JavaScript to WebAssembly (and vice versa) has some overhead from argument conversion and stack setup, and the cost grows when values such as strings must be copied across. Batch operations so a single WebAssembly call processes a large amount of data rather than making many small calls. As a rough guide, avoid crossing the boundary once per element inside hot loops.

  2. Use streaming compilation — Always use WebAssembly.compileStreaming() instead of WebAssembly.compile(). Streaming compilation overlaps network transfer with compilation, reducing load time by 30-50% for large modules. Ensure your server sends the correct application/wasm MIME type.

  3. Optimize for size — WebAssembly modules must be downloaded before execution. Use opt-level = "z" in Rust, -Oz in Emscripten, and enable LTO and dead code elimination. A 500KB module loads in ~100ms on 4G; a 5MB module takes over a second. Use Brotli compression for an additional 15-20% size reduction.

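  4. Share memory efficiently — Use WebAssembly.Memory with shared: true (which also requires a maximum size) for multi-threaded applications; in browsers, the backing SharedArrayBuffer is only available on cross-origin isolated pages. For non-shared data, transfer ownership of ArrayBuffers between the main thread and workers using postMessage with transferable objects to avoid copying.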

  5. Use wasm-bindgen for Rust — wasm-bindgen generates optimized JavaScript glue code that handles memory management, string conversion, and type mapping automatically. It produces smaller output than hand-written glue code and handles edge cases correctly.

  6. Profile with browser DevTools — Chrome DevTools can profile WebAssembly execution at the function level. Use the Performance tab to identify hot WebAssembly functions, and the Memory tab to analyze linear memory usage. Firefox provides similar capabilities with additional WebAssembly-specific debugging features.

  7. Implement graceful fallback — Check for WebAssembly support at startup and provide a JavaScript fallback for older browsers. Use feature detection rather than browser detection. Consider shipping a smaller, less feature-rich JavaScript version as the fallback.
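
The feature-detection advice in the last item could look like the sketch below. The `supportsWebAssembly`, `grayscaleJs`, and `pickGrayscale` names are hypothetical, and the Wasm-backed implementation is assumed to be supplied by the caller once the module has loaded.

```typescript
// The smallest valid module: the 4-byte magic "\0asm" plus version 1, no sections.
const EMPTY_MODULE = Uint8Array.of(0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00);

// Feature detection: test for the capability itself, not for a browser name.
function supportsWebAssembly(): boolean {
  try {
    if (typeof WebAssembly !== 'object' || typeof WebAssembly.instantiate !== 'function') {
      return false;
    }
    // Compiling a real module also catches environments where the API exists
    // but compilation is blocked (for example by a restrictive policy).
    const probe = new WebAssembly.Module(EMPTY_MODULE);
    return new WebAssembly.Instance(probe) instanceof WebAssembly.Instance;
  } catch {
    return false;
  }
}

// Both implementations take RGBA bytes and return grayscale RGBA bytes.
type Grayscale = (rgba: Uint8ClampedArray) => Uint8ClampedArray;

// Plain JavaScript fallback using the same luma weights as the Rust example.
const grayscaleJs: Grayscale = (rgba) => {
  const out = new Uint8ClampedArray(rgba.length);
  for (let i = 0; i + 3 < rgba.length; i += 4) {
    const gray = 0.299 * rgba[i] + 0.587 * rgba[i + 1] + 0.114 * rgba[i + 2];
    out[i] = out[i + 1] = out[i + 2] = gray; // Uint8ClampedArray rounds and clamps
    out[i + 3] = rgba[i + 3]; // keep alpha
  }
  return out;
};

// Prefer the Wasm implementation when it is available, otherwise fall back.
function pickGrayscale(wasmImpl: Grayscale | null): Grayscale {
  return supportsWebAssembly() && wasmImpl ? wasmImpl : grayscaleJs;
}

const grayscale = pickGrayscale(null); // no Wasm implementation loaded yet
console.log(Array.from(grayscale(Uint8ClampedArray.of(0, 0, 0, 255)))); // [ 0, 0, 0, 255 ]
```

Keeping both implementations behind one function type means the rest of the application does not need to know which one is active.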

Common Pitfalls and Solutions

| Pitfall | Impact | Solution |
| --- | --- | --- |
| Too many JS-Wasm calls | Overhead dominates performance | Batch operations into single calls |
| Large module size | Slow initial load | Enable LTO, dead code elimination, Brotli compression |
| Blocking main thread | UI freezes during computation | Run WebAssembly in Web Workers |
| Memory leaks in linear memory | Out of memory crashes | Track allocations carefully; use Rust's RAII patterns |
| Missing wasm MIME type | Streaming compilation fails | Configure server: Content-Type: application/wasm |
| Debug symbols in production | 2-10x larger modules | Strip debug symbols in release builds |

Performance Comparison

Benchmark results comparing JavaScript vs. WebAssembly for common operations:

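| Operation | JavaScript (ms) | WebAssembly (ms) | Speedup |
| --- | --- | --- | --- |
| Image blur (1080p) | 245 | 28 | 8.8x |
| Matrix multiply (1000×1000) | 1850 | 195 | 9.5x |
| JSON parse (10MB) | 89 | 45 | 2.0x |
| MD5 hash (1GB) | 3200 | 680 | 4.7x |
| Fibonacci (n=45) | 8900 | 1200 | 7.4x |
| Regex matching (complex) | 156 | 89 | 1.8x |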

WebAssembly excels at CPU-intensive numeric computation. For I/O-bound or DOM-heavy workloads, JavaScript is often equivalent or faster due to its direct DOM access.
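
The speedup column is simply the JavaScript time divided by the WebAssembly time, rounded to one decimal place. A short sketch that recomputes it from the timings quoted above (the `benchmarks` array just restates the table; no new measurements are implied):

```typescript
interface Benchmark {
  operation: string;
  jsMs: number;
  wasmMs: number;
}

// Timings copied from the table above.
const benchmarks: Benchmark[] = [
  { operation: 'Image blur (1080p)', jsMs: 245, wasmMs: 28 },
  { operation: 'Matrix multiply (1000x1000)', jsMs: 1850, wasmMs: 195 },
  { operation: 'JSON parse (10MB)', jsMs: 89, wasmMs: 45 },
  { operation: 'MD5 hash (1GB)', jsMs: 3200, wasmMs: 680 },
  { operation: 'Fibonacci (n=45)', jsMs: 8900, wasmMs: 1200 },
  { operation: 'Regex matching (complex)', jsMs: 156, wasmMs: 89 },
];

// speedup = JavaScript time / WebAssembly time
const speedups = benchmarks.map((b) => `${(b.jsMs / b.wasmMs).toFixed(1)}x`);

console.log(speedups.join(' '));
// 8.8x 9.5x 2.0x 4.7x 7.4x 1.8x
```

The numeric kernels (blur, matrix multiply, Fibonacci) sit at 7x or more, while JSON parsing and regex matching, which already run on highly optimized native engine code, gain less than 2x.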

The Component Model and WASI

WebAssembly System Interface (WASI)

WASI extends WebAssembly beyond the browser by providing standardized system interfaces for file I/O, networking, random number generation, and clock access. WASI enables WebAssembly modules to run as portable, sandboxed applications on any operating system:

// WASI application that reads a file and prints its contents
use std::fs;
 
fn main() {
    let contents = fs::read_to_string("input.txt")
        .expect("Failed to read file");
    println!("File contents: {}", contents);
}

This same code compiles to WebAssembly and runs in Node.js, Deno, Wasmtime, Wasmer, or any WASI-compatible runtime.

The Component Model

The Component Model is an emerging standard for composing WebAssembly modules written in different languages. Interfaces are described in WIT (Wasm Interface Type), an interface definition language with a rich type system, which enables a Rust component to call a Python component without either knowing about the other's internal implementation:

// interface.wit - WebAssembly Interface Types
package example:image-processor;
 
interface image-ops {
    record pixel {
        r: u8,
        g: u8,
        b: u8,
        a: u8,
    }
    
    apply-filter: func(pixels: list<pixel>, filter-name: string) -> list<pixel>;
}
 
world image-processor {
    export image-ops;
}
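
To give a feel for what such an interface means on the JavaScript side, here is a hedged sketch of TypeScript shapes that mirror the WIT above: the record becomes an object type, list<pixel> becomes an array, and kebab-case names become camelCase. The `Pixel` and `ApplyFilter` types and the two filter names are assumptions for illustration; real bindings are generated by component tooling and may differ in detail.

```typescript
// Assumed mapping of `record pixel` to a plain object type (u8 fields become numbers).
interface Pixel {
  r: number;
  g: number;
  b: number;
  a: number;
}

// Assumed mapping of `apply-filter: func(pixels: list<pixel>, filter-name: string) -> list<pixel>`.
type ApplyFilter = (pixels: Pixel[], filterName: string) => Pixel[];

// A stand-in implementation written in TypeScript. In a real setup this function
// would be provided by the component, whatever language it was written in.
const applyFilter: ApplyFilter = (pixels, filterName) => {
  switch (filterName) {
    case 'grayscale':
      return pixels.map(({ r, g, b, a }) => {
        const gray = Math.round(0.299 * r + 0.587 * g + 0.114 * b);
        return { r: gray, g: gray, b: gray, a };
      });
    case 'invert':
      return pixels.map(({ r, g, b, a }) => ({ r: 255 - r, g: 255 - g, b: 255 - b, a }));
    default:
      throw new Error(`Unknown filter: ${filterName}`);
  }
};

console.log(applyFilter([{ r: 0, g: 10, b: 255, a: 7 }], 'invert'));
// [ { r: 255, g: 245, b: 0, a: 7 } ]
```

The caller only depends on the interface types, which is the point of WIT: the implementation behind `applyFilter` can be swapped for a component compiled from another language without changing the calling code.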

Conclusion

WebAssembly brings near-native performance to the browser by providing a statically-typed, linear-memory execution environment that maps directly to machine instructions. It enables applications like Figma, Google Earth, and Adobe Photoshop to run in the browser at performance levels previously reserved for native applications.

Key takeaways:

  1. WebAssembly runs at 80-95% of native speed for CPU-intensive computation
  2. Use streaming compilation to minimize load time by overlapping download and compilation
  3. Minimize JavaScript-WebAssembly boundary crossings by batching operations
  4. Run CPU-intensive WebAssembly work in Web Workers to keep the UI responsive
  5. Optimize module size with LTO, dead code elimination, and Brotli compression
  6. Rust with wasm-bindgen is the most productive language toolchain for WebAssembly development

Start by identifying CPU-intensive code paths in your application—image processing, data transformation, cryptographic operations, or physics simulation—and implement them in Rust or C++ compiled to WebAssembly. Use wasm-pack for Rust or Emscripten for C/C++, profile the result, and optimize the JS-Wasm boundary for the best performance.