Introduction
The Web Codecs API provides low-level access to browser media encoders and decoders, enabling frame-by-frame processing of audio and video entirely within the browser. Unlike the <video> element which abstracts away all encoding and decoding internals, Web Codecs hands you raw pixel data and audio samples to manipulate, transform, or transmit however you see fit. This is the API that powers browser-based video editors, real-time streaming platforms, computer vision pipelines, and custom media processing workflows that previously required native code or server-side infrastructure.
Before Web Codecs, developers who needed granular control over media processing had to rely on workarounds like decoding video by drawing frames to a canvas and reading pixels back, or sending video to a server for processing. These approaches were slow, inefficient, and limited. The Web Codecs API changes the equation entirely by exposing the same hardware-accelerated encoders and decoders that the browser uses internally for <video> and <audio> playback, giving web applications native-level media processing performance.
This guide covers every aspect of the Web Codecs API, from the core architecture and individual class interfaces to real-world implementation patterns for video transcoding, screen recording, real-time effects processing, and WebRTC integration. You will also learn about codec selection strategies, memory management best practices, container format handling, and browser compatibility considerations.
Architecture Overview
The Web Codecs API is built around a clean pipeline architecture that separates concerns between raw media data, encoded (compressed) media data, and the encoders/decoders that transform between them.
The Processing Pipeline
Every Web Codecs application follows the same fundamental pattern: source → decode → process → encode → output. The source can be a camera feed via getUserMedia, a video file read through a demuxing library, or frames generated programmatically. The output can be a rendered canvas, a recorded file, a live stream, or anything else that consumes encoded or raw media data.
+--------------+     +----------------+     +--------------+     +----------------+
| Media        | --> | Video/Audio    | --> | Process      | --> | Video/Audio    | --> Output
| Source       |     | Decoder        |     | (optional)   |     | Encoder        |
+--------------+     +----------------+     +--------------+     +----------------+
Core Class Hierarchy
The API introduces eight primary classes organized into two parallel hierarchies, one for video and one for audio:
| Class | Purpose | Data Type |
|---|---|---|
| VideoFrame | Raw video frame with pixel data and metadata | Pixels in GPU memory |
| EncodedVideoChunk | Compressed video frame data | Binary buffer |
| VideoDecoder | Transforms EncodedVideoChunk → VideoFrame | - |
| VideoEncoder | Transforms VideoFrame → EncodedVideoChunk | - |
| AudioData | Raw audio samples (PCM) | Float32/Int16 arrays |
| EncodedAudioChunk | Compressed audio data | Binary buffer |
| AudioDecoder | Transforms EncodedAudioChunk → AudioData | - |
| AudioEncoder | Transforms AudioData → EncodedAudioChunk | - |
There is typically a 1:1 correspondence between raw and encoded representations: decoding N encoded chunks normally yields N raw frames or audio data objects, delivered asynchronously through the output callback.
Asynchronous Processing Model
Each encoder and decoder maintains an internal processing queue. Methods like configure(), encode(), decode(), and flush() are asynchronous: they append control messages to the queue and return immediately, and flush() additionally returns a promise that resolves once the queued work has completed. The actual work happens in the background, potentially on a dedicated hardware thread. The reset() and close() methods are synchronous: reset() aborts pending work and allows reconfiguration, while close() permanently shuts down the instance and releases all resources.
// The processing model in action
// avc1.640028 = High Profile, Level 4.0 (Level 3.0, as in avc1.64001E, does not cover 1080p)
encoder.configure({ codec: 'avc1.640028', width: 1920, height: 1080, bitrate: 5_000_000 });
encoder.encode(frame1); // Queued, returns immediately
encoder.encode(frame2); // Queued, returns immediately
await encoder.flush(); // Waits for both frames to be encoded
Understanding this queue-based model is critical for managing backpressure. If you queue frames faster than the encoder can process them, the internal queue grows without bound. Production applications must implement flow control, typically by checking encoder.encodeQueueSize before encoding more frames.
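One way to turn the encodeQueueSize check into flow control is a gate with two thresholds, so that the producer pauses when the queue is long and resumes only once it has drained. The following is a minimal sketch under stated assumptions: `QueueGate`, `highWater`, and `lowWater` are hypothetical names, not part of the Web Codecs API, and the caller is assumed to pass in the current value of `encoder.encodeQueueSize`.

```typescript
// Hypothetical helper: a hysteresis gate driven by a queue-size reading.
class QueueGate {
  private paused = false;

  constructor(
    private readonly highWater: number, // pause at or above this queue size
    private readonly lowWater: number,  // resume at or below this queue size
  ) {
    if (lowWater >= highWater) {
      throw new RangeError('lowWater must be smaller than highWater');
    }
  }

  /** Returns true if the caller may submit another frame right now. */
  canSubmit(queueSize: number): boolean {
    if (this.paused && queueSize <= this.lowWater) {
      this.paused = false; // queue has drained enough, resume
    } else if (!this.paused && queueSize >= this.highWater) {
      this.paused = true; // queue is backing up, pause
    }
    return !this.paused;
  }
}

// Usage sketch (browser only, not executed here):
//   const gate = new QueueGate(5, 2);
//   if (gate.canSubmit(encoder.encodeQueueSize)) encoder.encode(frame);
```

Using two thresholds instead of one avoids rapid toggling between "feed" and "pause" when the queue size hovers around a single limit.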
Supported Codecs
The Web Codecs API supports a carefully curated set of industry-standard codecs. However, actual availability depends on the browser and underlying hardware. Always verify codec support at runtime using the isConfigSupported() static methods.
Video Codecs
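H.264 (AVC) is the most universally supported video codec. Nearly every device with hardware video acceleration can encode and decode H.264. Codec strings follow the pattern avc1.PPCCLL, where PP is the profile, CC the constraint flags, and LL the level, each written as two hexadecimal digits. For example, avc1.64001E is High Profile Level 3.0 and avc1.4d001f is Main Profile Level 3.1. The level caps the frame size, so 1080p video needs Level 4.0 (for example avc1.640028) or higher.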
VP9 is an open, royalty-free codec developed by Google that generally offers better compression than H.264 at equivalent quality. It is widely used on YouTube and in WebM containers. Codec strings use the pattern vp09.{profile}.{level}.{bitDepth}, optionally followed by chroma subsampling and color fields, such as vp09.00.40.08 for Profile 0, Level 4.0, 8-bit.
AV1 is the newest of the royalty-free codecs, commonly cited as offering roughly 30-50% better compression than H.264 and 20-30% better than VP9. Hardware decoder support is increasingly common, but hardware encoder support is still limited to newer GPUs. Codec strings follow av01.{profile}.{level}{tier}.{bitDepth}, with level and tier written together, such as av01.0.08M.08 for Main Profile, Level 4.0, Main Tier, 8-bit.
H.265 (HEVC) offers better compression than H.264 but has limited browser support outside Apple's Safari and WebKit-based browsers due to patent licensing concerns. Codec strings start with hev1 or hvc1, followed by profile, compatibility, tier and level, and constraint fields, for example hvc1.1.6.L93.B0.
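To make the H.264 string layout concrete, the sketch below decodes the three hexadecimal byte fields of an avc1 codec string and checks whether a level's frame-size limit covers a given resolution. The names `parseAvcCodecString` and `frameFitsLevel` are hypothetical helpers, not browser APIs. The limits are the maximum frame sizes in macroblocks from the H.264 level table; only frame size is checked, and the per-level macroblock rate (which also bounds frame rate) is ignored.

```typescript
interface AvcCodecInfo {
  profileIdc: number;      // e.g. 0x64 = 100
  profileName: string;     // human-readable name, 'Unknown' if not in the table
  constraintFlags: number; // constraint_set flags byte
  levelIdc: number;        // e.g. 0x1E = 30
  level: number;           // levelIdc / 10, e.g. 3.0
}

const AVC_PROFILE_NAMES: Record<number, string> = {
  66: 'Baseline',
  77: 'Main',
  88: 'Extended',
  100: 'High',
};

// Maximum frame size in 16x16 macroblocks per level (subset of levels).
const AVC_MAX_FRAME_MACROBLOCKS: Record<number, number> = {
  30: 1620, 31: 3600, 32: 5120,
  40: 8192, 41: 8192, 42: 8704,
  50: 22080, 51: 36864,
};

function parseAvcCodecString(codec: string): AvcCodecInfo | null {
  // "avc1." or "avc3." followed by exactly six hex digits (PPCCLL)
  if (!/^avc[13]\.[0-9a-fA-F]{6}$/.test(codec)) return null;
  const profileIdc = parseInt(codec.slice(5, 7), 16);
  const constraintFlags = parseInt(codec.slice(7, 9), 16);
  const levelIdc = parseInt(codec.slice(9, 11), 16);
  return {
    profileIdc,
    profileName: AVC_PROFILE_NAMES[profileIdc] ?? 'Unknown',
    constraintFlags,
    levelIdc,
    level: levelIdc / 10,
  };
}

// Returns undefined when the level is not in the table above.
function frameFitsLevel(levelIdc: number, width: number, height: number): boolean | undefined {
  const limit = AVC_MAX_FRAME_MACROBLOCKS[levelIdc];
  if (limit === undefined) return undefined;
  const macroblocks = Math.ceil(width / 16) * Math.ceil(height / 16);
  return macroblocks <= limit;
}
```

For instance, 1920×1080 needs 120 × 68 = 8160 macroblocks, which exceeds the Level 3.0 limit of 1620 but fits within the Level 4.0 limit of 8192. A browser may still accept or reject a configuration for other reasons, so isConfigSupported() remains the authoritative check.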
Audio Codecs
Opus is the recommended codec for most Web Codecs audio use cases. It provides excellent quality at low bitrates with very low latency, making it ideal for real-time communication and streaming.
AAC (Advanced Audio Coding) is widely supported and commonly found in MP4 containers. The codec string mp4a.40.2 refers to AAC-LC (Low Complexity).
PCM (Pulse Code Modulation) represents uncompressed audio with no quality loss but very large file sizes. Useful as an intermediate format during processing.
FLAC (Free Lossless Audio Codec) provides lossless compression. Useful for archival quality audio processing.
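The last field of an mp4a.40.N codec string is the MPEG-4 audio object type, which is how mp4a.40.2 comes to mean AAC-LC. As a small illustration, the sketch below maps a few common object types to names. `describeAacCodec` is a hypothetical helper and the table is deliberately incomplete, covering only AAC-LC (2), HE-AAC (5), and HE-AAC v2 (29).

```typescript
// Subset of MPEG-4 audio object types that commonly appear in codec strings.
const AAC_OBJECT_TYPES: Record<number, string> = {
  2: 'AAC-LC',
  5: 'HE-AAC',
  29: 'HE-AAC v2',
};

// Returns a readable name for an "mp4a.40.N" string, or null if the
// string does not use the MPEG-4 audio prefix at all.
function describeAacCodec(codec: string): string | null {
  const prefix = 'mp4a.40.';
  if (!codec.startsWith(prefix)) return null;
  const suffix = codec.slice(prefix.length);
  if (!/^\d{1,2}$/.test(suffix)) return null;
  return AAC_OBJECT_TYPES[Number(suffix)] ?? `Unknown audio object type ${suffix}`;
}
```

Knowing which variant a string names does not tell you whether the browser can encode or decode it; that still requires AudioEncoder.isConfigSupported() or AudioDecoder.isConfigSupported().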
Checking Codec Support at Runtime
Never assume a codec is available. Always verify before creating encoders or decoders:
async function checkVideoCodecSupport(): Promise<Map<string, boolean>> {
const codecs = [
'avc1.64001E', // H.264 High Profile
'vp09.00.10.08', // VP9 Profile 0
'av01.0.04M.08', // AV1 Main Profile
];
const support = new Map<string, boolean>();
for (const codec of codecs) {
try {
const result = await VideoDecoder.isConfigSupported({
codec,
codedWidth: 1920,
codedHeight: 1080,
});
support.set(codec, result.supported);
} catch {
support.set(codec, false);
}
}
return support;
}
async function checkAudioCodecSupport(): Promise<Map<string, boolean>> {
const codecs = ['opus', 'mp4a.40.2', 'flac'];
const support = new Map<string, boolean>();
for (const codec of codecs) {
try {
const result = await AudioEncoder.isConfigSupported({
codec,
sampleRate: 48000,
numberOfChannels: 1,
bitrate: 128_000,
});
support.set(codec, result.supported);
} catch {
support.set(codec, false);
}
}
return support;
}
Video Decoding in Depth
The VideoDecoder transforms compressed video chunks into raw pixel data that you can render, analyze, or process. Understanding the decoder's behavior is essential for building reliable media applications.
Configuring the Decoder
The decoder must be configured before it can accept input. The configuration object requires a codec string and typically the coded dimensions. Optional parameters include description data (required for some codec and bitstream combinations, such as H.264 in length-prefixed AVC format, where it is typically taken from the container's codec-specific header), display dimensions, and color space information.
const decoder = new VideoDecoder({
output: (frame: VideoFrame) => {
// Each decoded frame arrives here asynchronously
console.log(`Decoded frame at ${frame.timestamp}µs`);
console.log(` Dimensions: ${frame.displayWidth}×${frame.displayHeight}`);
console.log(` Format: ${frame.format}`); // e.g., "I420", "NV12", "RGBA"
console.log(` Duration: ${frame.duration}µs`);
// Render to canvas
const canvas = document.querySelector('canvas')!;
const ctx = canvas.getContext('2d')!;
canvas.width = frame.displayWidth;
canvas.height = frame.displayHeight;
ctx.drawImage(frame, 0, 0);
// CRITICAL: Release GPU memory held by this frame
frame.close();
},
error: (e: DOMException) => {
console.error('Decoder error:', e.message);
// Common errors: NotSupportedError, DataError, InvalidStateError
},
});
decoder.configure({
codec: 'avc1.640028', // High Profile, Level 4.0 (a level that covers 1080p)
codedWidth: 1920,
codedHeight: 1080,
// Optional: hardwareAcceleration: 'prefer-hardware',
});
Feeding Encoded Data to the Decoder
Once configured, you create EncodedVideoChunk objects and pass them to the decoder's decode() method. Each chunk must specify whether it is a key frame (type: 'key') or a delta frame (type: 'delta'), along with a timestamp in microseconds and the raw encoded data.
// When reading from a demuxed file:
for (const packet of demuxedPackets) {
const chunk = new EncodedVideoChunk({
type: packet.isKeyFrame ? 'key' : 'delta',
timestamp: packet.timestamp, // In microseconds
duration: packet.duration,
data: packet.data,
});
decoder.decode(chunk);
}
// Important: flush to ensure all frames are output
await decoder.flush();
decoder.close();
The decoder requires a key frame as the first chunk after configuration, a reset, or a flush. Delta frames depend on previously decoded frames and cannot be decoded independently. If decode() receives a delta chunk while a key chunk is required, it throws a DataError.
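Because decoding must begin on a key frame, a common precaution when starting mid-stream is to discard packets until the first key frame appears. The sketch below shows that filter on plain objects. `trimToFirstKeyFrame` is a hypothetical helper, and the `isKeyFrame` field is assumed to match the shape of the demuxed packets used above rather than any standard type.

```typescript
// Drops leading delta packets so the first packet handed to the decoder is a key frame.
// Returns an empty array when the input contains no key frame at all.
function trimToFirstKeyFrame<T extends { isKeyFrame: boolean }>(packets: readonly T[]): T[] {
  const firstKey = packets.findIndex((p) => p.isKeyFrame);
  return firstKey === -1 ? [] : packets.slice(firstKey);
}

// Example input: a stream joined partway through a group of pictures.
const examplePackets = [
  { isKeyFrame: false, timestamp: 0 },
  { isKeyFrame: false, timestamp: 33_333 },
  { isKeyFrame: true, timestamp: 66_667 },
  { isKeyFrame: false, timestamp: 100_000 },
];
const decodable = trimToFirstKeyFrame(examplePackets);
```

The same check applies after decoder.reset() or decoder.flush(): the next chunk submitted must again be a key frame.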
Video Encoding in Depth
The VideoEncoder transforms raw video frames into compressed chunks suitable for storage or transmission. Encoder configuration has a significant impact on output quality, file size, and encoding speed.
Encoder Configuration Strategies
// Configuration for high-quality recording (offline encoding)
const recordingConfig: VideoEncoderConfig = {
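codec: 'avc1.640028', // High Profile, Level 4.0; Level 3.0 (1E) does not cover 1080p
width: 1920,
height: 1080,
bitrate: 8_000_000, // 8 Mbps for high quality
framerate: 30,
latencyMode: 'quality', // Optimize for quality over latency
// avc: { format: 'avc' }, // Length-prefixed NAL units (MP4-style); 'annexb' emits start codes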
};
// Configuration for real-time streaming (low latency)
const streamingConfig: VideoEncoderConfig = {
codec: 'avc1.42E01F', // Constrained Baseline, Level 3.1, for max compatibility
width: 1280,
height: 720,
bitrate: 2_500_000, // 2.5 Mbps for 720p
framerate: 30,
latencyMode: 'realtime', // Optimize for speed over quality
bitrateMode: 'variable', // VBR for better quality in static scenes
};
// Configuration for adaptive bitrate streaming
const abrConfig: VideoEncoderConfig = {
codec: 'vp09.00.10.08',
width: 1920,
height: 1080,
bitrate: 5_000_000,
framerate: 30,
scalabilityMode: 'L1T3', // Temporal scalability for ABR
};
Encoding Frames with Key Frame Control
const encoder = new VideoEncoder({
output: (chunk: EncodedVideoChunk, metadata?: EncodedVideoChunkMetadata) => {
const data = new Uint8Array(chunk.byteLength);
chunk.copyTo(data);
// The metadata may contain codec-specific data (e.g., SPS/PPS for H.264)
if (metadata?.decoderConfig) {
console.log('Decoder config updated:', metadata.decoderConfig);
// Store or transmit the decoder config alongside the chunk
}
// Transmit or store the encoded data
muxer.addChunk(chunk, data);
},
error: (e: DOMException) => {
console.error('Encoder error:', e.message);
},
});
encoder.configure(recordingConfig);
let frameCount = 0;
const keyFrameInterval = 150; // Force key frame every 5 seconds at 30fps
function encodeVideoFrame(source: CanvasImageSource, timestamp: number) {
const frame = new VideoFrame(source, { timestamp });
// Force key frames at regular intervals for seeking support
const isKeyFrame = frameCount % keyFrameInterval === 0;
encoder.encode(frame, { keyFrame: isKeyFrame });
frame.close(); // Release the source frame immediately
frameCount++;
// Monitor queue pressure
if (encoder.encodeQueueSize > 10) {
console.warn('Encoder queue backing up, consider throttling input');
}
}
Monitoring and Managing the Encode Queue
The encodeQueueSize property tells you how many frames are waiting to be processed. If this number grows, your pipeline is producing frames faster than the encoder can consume them:
function shouldEncodeMore(): boolean {
// Above the threshold, stop feeding frames until the encoder drains its queue
return encoder.encodeQueueSize <= 5;
}
Audio Encoding and Decoding
The audio interfaces mirror the video ones but work with PCM sample data instead of pixel data.
Audio Encoding with Opus
const audioEncoder = new AudioEncoder({
output: (chunk: EncodedAudioChunk, metadata?: EncodedAudioChunkMetadata) => {
const data = new Uint8Array(chunk.byteLength);
chunk.copyTo(data);
// Send to muxer or network
audioMuxer.write(chunk, data);
},
error: (e: DOMException) => console.error('Audio encoder error:', e),
});
audioEncoder.configure({
codec: 'opus',
sampleRate: 48000,
numberOfChannels: 1,
bitrate: 128_000, // 128 kbps for speech, 256k for music
// opus: { complexity: 10 }, // Max quality (0-10)
});
// Encode audio from microphone capture
async function encodeMicrophoneAudio() {
const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
const audioContext = new AudioContext({ sampleRate: 48000 });
const source = audioContext.createMediaStreamSource(stream);
// ScriptProcessorNode is deprecated in favor of AudioWorklet; it is used here only to keep the example short
const processor = audioContext.createScriptProcessor(4096, 1, 1);
source.connect(processor);
processor.connect(audioContext.destination);
let framesSubmitted = 0; // Sample frames encoded so far; counting frames avoids timestamp drift
processor.onaudioprocess = (event) => {
const inputBuffer = event.inputBuffer;
const samples = inputBuffer.getChannelData(0);
const audioData = new AudioData({
format: 'f32',
sampleRate: inputBuffer.sampleRate,
numberOfFrames: inputBuffer.length,
numberOfChannels: 1,
// Timestamps are integer microseconds (4096 frames at 48 kHz is about 85 ms per chunk)
timestamp: Math.round((framesSubmitted / inputBuffer.sampleRate) * 1_000_000),
data: samples,
});
audioEncoder.encode(audioData);
audioData.close();
framesSubmitted += inputBuffer.length;
};
}
Audio Decoding and Playback
// Share one AudioContext: creating a context per chunk is expensive, and browsers limit how many can exist
const audioCtx = new AudioContext({ sampleRate: 48000 });
let nextStartTime = 0; // Context time at which the next decoded buffer should start
const audioDecoder = new AudioDecoder({
  output: (audioData: AudioData) => {
    const buffer = audioCtx.createBuffer(
      audioData.numberOfChannels,
      audioData.numberOfFrames,
      audioData.sampleRate
    );
    // Copy each channel out as planar float32, whatever the decoder's native format is
    for (let ch = 0; ch < audioData.numberOfChannels; ch++) {
      const channelData = new Float32Array(audioData.numberOfFrames);
      audioData.copyTo(channelData, { planeIndex: ch, format: 'f32-planar' });
      buffer.copyToChannel(channelData, ch);
    }
    audioData.close();
    // Schedule buffers back to back so consecutive chunks do not play on top of each other
    const bufferSource = audioCtx.createBufferSource();
    bufferSource.buffer = buffer;
    bufferSource.connect(audioCtx.destination);
    nextStartTime = Math.max(nextStartTime, audioCtx.currentTime);
    bufferSource.start(nextStartTime);
    nextStartTime += buffer.duration;
  },
  error: (e: DOMException) => console.error('Audio decoder error:', e),
});
audioDecoder.configure({
  codec: 'opus',
  sampleRate: 48000,
  numberOfChannels: 1,
});
Muxing and Demuxing: Working with Container Formats
The Web Codecs API only handles encoding and decoding; it has no concept of container formats like MP4, WebM, or MKV. To read encoded chunks from a video file, you need a demuxing library. To write encoded chunks to a playable file, you need a muxing library.
Demuxing MP4 Files for Decoding
import { Mp4Demuxer } from './mp4-demuxer'; // Using a demuxing library
async function decodeVideoFile(file: File) {
const buffer = await file.arrayBuffer();
const demuxer = new Mp4Demuxer(buffer);
const decoder = new VideoDecoder({
output: (frame) => {
renderFrameToCanvas(frame);
frame.close();
},
error: console.error,
});
// Configure decoder from the demuxer's track info
const videoTrack = demuxer.getVideoTrack();
decoder.configure({
codec: videoTrack.codec,
codedWidth: videoTrack.width,
codedHeight: videoTrack.height,
description: videoTrack.description, // Codec-specific init data
});
// Decode all chunks from the file
for (const chunk of demuxer.getChunks()) {
decoder.decode(chunk);
}
await decoder.flush();
decoder.close();
}
Muxing Encoded Chunks into WebM
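// Note: the export name and constructor options used below vary between webm-muxer versions; check the README of the version you install
import { WebMMuxer } from 'webm-muxer';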
async function encodeAndMuxToWebM(
canvas: HTMLCanvasElement,
durationSeconds: number
) {
const muxer = new WebMMuxer({
target: 'buffer',
video: { codec: 'V_VP9', width: canvas.width, height: canvas.height },
});
const encoder = new VideoEncoder({
output: (chunk, metadata) => {
muxer.addVideoChunk(chunk, metadata);
},
error: console.error,
});
encoder.configure({
codec: 'vp09.00.10.08',
width: canvas.width,
height: canvas.height,
bitrate: 5_000_000,
framerate: 30,
});
// Encode frames from canvas
for (let i = 0; i < durationSeconds * 30; i++) {
const timestamp = (i / 30) * 1_000_000; // Microseconds
const frame = new VideoFrame(canvas, { timestamp });
encoder.encode(frame, { keyFrame: i % 90 === 0 });
frame.close();
}
await encoder.flush();
encoder.close();
const buffer = muxer.finalize();
// buffer now contains a valid WebM file
downloadBlob(new Blob([buffer], { type: 'video/webm' }), 'output.webm');
}
Real-World Use Case: Screen Recording Application
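A screen recording application demonstrates how the Web Codecs APIs work together. This example captures the screen, encodes video with H.264, encodes system audio with Opus, and hands the encoded chunks to a muxer. Because WebM does not carry H.264, the muxer should produce an MP4 file unless you switch the video codec to VP9 or AV1.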
class ScreenRecorder {
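private videoEncoder!: VideoEncoder; // Assigned in setupEncoder()
private audioEncoder!: AudioEncoder; // Assigned in setupEncoder()
private muxer: any; // Muxer instance from your chosen library; must be created before start()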
private isRecording = false;
private frameCount = 0;
constructor(private outputCanvas?: HTMLCanvasElement) {
this.setupEncoder();
}
private setupEncoder() {
this.videoEncoder = new VideoEncoder({
output: (chunk, meta) => this.muxer.addVideoChunk(chunk, meta),
error: (e) => console.error('Video encoder error:', e),
});
this.audioEncoder = new AudioEncoder({
output: (chunk, meta) => this.muxer.addAudioChunk(chunk, meta),
error: (e) => console.error('Audio encoder error:', e),
});
}
async start() {
const stream = await navigator.mediaDevices.getDisplayMedia({
video: { width: 1920, height: 1080, frameRate: 30 },
audio: {
echoCancellation: false,
noiseSuppression: false,
sampleRate: 48000,
},
});
// Configure encoders
this.videoEncoder.configure({
codec: 'avc1.640028', // High Profile, Level 4.0 (covers 1080p)
width: 1920,
height: 1080,
bitrate: 8_000_000,
framerate: 30,
latencyMode: 'realtime',
});
const audioTracks = stream.getAudioTracks();
if (audioTracks.length > 0) {
this.audioEncoder.configure({
codec: 'opus',
sampleRate: 48000,
numberOfChannels: 1,
bitrate: 128_000,
});
}
this.isRecording = true;
this.frameCount = 0;
// Process video frames using MediaStreamTrackProcessor
const videoTrack = stream.getVideoTracks()[0];
const processor = new MediaStreamTrackProcessor({ track: videoTrack });
const reader = processor.readable.getReader();
// Process audio if available
if (audioTracks.length > 0) {
this.processAudio(audioTracks[0]);
}
// Video processing loop
while (this.isRecording) {
const { done, value } = await reader.read();
if (done) break;
const frame = new VideoFrame(value, {
timestamp: performance.now() * 1000,
});
// Optional: render to canvas for preview or processing
if (this.outputCanvas) {
const ctx = this.outputCanvas.getContext('2d')!;
ctx.drawImage(frame, 0, 0);
}
const isKeyFrame = this.frameCount % 90 === 0; // Every 3 seconds
this.videoEncoder.encode(frame, { keyFrame: isKeyFrame });
frame.close();
value.close();
this.frameCount++;
}
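await this.videoEncoder.flush();
this.videoEncoder.close();
// The audio encoder is configured only when the capture includes an audio track,
// and flush() rejects on an unconfigured encoder
if (this.audioEncoder.state === 'configured') {
await this.audioEncoder.flush();
}
this.audioEncoder.close();
stream.getTracks().forEach(t => t.stop());
}
stop() {
this.isRecording = false;
}
// Reads AudioData from the captured track and feeds it to the audio encoder.
// The encoder's sampleRate and numberOfChannels must match what the track delivers.
private async processAudio(track: MediaStreamTrack) {
const reader = new MediaStreamTrackProcessor({ track }).readable.getReader();
while (this.isRecording) {
const { done, value } = await reader.read();
if (done) break;
if (this.audioEncoder.state !== 'configured') {
value.close();
break;
}
this.audioEncoder.encode(value);
value.close();
}
}
}
Real-Time Video Effects Pipeline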
Combining Web Codecs with Canvas and WebGL enables real-time video effects processing entirely in the browser. This pattern is useful for video conferencing filters, AR overlays, and live production tools.
class VideoEffectsProcessor {
private gl: WebGLRenderingContext;
private program: WebGLProgram;
constructor(private canvas: HTMLCanvasElement) {
this.gl = canvas.getContext('webgl')!;
this.setupShaders();
}
private setupShaders() {
// Vertex shader: pass-through
const vsSource = `
attribute vec2 a_position;
attribute vec2 a_texCoord;
varying vec2 v_texCoord;
void main() {
gl_Position = vec4(a_position, 0.0, 1.0);
v_texCoord = a_texCoord;
}
`;
// Fragment shader: color manipulation effect
const fsSource = `
precision mediump float;
varying vec2 v_texCoord;
uniform sampler2D u_image;
uniform float u_time;
void main() {
vec4 color = texture2D(u_image, v_texCoord);
// Apply a subtle color grading effect
color.r = pow(color.r, 0.9);
color.b = pow(color.b, 1.1);
// Add a vignette
float dist = distance(v_texCoord, vec2(0.5, 0.5));
color.rgb *= smoothstep(0.8, 0.3, dist);
gl_FragColor = color;
}
`;
// Compile and link shader program...
}
async processStream(stream: MediaStream) {
const track = stream.getVideoTracks()[0];
const processor = new MediaStreamTrackProcessor({ track });
const reader = processor.readable.getReader();
while (true) {
const { done, value } = await reader.read();
if (done) break;
const frame = new VideoFrame(value, {
timestamp: performance.now() * 1000,
});
// Upload frame texture to WebGL
this.gl.texImage2D(
this.gl.TEXTURE_2D, 0, this.gl.RGBA,
this.gl.RGBA, this.gl.UNSIGNED_BYTE, frame
);
// Draw with effect shader
this.gl.drawArrays(this.gl.TRIANGLE_STRIP, 0, 4);
// Create output frame from canvas
const outputFrame = new VideoFrame(this.canvas, {
timestamp: frame.timestamp,
});
// Pass outputFrame to an encoder or display here, then release it
outputFrame.close();
frame.close();
value.close();
}
}
}
Worker-Based Architecture for Production Applications
Production Web Codecs applications should run encoding and decoding in Web Workers to avoid blocking the main thread. The VideoFrame and AudioData objects support transferable semantics, allowing zero-copy handoff between threads.
Dedicated Worker for Video Processing
// video-processor.worker.ts
// VideoEncoder, VideoDecoder, and VideoFrame are globals in dedicated workers; no import is needed
class VideoProcessorWorker {
private decoder: VideoDecoder;
private encoder: VideoEncoder;
constructor() {
this.decoder = new VideoDecoder({
output: (frame) => this.processDecodedFrame(frame),
error: (e) => self.postMessage({ type: 'error', message: e.message }),
});
this.encoder = new VideoEncoder({
output: (chunk, meta) => {
const data = new Uint8Array(chunk.byteLength);
chunk.copyTo(data);
self.postMessage(
{ type: 'encoded', data, timestamp: chunk.timestamp, metadata: meta },
[data.buffer] // Transfer ownership of the buffer instead of copying it
);
},
error: (e) => self.postMessage({ type: 'error', message: e.message }),
});
}
private async processDecodedFrame(frame: VideoFrame) {
// Apply transformations, filters, or analysis
// Then re-encode if needed
this.encoder.encode(frame);
frame.close();
}
configure(config: { decoder: VideoDecoderConfig; encoder: VideoEncoderConfig }) {
this.decoder.configure(config.decoder);
this.encoder.configure(config.encoder);
}
decode(data: ArrayBuffer, timestamp: number, type: 'key' | 'delta') {
const chunk = new EncodedVideoChunk({
type,
timestamp,
data,
});
this.decoder.decode(chunk);
}
}
const processor = new VideoProcessorWorker();
self.onmessage = (event) => {
const { action, payload } = event.data;
switch (action) {
case 'configure':
processor.configure(payload);
break;
case 'decode':
processor.decode(payload.data, payload.timestamp, payload.type);
break;
}
};
Transferable Objects for Zero-Copy Communication
When passing VideoFrame objects between threads, use the transfer list to avoid expensive copies of GPU-backed pixel data:
// Main thread → Worker: transfer the frame
const frame = new VideoFrame(videoElement, { timestamp: Math.round(performance.now() * 1000) });
worker.postMessage({ frame, timestamp: frame.timestamp }, [frame]);
// frame is now detached on the main thread and cannot be used here
// In the worker: the frame arrives without its pixel data being copied
self.onmessage = (event) => {
const { frame } = event.data;
// frame is valid and usable in the worker, which now owns it and must close() it when done
processFrame(frame);
};
Video Transcoding Pipeline
Building a browser-based video transcoder demonstrates the full power of Web Codecs. This pattern reads a file, decodes it, optionally applies transformations, and re-encodes in a different format or quality.
class BrowserTranscoder {
async transcode(inputFile: File, outputConfig: VideoEncoderConfig) {
const buffer = await inputFile.arrayBuffer();
const demuxer = new Mp4Demuxer(buffer);
const muxer = new WebMMuxer({
target: 'buffer',
video: { codec: 'V_VP9', width: outputConfig.width!, height: outputConfig.height! },
});
const encoder = new VideoEncoder({
output: (chunk, meta) => muxer.addVideoChunk(chunk, meta),
error: console.error,
});
encoder.configure(outputConfig);
const decoder = new VideoDecoder({
output: (frame) => {
// Optional: resize, crop, or apply effects here
encoder.encode(frame);
frame.close();
},
error: console.error,
});
const videoTrack = demuxer.getVideoTrack();
decoder.configure({
codec: videoTrack.codec,
codedWidth: videoTrack.width,
codedHeight: videoTrack.height,
description: videoTrack.description,
});
for (const chunk of demuxer.getChunks()) {
decoder.decode(chunk);
}
await decoder.flush();
await encoder.flush();
decoder.close();
encoder.close();
return muxer.finalize(); // Returns ArrayBuffer containing WebM data
}
}
This transcoding pipeline runs entirely in the browser with no server round-trips. For large files, integrate it with the worker architecture above to keep the UI responsive.
WebRTC Integration
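The Web Codecs API can complement WebRTC for real-time communication, giving you fine-grained control over encoding parameters that the standard WebRTC API does not expose. Note that RTCPeerConnection does not accept externally encoded chunks directly, so the example below leaves the transport step as an application-specific placeholder.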
// Use WebCodecs encoder for custom WebRTC sender
async function setupCustomWebRTCSender(pc: RTCPeerConnection, stream: MediaStream) {
const videoTrack = stream.getVideoTracks()[0];
const processor = new MediaStreamTrackProcessor({ track: videoTrack });
const reader = processor.readable.getReader();
const encoder = new VideoEncoder({
output: (chunk, metadata) => {
// The encoded frame arrives here as `chunk`; look up the sender that carries this track
const sender = pc.getSenders().find(s => s.track === videoTrack);
if (sender) {
// Use insertable streams or encoded transform
// for direct encoded frame injection
}
},
error: console.error,
});
encoder.configure({
codec: 'avc1.42001f', // Baseline for max WebRTC compatibility
width: 640,
height: 480,
bitrate: 1_000_000,
framerate: 30,
latencyMode: 'realtime',
});
// Adaptive bitrate based on network conditions
function adjustBitrate(availableBandwidth: number) {
encoder.configure({
codec: 'avc1.42001f',
width: 640,
height: 480,
bitrate: Math.min(availableBandwidth * 0.8, 2_500_000),
framerate: 30,
latencyMode: 'realtime',
});
}
}
Memory Management Best Practices
The Web Codecs API deals with large binary buffers and GPU-backed frame data. Improper memory management leads to memory leaks that degrade performance and can crash the application.
The Golden Rule: Always Close Resources
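Every VideoFrame and AudioData holds a reference to an underlying media resource, often GPU memory, that the garbage collector may not reclaim promptly. These objects must be released explicitly with close(). EncodedVideoChunk and EncodedAudioChunk have no close() method and are reclaimed by ordinary garbage collection: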
// WRONG: Memory leak â frames accumulate
const frames: VideoFrame[] = [];
decoder.configure({ codec: 'avc1.64001E', codedWidth: 1920, codedHeight: 1080 });
// ... decode many frames into array without closing
// CORRECT: Close each frame after processing
const decoder = new VideoDecoder({
output: (frame: VideoFrame) => {
try {
processFrame(frame);
} finally {
frame.close(); // Always close, even if processing fails
}
},
error: console.error,
});
Handling Backpressure
When processing cannot keep up with the input rate, implement flow control:
class BackpressureAwareDecoder {
private decoder: VideoDecoder;
private pendingFrames = 0;
private maxPending = 3;
constructor() {
this.decoder = new VideoDecoder({
output: (frame) => {
this.pendingFrames--;
this.processAndClose(frame);
},
error: console.error,
});
}
async decode(chunk: EncodedVideoChunk) {
// Wait if too many frames are pending
while (this.pendingFrames >= this.maxPending) {
await new Promise(resolve => requestAnimationFrame(resolve));
}
this.pendingFrames++;
this.decoder.decode(chunk);
}
private processAndClose(frame: VideoFrame) {
try {
// Process the frame...
} finally {
frame.close();
}
}
}
Browser Compatibility
The Web Codecs API has broad support across modern browsers, though audio support lags behind video support on some platforms:
| Feature | Chrome | Edge | Firefox | Safari |
|---|---|---|---|---|
| VideoDecoder | 94+ | 94+ | 130+ (desktop) | 16.4+ |
| VideoEncoder | 94+ | 94+ | 130+ (desktop) | 16.4+ |
| AudioDecoder | 94+ | 94+ | 130+ (desktop) | No |
| AudioEncoder | 94+ | 94+ | 130+ (desktop) | No |
| VideoFrame | 94+ | 94+ | 130+ (desktop) | 16.4+ |
| AudioData | 94+ | 94+ | 130+ (desktop) | No |
| Hardware acceleration | Yes | Yes | Partial | Yes |
Feature Detection and Fallbacks
function isWebCodecsSupported(): boolean {
return (
typeof VideoEncoder !== 'undefined' &&
typeof VideoDecoder !== 'undefined' &&
typeof VideoFrame !== 'undefined'
);
}
function getMediaProcessingStrategy() {
if (isWebCodecsSupported()) {
return 'webcodecs'; // Best: frame-level control with hardware acceleration
}
if (typeof MediaRecorder !== 'undefined') {
return 'mediarecorder'; // Good: simple API but less control
}
if (typeof MediaSource !== 'undefined') {
return 'mediasource'; // Fallback: MSE for streaming
}
return 'server'; // Last resort: send to server for processing
}
Performance Considerations
Hardware vs. Software Encoding
Browsers automatically choose between hardware (GPU) and software (CPU) encoding based on the configuration and available hardware. Hardware encoding is usually significantly faster but may produce slightly lower quality at the same bitrate. The hardwareAcceleration option is a hint rather than a guarantee; set it to prefer hardware when latency matters:
encoder.configure({
codec: 'avc1.640028', // High Profile, Level 4.0 (covers 1080p)
width: 1920,
height: 1080,
bitrate: 5_000_000,
hardwareAcceleration: 'prefer-hardware', // Other values: 'prefer-software', 'no-preference'
});
OffscreenCanvas for Worker-Based Processing
Move expensive frame processing off the main thread using OffscreenCanvas in a Web Worker:
// In a Web Worker
self.onmessage = async (event) => {
const { frame, width, height } = event.data;
const canvas = new OffscreenCanvas(width, height);
const ctx = canvas.getContext('2d')!;
ctx.drawImage(frame, 0, 0);
// Apply pixel-level processing
const imageData = ctx.getImageData(0, 0, width, height);
const pixels = imageData.data;
// Example: Convert to grayscale
for (let i = 0; i < pixels.length; i += 4) {
const gray = pixels[i] * 0.299 + pixels[i+1] * 0.587 + pixels[i+2] * 0.114;
pixels[i] = pixels[i+1] = pixels[i+2] = gray;
}
ctx.putImageData(imageData, 0, 0);
// Return processed frame
const outputFrame = new VideoFrame(canvas, { timestamp: frame.timestamp });
frame.close();
self.postMessage({ frame: outputFrame }, [outputFrame]);
};
Common Pitfalls and Solutions
| Pitfall | Impact | Solution |
|---|---|---|
| Not calling frame.close() | GPU memory leak, eventual crash | Always close in a finally block |
| Missing key frames at start | Decoder error, no output | Send key frame first after configure() |
| Using unsupported codec string | NotSupportedError | Call isConfigSupported() first |
| Queue overflow | Frames dropped, OOM | Monitor encodeQueueSize, implement backpressure |
| Wrong timestamp units | Audio/video desync | Web Codecs uses microseconds, not milliseconds |
| Ignoring decoderConfig in metadata | Cannot reconstruct file | Store decoderConfig with encoded chunks |
| Calling flush() too frequently | Degrades encode quality | Only flush at natural boundaries |
| Mixing pixel formats | Garbled output | Ensure VideoFrame.format matches encoder expectations |
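The timestamp-units pitfall is easiest to avoid by funnelling every conversion through a few small helpers that always return integer microseconds. The following is a minimal sketch; the function names are hypothetical and not part of the Web Codecs API.

```typescript
const MICROSECONDS_PER_SECOND = 1_000_000;

// Converts a millisecond value (e.g. from performance.now()) to integer microseconds.
function msToMicroseconds(ms: number): number {
  return Math.round(ms * 1000);
}

// Timestamp of the Nth video frame at a constant frame rate.
// Computing from the index avoids the drift of repeatedly adding a rounded duration.
function frameTimestampUs(frameIndex: number, fps: number): number {
  return Math.round((frameIndex * MICROSECONDS_PER_SECOND) / fps);
}

// Timestamp of an audio chunk that starts after `sampleFrames` frames have been submitted.
function audioTimestampUs(sampleFrames: number, sampleRate: number): number {
  return Math.round((sampleFrames * MICROSECONDS_PER_SECOND) / sampleRate);
}
```

Deriving video and audio timestamps from frame and sample counts on the same microsecond timeline keeps the two streams aligned when they are later muxed together.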
Conclusion
The Web Codecs API is a transformative technology that brings native-grade media processing capabilities to the browser. By exposing hardware-accelerated encoders and decoders at the frame level, it enables a new class of web applications, from browser-based video editors and real-time streaming tools to AI-powered computer vision pipelines and custom transcoding services.
Key takeaways from this guide:
- Frame-level control: Process individual video frames and audio samples with pixel-perfect precision, enabling effects, analysis, and transformations that were previously impossible in the browser.
- Hardware acceleration: The API leverages GPU encoders and decoders automatically, delivering performance comparable to native applications without requiring plugins or server-side processing.
- Codec flexibility: Support for H.264, VP9, AV1, Opus, AAC, and more gives you the freedom to choose the right codec for your use case, whether prioritizing compatibility, quality, or compression efficiency.
- Memory discipline is mandatory: Every VideoFrame and AudioData must be explicitly closed. Build cleanup into your processing pipelines from the start, using try/finally patterns and backpressure management.
- Container formats require external libraries: Web Codecs handles encoding and decoding only. For reading and writing MP4 or WebM files, use dedicated muxing and demuxing libraries like Mediabunny or webm-muxer.
- Feature detection is essential: Browser support varies, especially for audio codecs. Always check isConfigSupported() and provide graceful fallbacks for unsupported environments.
The ecosystem around Web Codecs continues to mature rapidly, with growing library support for container formats, improving hardware encoder availability for AV1, and expanding Safari support for audio interfaces. For any application that needs low-level media processing in the browser, Web Codecs is the foundation to build on.