AI SDK by Vercel: Building AI-Powered React Applications

Introduction

Building AI-powered user interfaces is harder than it should be. You need to handle streaming responses, manage conversation state, implement tool calling, parse structured output, manage loading states, and gracefully handle errors — all while providing a smooth user experience. The Vercel AI SDK solves these problems with a unified, framework-agnostic toolkit that makes building AI features as straightforward as building any other React component. Since its release, it has become the standard way to integrate LLMs into Next.js and React applications, with over 1.5 million weekly npm downloads and adoption by companies like Notion, Linear, and Vercel itself.

The AI SDK's key innovation is abstracting the differences between LLM providers behind a unified API. Whether you're using OpenAI, Anthropic, Google, or open-source models through Ollama, the same code works. Switching providers requires changing one line of code, not rewriting your entire integration. This provider-agnostic approach eliminates vendor lock-in and makes it easy to use the best model for each task — GPT-4o for reasoning, Claude for long context, Gemini for multimodal inputs, or Llama for local inference.

The SDK is organized around two main libraries: AI SDK Core (server-side functions for generating text, structured data, tool calls, and embeddings) and AI SDK UI (hooks such as useChat for building chat and generative interfaces). Retrieval-augmented generation is assembled from Core primitives such as embed and embedMany together with your own vector store. Together, these pieces cover most of AI application development, from simple text generation to complex agentic workflows with streaming UI.

Understanding the AI SDK: Core Concepts

Provider Abstraction

The AI SDK uses a provider system that normalizes the differences between LLM APIs. Each provider (OpenAI, Anthropic, Google, Mistral, Cohere, Amazon Bedrock, etc.) implements a standard interface, so your application code doesn't change when you switch models or providers. Providers are installed as separate packages, keeping your bundle size minimal.

import { openai } from '@ai-sdk/openai';
import { anthropic } from '@ai-sdk/anthropic';
import { google } from '@ai-sdk/google';
import { mistral } from '@ai-sdk/mistral';
import { ollama } from 'ollama-ai-provider';
 
// The same call sites work with any provider; each model gets its own
// binding, because redeclaring one `const` five times would not compile
const gpt4o = openai('gpt-4o');                                  // OpenAI
const claude = anthropic('claude-3-5-sonnet-20241022');          // Anthropic
const gemini = google('gemini-2.0-flash');                       // Google
const mistralLarge = mistral('mistral-large-latest');            // Mistral
const llama = ollama('llama3.1');                                // Local via Ollama

You can also use custom providers or connect to any OpenAI-compatible API (like Azure OpenAI, Together AI, or Fireworks) by configuring the base URL and API key in the provider constructor.
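The "change one line to switch providers" idea can be pictured as a lookup from a `provider:model` string to an object that implements a shared interface. The following is a minimal sketch of that pattern, not the SDK's own implementation: `ChatModel`, `fakeModel`, `registry`, and `resolveModel` are hypothetical names, and the stand-in models only echo their input instead of calling a real API.

```typescript
// A common interface that every (hypothetical) provider adapter implements.
interface ChatModel {
  readonly provider: string;
  readonly modelId: string;
  complete(prompt: string): string;
}

// Factory for stand-in models; a real adapter would call the vendor's HTTP API here.
function fakeModel(provider: string, modelId: string): ChatModel {
  return {
    provider,
    modelId,
    complete: (prompt) => `[${provider}/${modelId}] ${prompt}`,
  };
}

// Registry keyed by provider name; each entry builds a model for a given id.
const registry: Record<string, (modelId: string) => ChatModel> = {
  openai: (id) => fakeModel('openai', id),
  anthropic: (id) => fakeModel('anthropic', id),
};

// Resolve "provider:modelId" into a model, failing loudly on unknown providers.
function resolveModel(spec: string): ChatModel {
  const separator = spec.indexOf(':');
  if (separator === -1) throw new Error(`Expected "provider:model", got "${spec}"`);
  const provider = spec.slice(0, separator);
  const factory = registry[provider];
  if (!factory) throw new Error(`Unknown provider: ${provider}`);
  return factory(spec.slice(separator + 1));
}

// Application code depends only on ChatModel, so switching is a one-string change.
const MODEL_SPEC = 'anthropic:claude-3-5-sonnet-20241022';
const reply = resolveModel(MODEL_SPEC).complete('Hello');
```

Because callers only see `ChatModel`, swapping `MODEL_SPEC` for `'openai:gpt-4o'` changes the backend without touching any other code.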

Streaming

The AI SDK is built around streaming by default. Text generation streams tokens as they're produced using Server-Sent Events (SSE), enabling real-time UI updates. This makes AI interfaces feel responsive even for long responses, because users see output immediately rather than waiting for the complete response. The SDK takes care of chunk framing and parsing on both ends of the connection, so you rarely need to manage low-level streaming details yourself; you still decide how to surface errors and cancellations in your UI.
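To make the chunking concern concrete, here is a minimal sketch of an incremental parser for SSE-style `data:` frames. It is an illustration of the general technique rather than the SDK's actual wire protocol; `SseParser` and the sample chunks are hypothetical. The key point is that network chunks can split an event anywhere, so the parser buffers input until it sees the blank line that ends an event.

```typescript
// Minimal incremental parser for Server-Sent Events "data:" frames.
class SseParser {
  private buffer = '';

  // Feed one network chunk; returns the payload of every event it completes.
  feed(chunk: string): string[] {
    this.buffer += chunk;
    const events: string[] = [];
    let boundary = this.buffer.indexOf('\n\n');
    while (boundary !== -1) {
      const rawEvent = this.buffer.slice(0, boundary);
      this.buffer = this.buffer.slice(boundary + 2);
      // Join all data lines of the event, dropping the single optional leading space.
      const data = rawEvent
        .split('\n')
        .filter((line) => line.startsWith('data:'))
        .map((line) => line.slice(5).replace(/^ /, ''))
        .join('\n');
      if (data.length > 0) events.push(data);
      boundary = this.buffer.indexOf('\n\n');
    }
    return events;
  }
}

// Chunks deliberately split events mid-word, as a real network might.
const parser = new SseParser();
const chunks = ['data: Hel', 'lo,\n\ndata:  wor', 'ld\n\n'];
let text = '';
for (const chunk of chunks) {
  for (const token of parser.feed(chunk)) {
    text += token; // a UI would re-render here with the partial text
  }
}
```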

Server Actions Integration

The SDK integrates seamlessly with Next.js Server Actions and Route Handlers. You define AI functions on the server and call them from client components using the SDK's React hooks. This keeps API keys secure on the server while providing a smooth client-side experience. The SDK also supports Edge Runtime for lower latency, and can run on any Node.js server outside of Next.js.

Type Safety

The SDK is written in TypeScript with full type inference. Tool definitions, structured output schemas, and message types are all strongly typed, catching errors at compile time rather than runtime. Zod schemas used for structured output are automatically converted to JSON Schema for the LLM and TypeScript types for your application code.

Architecture and Design Patterns

The Server Action Pattern

Define AI functions as Next.js Server Actions that return streaming responses. The client calls these actions using the SDK's hooks, which handle streaming, state management, and error handling automatically. This pattern works well with Next.js App Router and React Server Components, keeping AI logic on the server.

The Route Handler Pattern

Create dedicated API routes for AI operations. This provides more control over request/response handling and enables middleware for authentication, rate limiting, and logging. Route handlers are ideal when you need to support multiple client types (web, mobile, CLI) or when you want to expose AI functionality as a REST API.
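As one example of the middleware a Route Handler can run before calling the model, the sketch below implements a fixed-window rate limiter. It assumes a single process with in-memory state, which does not hold on serverless platforms where a shared store would be needed; `FixedWindowLimiter` is a hypothetical name, not part of the SDK.

```typescript
// Fixed-window rate limiter keyed by user id or IP (in-memory, single process only).
class FixedWindowLimiter {
  private readonly windows = new Map<string, { start: number; count: number }>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed; `now` is injectable for testing.
  allow(key: string, now: number = Date.now()): boolean {
    const current = this.windows.get(key);
    if (!current || now - current.start >= this.windowMs) {
      this.windows.set(key, { start: now, count: 1 });
      return true;
    }
    if (current.count >= this.limit) return false;
    current.count += 1;
    return true;
  }
}

// Two requests per minute per key.
const limiter = new FixedWindowLimiter(2, 60_000);
const decisions = [
  limiter.allow('user-1', 0),
  limiter.allow('user-1', 1_000),
  limiter.allow('user-1', 2_000), // third request in the same window is rejected
  limiter.allow('user-1', 60_000), // a new window has started
];
```

In a handler, a `false` result would typically translate into an HTTP 429 response before any tokens are spent.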

The Hook Pattern

Use the SDK's React hooks (useChat, useCompletion, and useObject, which some versions export as experimental_useObject) to manage AI state in client components. These hooks handle loading states, streaming updates, error handling, and message history automatically. Each hook returns its streamed value (messages, completion, or object) alongside error and loading state and helper functions such as stop, which makes them predictable and composable.
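The state a chat hook manages can be modeled as a small reducer over streaming events. The sketch below is an assumption about the general shape of that logic, not the SDK's source; `ChatState`, `ChatEvent`, and `chatReducer` are hypothetical names.

```typescript
type Role = 'user' | 'assistant';
interface ChatMessage { id: string; role: Role; content: string }
interface ChatState { messages: ChatMessage[]; isLoading: boolean; error: string | null }

type ChatEvent =
  | { type: 'submit'; id: string; text: string }
  | { type: 'chunk'; id: string; delta: string }
  | { type: 'finish' }
  | { type: 'error'; message: string };

// Pure state transition: each event produces a new state object.
function chatReducer(state: ChatState, event: ChatEvent): ChatState {
  switch (event.type) {
    case 'submit':
      return {
        messages: [...state.messages, { id: event.id, role: 'user', content: event.text }],
        isLoading: true,
        error: null,
      };
    case 'chunk': {
      const last = state.messages[state.messages.length - 1];
      // Append to the in-progress assistant message, or start one on the first chunk.
      if (last && last.role === 'assistant' && last.id === event.id) {
        const updated = { ...last, content: last.content + event.delta };
        return { ...state, messages: [...state.messages.slice(0, -1), updated] };
      }
      return {
        ...state,
        messages: [...state.messages, { id: event.id, role: 'assistant', content: event.delta }],
      };
    }
    case 'finish':
      return { ...state, isLoading: false };
    case 'error':
      return { ...state, isLoading: false, error: event.message };
  }
}

// Replay one exchange: a user message followed by two streamed chunks.
const events: ChatEvent[] = [
  { type: 'submit', id: 'u1', text: 'Hi' },
  { type: 'chunk', id: 'a1', delta: 'Hel' },
  { type: 'chunk', id: 'a1', delta: 'lo!' },
  { type: 'finish' },
];
const initial: ChatState = { messages: [], isLoading: false, error: null };
const finalState = events.reduce(chatReducer, initial);
```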

The Schema-First Pattern

Define your data schemas using Zod, and use them for both type safety in your application and structured output from the LLM. Because the model's output is validated against the same schema that produces your TypeScript types, anything that reaches your application code matches your type system. Schemas also serve as documentation for what the LLM should produce, which tends to improve output quality.
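The core idea, that one schema definition yields both a runtime validator and a static type, can be shown without any library. The sketch below hand-rolls a tiny subset of what Zod provides; `Schema`, `Infer`, `str`, `num`, `oneOf`, and `object` are hypothetical helpers written for this illustration and are far less capable than the real thing.

```typescript
// A schema is a runtime parser whose return type doubles as the static type.
interface Schema<T> {
  parse(value: unknown): T;
}
type Infer<S> = S extends Schema<infer T> ? T : never;
type Shape<S> = { [K in keyof S]: Infer<S[K]> };

const str: Schema<string> = {
  parse(value) {
    if (typeof value !== 'string') throw new Error('expected a string');
    return value;
  },
};

const num: Schema<number> = {
  parse(value) {
    if (typeof value !== 'number' || Number.isNaN(value)) throw new Error('expected a number');
    return value;
  },
};

// Accepts only one of the listed string literals.
function oneOf<T extends string>(options: readonly T[]): Schema<T> {
  return {
    parse(value) {
      const match = options.find((option) => option === value);
      if (match === undefined) throw new Error(`expected one of: ${options.join(', ')}`);
      return match;
    },
  };
}

// Validates each declared field and drops anything not in the shape.
function object<S extends Record<string, Schema<unknown>>>(shape: S): Schema<Shape<S>> {
  return {
    parse(value) {
      if (typeof value !== 'object' || value === null) throw new Error('expected an object');
      const input = value as Record<string, unknown>;
      const output: Record<string, unknown> = {};
      for (const [key, field] of Object.entries(shape)) {
        output[key] = field.parse(input[key]);
      }
      return output as unknown as Shape<S>;
    },
  };
}

const sentimentSchema = object({
  sentiment: oneOf(['positive', 'negative', 'neutral'] as const),
  confidence: num,
  summary: str,
});
// The static type is derived from the schema, never written by hand.
type Sentiment = Infer<typeof sentimentSchema>;

const raw = '{"sentiment":"positive","confidence":0.9,"summary":"Great product"}';
const result: Sentiment = sentimentSchema.parse(JSON.parse(raw));
```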

Step-by-Step Implementation

Setting Up the AI SDK

npm install ai @ai-sdk/openai

// app/api/chat/route.ts
import { openai } from '@ai-sdk/openai';
import { streamText } from 'ai';
 
export const maxDuration = 30;
 
export async function POST(req: Request) {
  const { messages } = await req.json();
 
  const result = streamText({
    model: openai('gpt-4o'),
    system: 'You are a helpful assistant. Be concise and accurate.',
    messages,
  });
 
  return result.toDataStreamResponse();
}

Building a Chat Interface

The useChat hook is the primary way to build conversational UIs. It manages the full message lifecycle — appending user messages, streaming assistant responses, handling tool calls, and maintaining conversation history. The hook communicates with your API endpoint via a streaming protocol that supports text chunks, tool invocations, and structured data.

// app/chat/page.tsx
'use client';
 
import { useChat } from 'ai/react';
 
export default function ChatPage() {
  const { messages, input, handleInputChange, handleSubmit, isLoading, error, stop, reload } = useChat({
    api: '/api/chat',
    onError: (err) => console.error('Chat error:', err),
    onFinish: (message) => console.log('Finished:', message),
  });
 
  return (
    <div className="flex flex-col h-screen max-w-2xl mx-auto p-4">
      <div className="flex-1 overflow-y-auto space-y-4 mb-4">
        {messages.map((message) => (
          <div
            key={message.id}
            className={`p-3 rounded-lg ${
              message.role === 'user' ? 'bg-blue-100 ml-auto' : 'bg-gray-100'
            } max-w-[80%]`}
          >
            <p className="text-sm font-semibold mb-1">
              {message.role === 'user' ? 'You' : 'AI'}
            </p>
            <div className="whitespace-pre-wrap">{message.content}</div>
          </div>
        ))}
        {isLoading && (
          <div className="bg-gray-100 p-3 rounded-lg animate-pulse">
            Thinking...
          </div>
        )}
        {error && (
          <div className="bg-red-100 p-3 rounded-lg text-red-700">
            Error: {error.message}
          </div>
        )}
      </div>
 
      <form onSubmit={handleSubmit} className="flex gap-2">
        <input
          value={input}
          onChange={handleInputChange}
          placeholder="Type a message..."
          className="flex-1 p-2 border rounded-lg"
          disabled={isLoading}
        />
        <button
          type="submit"
          disabled={isLoading}
          className="px-4 py-2 bg-blue-500 text-white rounded-lg disabled:opacity-50"
        >
          Send
        </button>
      </form>
    </div>
  );
}

Implementing Tool Calling

Tool calling lets the LLM invoke functions that you define to fetch data, perform calculations, or trigger actions. The SDK handles the full round-trip: the LLM requests a tool call, the SDK executes your function, and feeds the result back to the LLM for continued reasoning. With maxSteps, you can enable multi-step tool use where the LLM chains multiple tool calls together.

// app/api/chat-with-tools/route.ts
import { openai } from '@ai-sdk/openai';
import { streamText, tool } from 'ai';
import { z } from 'zod';
 
export async function POST(req: Request) {
  const { messages } = await req.json();
 
  const result = streamText({
    model: openai('gpt-4o'),
    system: 'You are a helpful assistant with access to tools.',
    messages,
    tools: {
      getWeather: tool({
        description: 'Get current weather for a location',
        parameters: z.object({
          location: z.string().describe('City name'),
          units: z.enum(['celsius', 'fahrenheit']).default('celsius'),
        }),
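        execute: async ({ location, units }) => {
          // Encode the city name so spaces and '&' cannot break the query string
          const response = await fetch(
            `https://api.weather.com/v1/current?q=${encodeURIComponent(location)}&units=${units}`
          );
          // Surface HTTP failures instead of handing an error page to the model
          if (!response.ok) throw new Error(`Weather request failed: ${response.status}`);
          return response.json();
        },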
      }),
      searchProducts: tool({
        description: 'Search the product catalog',
        parameters: z.object({
          query: z.string(),
          category: z.string().optional(),
          maxPrice: z.number().optional(),
        }),
        execute: async ({ query, category, maxPrice }) => {
          // In production, query your database
          return {
            products: [
              { name: `Result for "${query}"`, price: 29.99, inStock: true },
            ],
            total: 1,
          };
        },
      }),
    },
    maxSteps: 5, // Allow multi-step tool use
  });
 
  return result.toDataStreamResponse();
}
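The round trip described above (model requests a tool, the runtime executes it, the result goes back to the model, bounded by a step limit) can be sketched as a plain loop. This is a simplified model of the behavior under stated assumptions, not the SDK's implementation: `fakeModel` is a scripted stand-in for an LLM, and `ModelStep`, `ToolSet`, and `runToolLoop` are hypothetical names.

```typescript
// One step of model output: either a final answer or a request to call a tool.
type ModelStep =
  | { type: 'text'; text: string }
  | { type: 'tool-call'; toolName: string; args: Record<string, unknown> };

type ToolSet = Record<string, (args: Record<string, unknown>) => unknown>;
type Transcript = Array<{ role: 'user' | 'assistant' | 'tool'; content: string }>;

// Scripted stand-in for a model: asks for the weather once, then answers.
function fakeModel(transcript: Transcript): ModelStep {
  const toolResult = transcript.find((entry) => entry.role === 'tool');
  if (!toolResult) {
    return { type: 'tool-call', toolName: 'getWeather', args: { location: 'Paris' } };
  }
  return { type: 'text', text: `Weather report: ${toolResult.content}` };
}

// Run the call-tool / feed-result-back loop, stopping after maxSteps model calls.
function runToolLoop(prompt: string, tools: ToolSet, maxSteps: number): string {
  const transcript: Transcript = [{ role: 'user', content: prompt }];
  for (let step = 0; step < maxSteps; step++) {
    const output = fakeModel(transcript);
    if (output.type === 'text') return output.text;
    const tool = tools[output.toolName];
    if (!tool) throw new Error(`Unknown tool: ${output.toolName}`);
    transcript.push({ role: 'tool', content: JSON.stringify(tool(output.args)) });
  }
  return '(step limit reached before a final answer)';
}

const tools: ToolSet = {
  getWeather: (args) => ({ location: args['location'], temperatureC: 18 }),
};
const answer = runToolLoop('What is the weather in Paris?', tools, 5);
```

With a step limit of 1 the loop would execute the tool but never get a second model call to phrase the answer, which is why multi-step tool use needs a limit greater than one.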

Structured Output with AI SDK

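The generateObject function asks the LLM for output that conforms to a Zod schema and validates the result before returning it. This is invaluable for data extraction, classification, form filling, and any scenario where you need reliable, typed output rather than freeform text. The SDK uses the provider's structured-output or tool-calling mode where supported and falls back to JSON prompting otherwise; in every case the result is checked against the schema, and a response that fails validation raises an error instead of reaching your code as malformed data.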

// app/api/analyze/route.ts
import { openai } from '@ai-sdk/openai';
import { generateObject } from 'ai';
import { z } from 'zod';
 
const sentimentSchema = z.object({
  sentiment: z.enum(['positive', 'negative', 'neutral', 'mixed']),
  confidence: z.number().min(0).max(1),
  keyPoints: z.array(z.object({
    text: z.string(),
    sentiment: z.enum(['positive', 'negative']),
  })),
  summary: z.string(),
});
 
export async function POST(req: Request) {
  const { text } = await req.json();
 
  const { object } = await generateObject({
    model: openai('gpt-4o'),
    schema: sentimentSchema,
    prompt: `Analyze the sentiment of this text:\n\n${text}`,
  });
 
  return Response.json(object);
}

Multi-Modal Chat with Image Understanding

The SDK supports multimodal inputs including images, audio, and video (depending on the provider). You can pass image URLs, base64-encoded images, or file buffers as part of the message content array. This enables building applications that can analyze screenshots, read documents, describe photos, or extract data from charts.

// app/api/vision/route.ts
import { openai } from '@ai-sdk/openai';
import { streamText } from 'ai';
 
export async function POST(req: Request) {
  const { messages, imageUrl } = await req.json();
 
  const result = streamText({
    model: openai('gpt-4o'),
    messages: [
      ...messages,
      {
        role: 'user',
        content: [
          { type: 'text', text: 'Analyze this image in detail.' },
          { type: 'image', image: imageUrl },
        ],
      },
    ],
  });
 
  return result.toDataStreamResponse();
}

Embeddings for Semantic Search

The SDK provides unified embed and embedMany functions for generating vector embeddings. These are essential for building semantic search, RAG pipelines, recommendation engines, and clustering applications.

import { openai } from '@ai-sdk/openai';
import { embed, embedMany } from 'ai';
 
// Single embedding
const { embedding } = await embed({
  model: openai.embedding('text-embedding-3-small'),
  value: 'What is the Vercel AI SDK?',
});
 
// Batch embeddings (more efficient)
const { embeddings } = await embedMany({
  model: openai.embedding('text-embedding-3-small'),
  values: ['First document', 'Second document', 'Third document'],
});
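Once documents and queries are embedded, semantic search reduces to ranking by vector similarity. The sketch below uses hand-written three-dimensional vectors as stand-ins for real embeddings (which typically have hundreds or thousands of dimensions); `cosineSimilarity` and `rankBySimilarity` are local helpers written for this example, and the document texts are made up. Recent versions of the `ai` package also ship a `cosineSimilarity` helper.

```typescript
interface EmbeddedDoc { text: string; embedding: number[] }

// Cosine similarity: dot product divided by the product of vector lengths.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Vectors must have the same length');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the topK documents most similar to the query embedding.
function rankBySimilarity(query: number[], docs: EmbeddedDoc[], topK: number): EmbeddedDoc[] {
  return docs
    .map((doc) => ({ doc, score: cosineSimilarity(query, doc.embedding) }))
    .sort((left, right) => right.score - left.score)
    .slice(0, topK)
    .map((entry) => entry.doc);
}

// Toy vectors standing in for real embeddings.
const docs: EmbeddedDoc[] = [
  { text: 'Refund policy', embedding: [1, 0, 0] },
  { text: 'Shipping times', embedding: [0, 1, 0] },
  { text: 'Return an item', embedding: [0.8, 0.6, 0] },
];
const queryEmbedding = [0.9, 0.1, 0];
const results = rankBySimilarity(queryEmbedding, docs, 2).map((doc) => doc.text);
```

In a RAG pipeline, the text of the top-ranked documents would then be placed into the prompt as context.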

Real-World Use Cases

Customer Support Chatbot

Build a customer support chatbot that uses RAG to answer questions from your knowledge base. The AI SDK handles streaming responses, tool calling for knowledge base search, and structured output for ticket creation. Use useChat with message persistence to maintain conversation history across page reloads, and implement guardrails to prevent the bot from answering off-topic questions.

Code Assistant

Create an AI code assistant that generates, reviews, and explains code. Use tool calling to execute code in a sandbox, search documentation, and interact with your codebase. The streaming UI shows code generation in real-time. Add syntax highlighting with a library like Shiki or Prism, and implement diff views to show code changes clearly.

Content Generation Platform

Build a platform for generating blog posts, social media content, and marketing copy. Use structured output to ensure generated content matches your brand guidelines, and streaming to show content as it's generated. Implement generateObject for metadata extraction (tags, summaries, SEO titles) and streamText for the main content body.

Data Analysis Dashboard

Create a natural language interface for data analysis. Users ask questions about their data, and the AI generates SQL queries, runs them, and presents results with visualizations — all streamed in real-time. Use tool calling to expose database query functions, chart generation, and data export as available actions for the LLM.

AI-Powered Search

Build semantic search that understands user intent, not just keywords. Use the SDK's embed function to vectorize your documents, store embeddings in a vector database (Pinecone, pgvector, Upstash), and use generateText with retrieved context to produce natural language answers instead of raw search results.

Best Practices for Production

Use streaming by default: Streaming provides a much better user experience because users see output immediately rather than waiting for the complete response. Time-to-first-token is one of the most impactful UX metrics for AI applications.
Implement proper error handling: Handle network errors, rate limits, and model errors gracefully. Show user-friendly error messages and provide retry options. Use try/catch around your AI functions and implement exponential backoff for transient failures.
Keep API keys server-side: Never expose API keys in client code. Use Server Actions or Route Handlers to keep keys on the server. The SDK's architecture encourages this, since models are initialized in server-side code.
Use structured output for data: When you need typed data from the LLM, use generateObject with Zod schemas instead of parsing text responses. This removes hand-written JSON parsing and gives you compile-time types for the result.
Set appropriate timeouts: AI requests can take 10-30 seconds depending on the model and prompt complexity. Configure maxDuration on your Route Handlers and set client-side timeouts to avoid premature failures.
Implement rate limiting: Protect your API endpoints from abuse. Rate limit by user, IP, or API key to control costs. Use middleware or a service like Upstash Ratelimit for serverless-friendly rate limiting.
Cache common responses: For frequently asked questions or repeated queries, cache responses to reduce API costs and latency. Semantic caching (matching similar queries, not just identical ones) raises the hit rate further.
Monitor costs and usage: Track token usage, API costs, and response times using the onFinish callback, which provides token counts. Set up alerts for unusual usage patterns to catch runaway costs early.
Choose the right model for each task: Use smaller, faster models (GPT-4o-mini, Haiku) for simple tasks like classification or extraction. Reserve larger models for complex reasoning, code generation, or multi-step analysis. Small models are typically priced at a fraction of large ones, so the savings can be substantial, but verify quality on your own tasks before switching.
Handle abort signals: Implement cancellation for when users navigate away or stop generation. The SDK's hooks handle the client side via the stop function, but you should also clean up on the server, for example by passing the request's abort signal to streamText through its abortSignal option.
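The exponential backoff mentioned in the error-handling practice above can be sketched as follows. The base delay, cap, and attempt count are illustrative assumptions, and `backoffDelay` and `withRetry` are hypothetical helpers; a production version would usually add random jitter and retry only errors known to be transient (such as HTTP 429 or 503).

```typescript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at maxMs.
function backoffDelay(attempt: number, baseMs = 500, maxMs = 8_000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Retry an async operation on failure, sleeping between attempts.
// `sleep` is injectable so tests do not have to wait in real time.
async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts = 4,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      // No sleep after the final attempt; fall through and rethrow instead.
      if (attempt < maxAttempts - 1) await sleep(backoffDelay(attempt));
    }
  }
  throw lastError;
}

// The delay schedule for the first six retries, in milliseconds.
const schedule = [0, 1, 2, 3, 4, 5].map((attempt) => backoffDelay(attempt));
```

A call site would wrap the model request, for example `withRetry(() => callModel(prompt))`, where `callModel` stands for whatever function performs the request.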

Common Pitfalls and Solutions

Pitfall	Impact	Solution
Exposing API keys in client code	Security breach, unauthorized usage	Keep keys server-side, use Server Actions
No error handling	Broken UX on API failures	Implement error boundaries and retry logic
Ignoring streaming	Poor perceived performance	Use `streamText` and `toDataStreamResponse`
No rate limiting	Cost overruns, abuse	Implement rate limiting per user/IP
Wrong model for the task	Poor quality or high cost	Match model to task complexity
No timeout handling	Hanging requests	Set appropriate maxDuration
Ignoring token limits	Truncated responses	Monitor and manage context window size
No message persistence	Lost conversations on reload	Store messages in a database
Sending full history every request	Growing costs, latency	Implement message windowing or summarization
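The message-windowing fix in the last row can be sketched as a function that keeps system messages and then as many of the most recent messages as fit within a token budget. The four-characters-per-token estimate is a rough assumption for illustration (a real tokenizer would be more accurate), and `estimateTokens` and `windowMessages` are hypothetical names.

```typescript
interface Message { role: 'system' | 'user' | 'assistant'; content: string }

// Rough heuristic (an assumption): about four characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keep all system messages plus the most recent messages that fit in the budget.
function windowMessages(messages: Message[], maxTokens: number): Message[] {
  const system = messages.filter((message) => message.role === 'system');
  const rest = messages.filter((message) => message.role !== 'system');
  let used = system.reduce((sum, message) => sum + estimateTokens(message.content), 0);
  const kept: Message[] = [];
  // Walk backwards from the newest message until the budget is exhausted.
  for (const message of [...rest].reverse()) {
    const cost = estimateTokens(message.content);
    if (used + cost > maxTokens) break;
    kept.unshift(message);
    used += cost;
  }
  return [...system, ...kept];
}

const history: Message[] = [
  { role: 'system', content: 'Be concise.' }, // 3 estimated tokens
  { role: 'user', content: 'a'.repeat(40) }, // 10 estimated tokens
  { role: 'assistant', content: 'b'.repeat(40) }, // 10 estimated tokens
  { role: 'user', content: 'c'.repeat(20) }, // 5 estimated tokens
];
// With a budget of 20, the oldest user message no longer fits and is dropped.
const trimmed = windowMessages(history, 20);
```

Summarization is the complementary approach: instead of dropping the oldest messages, replace them with a short model-generated summary.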

Debugging AI SDK Issues

When issues arise, check these common sources: API key configuration, model availability, message format compatibility, and network connectivity. For deeper inspection, enable the SDK's OpenTelemetry integration by passing experimental_telemetry: { isEnabled: true } to your AI functions. With a tracing backend configured, the recorded spans show request and response details, token usage, and latency.

Performance Optimization

Optimize AI SDK performance by choosing the right model for each task (use smaller, faster models for simple tasks), implementing response caching for repeated queries, and using streaming to reduce perceived latency. Use the Edge Runtime for lower cold-start times on Vercel, and implement connection pooling for database-backed applications.

For high-traffic applications, add request queuing so that bursts do not exceed your provider's rate limits. Use the SDK's built-in abort functionality to cancel unnecessary requests when users navigate away. Consider implementing a token budget system that tracks usage per user and switches to cheaper models when approaching limits.
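A per-user token budget of the kind described above might look like the following sketch. The thresholds and model ids are illustrative assumptions, usage is held in memory (a real deployment would persist it and reset it daily), and `TokenBudget` and `BudgetPolicy` are hypothetical names.

```typescript
interface BudgetPolicy {
  dailyLimit: number; // maximum tokens per user per day
  downgradeAt: number; // fraction of the limit at which to switch models
  primaryModel: string;
  fallbackModel: string;
}

class TokenBudget {
  private readonly used = new Map<string, number>();
  private readonly policy: BudgetPolicy;

  constructor(policy: BudgetPolicy) {
    this.policy = policy;
  }

  // Record tokens consumed by a finished request (e.g. from an onFinish callback).
  record(userId: string, tokens: number): void {
    this.used.set(userId, (this.used.get(userId) ?? 0) + tokens);
  }

  // Pick a model id for the next request, or null when the budget is exhausted.
  selectModel(userId: string): string | null {
    const spent = this.used.get(userId) ?? 0;
    if (spent >= this.policy.dailyLimit) return null;
    if (spent >= this.policy.dailyLimit * this.policy.downgradeAt) {
      return this.policy.fallbackModel;
    }
    return this.policy.primaryModel;
  }
}

const budget = new TokenBudget({
  dailyLimit: 10_000,
  downgradeAt: 0.8,
  primaryModel: 'gpt-4o',
  fallbackModel: 'gpt-4o-mini',
});
const beforeUsage = budget.selectModel('user-1');
budget.record('user-1', 8_500);
const nearLimit = budget.selectModel('user-1');
budget.record('user-1', 2_000);
const overLimit = budget.selectModel('user-1');
```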

Comparison with Alternatives

Feature	Vercel AI SDK	LangChain.js	Custom Integration
Provider Abstraction	★★★★★	★★★★★	★★
React Integration	★★★★★	★★★	★★
Streaming Support	★★★★★	★★★★	★★★
Type Safety	★★★★★	★★★	★★★★
Learning Curve	Low	Medium	High
Bundle Size	Small	Large	Minimal
Best For	React/Next.js apps	Complex chains	Full control

LangChain.js offers more pre-built chains and integrations for complex AI workflows, but its larger bundle size and steeper learning curve make it less ideal for straightforward React applications. Custom integrations give maximum control but require significant boilerplate for streaming, error handling, and provider abstraction. The AI SDK strikes the best balance for most React and Next.js projects.

Advanced Patterns

Multi-Agent Orchestration

Use the AI SDK to orchestrate multiple specialized agents. Each agent handles a specific domain (search, code, analysis), and a router agent decides which to invoke based on the user's request. The maxSteps parameter in streamText enables multi-step reasoning where the LLM can call tools, evaluate results, and decide on next actions autonomously.

import { openai } from '@ai-sdk/openai';
import { streamText, tool } from 'ai';

// `messages` is the conversation history, e.g. parsed from the request body
const result = streamText({
  model: openai('gpt-4o'),
  system: `You are a router agent. Analyze the user's request and use the appropriate tool.`,
  tools: {
    researchAgent: tool({ /* ... */ }),
    codeAgent: tool({ /* ... */ }),
    dataAgent: tool({ /* ... */ }),
  },
  maxSteps: 10, // Allow complex multi-step reasoning
  messages,
});

Real-Time Collaboration

Build collaborative AI experiences where multiple users interact with the same AI session. Use the SDK's streaming capabilities with WebSocket or Server-Sent Events for real-time updates. Combine useChat with a shared state store (like Liveblocks or PartyKit) to synchronize messages across clients.

Custom UI Components

Build custom UI components that render AI-specific content: code blocks with syntax highlighting, interactive charts from structured data, image galleries from generated images, and streaming markdown with progressive rendering. Use the SDK's message.parts array to render different content types (text, tool calls, tool results) with specialized components.
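Rendering by part type amounts to a switch over a discriminated union. The sketch below returns strings so it stays self-contained; in a React app each branch would return a specialized component instead. The `MessagePart` shape here is a simplified assumption, and the real part types differ between SDK versions, so check the types exported by the version you use.

```typescript
// Simplified, hypothetical part shapes; the SDK's actual types vary by version.
type MessagePart =
  | { type: 'text'; text: string }
  | { type: 'tool-call'; toolName: string; args: unknown }
  | { type: 'tool-result'; toolName: string; result: unknown };

// Map each part to its rendering; the compiler checks that every type is handled.
function renderPart(part: MessagePart): string {
  switch (part.type) {
    case 'text':
      return part.text;
    case 'tool-call':
      return `[calling ${part.toolName}(${JSON.stringify(part.args)})]`;
    case 'tool-result':
      return `[${part.toolName} returned ${JSON.stringify(part.result)}]`;
  }
}

const parts: MessagePart[] = [
  { type: 'text', text: 'Checking the forecast.' },
  { type: 'tool-call', toolName: 'getWeather', args: { location: 'Paris' } },
  { type: 'tool-result', toolName: 'getWeather', result: { temperatureC: 18 } },
  { type: 'text', text: 'It is 18 degrees in Paris.' },
];
const rendered = parts.map((part) => renderPart(part));
```

Adding a new part type to the union without adding a matching case makes `renderPart` fail to type-check, which is the main benefit of this structure.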

Future Outlook

The Vercel AI SDK is evolving toward full-stack AI development — covering not just chat and text generation but also embeddings, fine-tuning, evaluation, and deployment. The goal is a single toolkit that covers every aspect of building AI-powered applications. Recent additions include the ToolLoopAgent for autonomous agent workflows, improved RAG utilities, and better support for multi-modal inputs.

The most significant trend is AI-native UI patterns — interfaces designed specifically for AI interaction rather than adapting traditional UI patterns. This includes streaming-first layouts, progressive disclosure of AI reasoning, and interactive tool results that users can explore and modify. The SDK's architecture makes it straightforward to implement these patterns without fighting the framework.

Conclusion

The Vercel AI SDK is one of the most productive ways to build AI-powered React applications. Its unified provider abstraction, streaming-first design, and deep React integration remove much of the boilerplate and complexity of LLM integration, letting you focus on building great user experiences.

Key takeaways:

The AI SDK abstracts LLM provider differences behind a unified, type-safe API
Streaming is built-in and should be used by default for better UX
Use useChat for conversational interfaces, useCompletion for text generation
Implement tool calling with the tools parameter and Zod schemas
Use generateObject for structured output that is validated against your schema
Keep API keys server-side using Server Actions or Route Handlers
Match models to tasks — use smaller models for simple operations, larger for complex reasoning

Start by building a simple chat interface using useChat and a Route Handler. Once comfortable, add tool calling for interactive capabilities and structured output for data extraction. The AI SDK's incremental adoption path means you can start simple and add complexity as needed.

Minh Vo
