
Vercel AI SDK vs LangChain vs Raw API Calls 2026

By the APIScout Team

TL;DR

Vercel AI SDK if you're building a UI with streaming. Raw API calls if you know exactly what you want. LangChain only for complex multi-step agentic workflows. In 2026, most developers are abandoning LangChain for simpler solutions — the abstraction cost is high and the framework changes constantly. For most production apps: Vercel AI SDK handles 80% of use cases with a great DX, and for the other 20% (complex agent orchestration), vendor-native SDKs or the Claude/OpenAI Agents APIs are better than LangChain.

The three options in this comparison are not equally mature or applicable. Vercel AI SDK is genuinely production-ready and actively developed by a well-resourced company. LangChain JS has a reputation for instability — major APIs change between minor versions, documentation lags the code, and the abstraction layers make debugging difficult. Raw API calls are the foundation everything else is built on and remain the most stable option. If you're starting a new project in 2026, default to Vercel AI SDK for UI work and raw API calls for everything else; reach for LangChain only when you specifically need its graph workflow features (via LangGraph).

Key Takeaways

  • Vercel AI SDK: best DX for streaming UIs, useChat/useCompletion hooks, works with 20+ providers, TypeScript-first
  • LangChain: powerful but heavy, frequently breaking changes, better alternatives exist in 2026
  • Raw API calls: maximum control, zero dependency, best for simple use cases or custom integrations
  • LangChain alternatives: LangGraph (just the graph part), Mastra, or vendor Agents SDKs (Anthropic, OpenAI)
  • When to use each: UI streaming → Vercel AI SDK, simple completion → raw, complex agents → vendor SDK or Mastra

The Problem With Abstraction Layers

Before comparing, understand the tradeoff:

More abstraction:
  + Less code to write
  + Handles streaming, retries, format normalization
  - Harder to debug
  - Dependency on framework churn
  - Limited access to provider-specific features
  - Version incompatibilities

Less abstraction:
  + Full control
  + Stable APIs (provider APIs change slowly)
  + Easier to debug
  - More boilerplate
  - Manual streaming handling
  - No provider switching

The question isn't "which is best" — it's where your use case sits on this spectrum.


Vercel AI SDK: The Right Level of Abstraction

Vercel AI SDK hits the sweet spot: it handles the hard parts (streaming, provider normalization, React integration) without hiding what's happening.

Core: streamText + React Hooks

// app/api/chat/route.ts — Server-side streaming:
import { streamText } from 'ai';
import { openai } from '@ai-sdk/openai';

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = streamText({
    model: openai('gpt-4o'),
    messages,
    system: 'You are a helpful assistant.',
    onFinish: async ({ text, usage }) => {
      // Called when stream completes — save to DB, log usage:
      await db.conversation.create({
        data: {
          content: text,
          inputTokens: usage.promptTokens,
          outputTokens: usage.completionTokens,
        },
      });
    },
  });

  return result.toDataStreamResponse();
}
// components/Chat.tsx — Client-side with useChat hook:
'use client';
import { useChat } from '@ai-sdk/react';

export function Chat() {
  const { messages, input, handleInputChange, handleSubmit, isLoading } = useChat({
    api: '/api/chat',
    onError: (error) => console.error('Chat error:', error),
    onFinish: (message) => console.log('Done:', message),
  });

  return (
    <div className="flex flex-col h-screen">
      <div className="flex-1 overflow-y-auto p-4 space-y-4">
        {messages.map((m) => (
          <div key={m.id} className={m.role === 'user' ? 'text-right' : 'text-left'}>
            <div className={`inline-block p-3 rounded-lg max-w-[80%] ${
              m.role === 'user' ? 'bg-blue-500 text-white' : 'bg-gray-100'
            }`}>
              {m.content}
            </div>
          </div>
        ))}
        {isLoading && <div className="text-gray-400">Thinking...</div>}
      </div>

      <form onSubmit={handleSubmit} className="p-4 border-t flex gap-2">
        <input
          value={input}
          onChange={handleInputChange}
          placeholder="Type a message..."
          className="flex-1 border rounded p-2"
        />
        <button type="submit" disabled={isLoading} className="px-4 py-2 bg-blue-500 text-white rounded">
          Send
        </button>
      </form>
    </div>
  );
}

Provider Switching — The Killer Feature

// Change one line to switch providers:
import { openai, createOpenAI } from '@ai-sdk/openai';  // createOpenAI for custom endpoints
import { anthropic } from '@ai-sdk/anthropic';
import { google } from '@ai-sdk/google';
import { groq } from '@ai-sdk/groq';

// Or point createOpenAI at any OpenAI-compatible endpoint (Groq also works this way):
const groqProvider = createOpenAI({
  apiKey: process.env.GROQ_API_KEY,
  baseURL: 'https://api.groq.com/openai/v1',
});

// A/B test models:
const model = Math.random() > 0.5
  ? openai('gpt-4o')
  : anthropic('claude-3-5-sonnet-20241022');

const result = streamText({ model, messages });

Structured Output with Zod

import { generateObject } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

const ProductSchema = z.object({
  name: z.string(),
  price: z.number().min(0),
  category: z.enum(['electronics', 'clothing', 'food', 'other']),
  inStock: z.boolean(),
  tags: z.array(z.string()).max(5),
});

const { object } = await generateObject({
  model: openai('gpt-4o'),
  schema: ProductSchema,
  prompt: 'Extract product info from: "Nike Air Max 90 sneakers, $120, currently available"',
});
// object is fully typed as z.infer<typeof ProductSchema>
// No JSON.parse, no validation needed

Tool Calling

import { streamText, tool } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

const result = streamText({
  model: openai('gpt-4o'),
  messages,
  tools: {
    getWeather: tool({
      description: 'Get current weather for a location',
      parameters: z.object({
        location: z.string().describe('City name'),
        unit: z.enum(['celsius', 'fahrenheit']).optional(),
      }),
      execute: async ({ location, unit = 'celsius' }) => {
        // Your actual weather API call:
        const data = await fetchWeather(location, unit);
        return data;
      },
    }),
    searchDatabase: tool({
      description: 'Search the product database',
      parameters: z.object({
        query: z.string(),
        limit: z.number().default(5),
      }),
      execute: async ({ query, limit }) => {
        return await db.products.findMany({
          where: { name: { contains: query } },
          take: limit,
        });
      },
    }),
  },
  maxSteps: 5,  // Allow up to 5 tool call rounds
});

LangChain: Powerful but Complicated

LangChain (Python and JS) was the dominant LLM framework from 2022 to 2024. By 2026, most teams have either moved away from it or use only small parts of it.

The LangChain DX Problem

// What LangChain code often looks like:
import { ChatOpenAI } from '@langchain/openai';
import { HumanMessage, SystemMessage } from '@langchain/core/messages';
import { StringOutputParser } from '@langchain/core/output_parsers';
import { ChatPromptTemplate } from '@langchain/core/prompts';
import { RunnableSequence } from '@langchain/core/runnables';

const model = new ChatOpenAI({ modelName: 'gpt-4o', temperature: 0.7 });
const parser = new StringOutputParser();

const promptTemplate = ChatPromptTemplate.fromMessages([
  ['system', 'You are a helpful assistant.'],
  ['human', '{input}'],
]);

const chain = RunnableSequence.from([promptTemplate, model, parser]);
const result = await chain.invoke({ input: 'Hello' });

// vs Vercel AI SDK:
const { text } = await generateText({
  model: openai('gpt-4o'),
  prompt: 'Hello',
  system: 'You are a helpful assistant.',
});

The LangChain version needs five imports instead of two and roughly three times the code, and it is no more capable for this use case.

Where LangChain Still Makes Sense (LangGraph)

LangGraph (a LangChain subproject) is legitimately good for multi-agent graph workflows:

// LangGraph: complex state machines with branching, loops, parallelism
import { StateGraph, Annotation, END, START } from '@langchain/langgraph';
import { ChatOpenAI } from '@langchain/openai';

const StateAnnotation = Annotation.Root({
  messages: Annotation<string[]>({
    reducer: (x, y) => x.concat(y),
  }),
  nextAction: Annotation<string>(),
});

const graph = new StateGraph(StateAnnotation)
  .addNode('researcher', async (state) => {
    // Research agent
    const response = await researcherModel.invoke(state.messages);
    return { messages: [response.content], nextAction: 'writer' };
  })
  .addNode('writer', async (state) => {
    // Writer agent
    const response = await writerModel.invoke(state.messages);
    return { messages: [response.content], nextAction: 'END' };
  })
  .addEdge(START, 'researcher')
  .addConditionalEdges('researcher', (state) =>
    state.nextAction === 'writer' ? 'writer' : END
  )
  .addEdge('writer', END)
  .compile();

Use LangGraph when you specifically need: graph-based state machines, human-in-the-loop, branching agent logic with cycles.


Raw API Calls: Maximum Control

For simple use cases, nothing beats direct API calls.

// Pure fetch — no dependencies:
async function chatCompletion(messages: { role: string; content: string }[]) {
  const response = await fetch('https://api.openai.com/v1/chat/completions', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.OPENAI_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model: 'gpt-4o',
      messages,
      temperature: 0.7,
      max_tokens: 1024,
    }),
  });

  if (!response.ok) {
    throw new Error(`API error: ${response.status} ${await response.text()}`);
  }

  return response.json();
}
// Raw streaming with fetch:
async function* streamCompletion(prompt: string) {
  const response = await fetch('https://api.openai.com/v1/chat/completions', {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model: 'gpt-4o',
      messages: [{ role: 'user', content: prompt }],
      stream: true,
    }),
  });

  if (!response.ok || !response.body) {
    throw new Error(`API error: ${response.status}`);
  }

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;

    // Decode with stream: true so multi-byte characters split across chunks
    // are handled correctly, and buffer the trailing partial line — SSE
    // events can be split across network chunks:
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split('\n');
    buffer = lines.pop() ?? '';

    for (const line of lines) {
      if (!line.startsWith('data: ')) continue;
      const data = line.slice(6);
      if (data === '[DONE]') return;
      try {
        const parsed = JSON.parse(data);
        const delta = parsed.choices[0]?.delta?.content;
        if (delta) yield delta;
      } catch {
        // Ignore keep-alive lines and malformed fragments
      }
    }
  }
}

// Usage:
// for await (const chunk of streamCompletion('Hello')) process.stdout.write(chunk);

Raw API calls make sense when:

  • You have one specific use case that won't change
  • You're writing a CLI tool or script
  • You want zero runtime dependencies
  • You're integrating with Cloudflare Workers where bundle size matters
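One thing the SDKs give you for free is retry handling; with raw calls you write it yourself. A minimal sketch of a generic retry-with-exponential-backoff wrapper — the function name and delay values are our own convention, not from any SDK:

```typescript
// Generic retry with exponential backoff for transient API failures.
// `fn` is any async operation; retries on thrown errors up to `maxRetries`.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxRetries = 3,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt === maxRetries) break;
      // Exponential backoff: 500ms, 1000ms, 2000ms, ...
      const delay = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}

// Usage: const data = await withRetry(() => chatCompletion(messages));
```

In production you would also check for HTTP 429/5xx specifically and honor the `Retry-After` header rather than retrying every error blindly.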

Comparison: When to Use What

Scenario → Recommendation (Why)

  • Chat UI in Next.js → Vercel AI SDK (useChat handles SSE, state, retries)
  • Single API call in a script → Raw fetch (zero dependencies)
  • Switching between 3+ providers → Vercel AI SDK (one abstraction, all providers)
  • Structured data extraction → Vercel AI SDK generateObject (Zod integration)
  • Multi-step research agent → Mastra or a vendor Agents SDK (better DX than LangChain)
  • RAG pipeline → Vercel AI SDK + vector DB (embed() + retrieval)
  • Complex graph workflows → LangGraph (that's what it's designed for)
  • Cloudflare Workers edge → Raw API or Workers AI (bundle size matters)
  • Enterprise audit logging → Raw API (full visibility into every request)

LangChain Alternatives in 2026

If you need more than Vercel AI SDK but less than LangChain's complexity:

// Mastra — TypeScript-first agent framework:
import { Mastra, createTool } from '@mastra/core';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

const weatherTool = createTool({
  id: 'get-weather',
  description: 'Get weather for a location',
  inputSchema: z.object({ location: z.string() }),
  execute: async ({ context }) => {
    return await fetchWeather(context.location);
  },
});

// Assumes `mastra` is a configured Mastra instance with a registered 'researcher' agent:
const researchAgent = mastra.getAgent('researcher');
const result = await researchAgent.generate('What is the weather in Tokyo?', {
  tools: { weatherTool },
});
// OpenAI Agents SDK (for OpenAI-specific agent workflows):
import { Agent, run } from '@openai/agents';

// Assumes salesAgent and supportAgent are Agent instances defined elsewhere:
const triage = new Agent({
  name: 'Triage Agent',
  instructions: 'Route to the right specialist.',
  handoffs: [salesAgent, supportAgent],
});

const result = await run(triage, 'I want to cancel my subscription');

Observability and Debugging

How you debug AI applications differs dramatically between the three approaches.

Vercel AI SDK observability: The SDK exposes a telemetry system via the experimental_telemetry option on streamText and generateText. Pass an OpenTelemetryTracer and every LLM call becomes a span with attributes for model, token usage, latency, and completion status. This integrates with Langfuse, Helicone, and any OpenTelemetry-compatible backend. For simpler needs, the onFinish callback (shown in the example above) fires after every completion with { text, usage, finishReason } — log or save these to build your own usage dashboard. The tradeoff: observability is an opt-in feature, and teams that skip it often discover they have no visibility into AI costs until they receive a surprising invoice.
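To make the onFinish pattern concrete, here is a sketch of a small in-process usage ledger you could feed from those callbacks. The class and field names are our own, not SDK APIs:

```typescript
// Minimal usage ledger: call record() from each onFinish callback,
// then summarize() for per-model token totals.
type UsageEvent = { model: string; promptTokens: number; completionTokens: number };

class UsageLedger {
  private events: UsageEvent[] = [];

  record(event: UsageEvent): void {
    this.events.push(event);
  }

  // Aggregate token counts per model — the raw material for a cost dashboard.
  summarize(): Record<string, { promptTokens: number; completionTokens: number }> {
    const totals: Record<string, { promptTokens: number; completionTokens: number }> = {};
    for (const e of this.events) {
      totals[e.model] ??= { promptTokens: 0, completionTokens: 0 };
      totals[e.model].promptTokens += e.promptTokens;
      totals[e.model].completionTokens += e.completionTokens;
    }
    return totals;
  }
}
```

In a real app you would flush these events to a database or logging backend rather than keep them in memory.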

LangChain/LangGraph observability: LangChain has deep LangSmith integration — every chain, agent, and LLM call is automatically traced if you set LANGCHAIN_TRACING_V2=true. LangSmith shows the full execution tree: which chains ran, which tools were called, what the intermediate states were. For complex multi-step pipelines this is genuinely valuable. The cost: LangSmith is a paid product and tightly coupled to the LangChain ecosystem. If you migrate away from LangChain, you lose LangSmith's visibility.

Raw API calls observability: You build it yourself. The upside is that you have full control over what's logged and how. A standard pattern: wrap every LLM call in a function that logs {requestId, model, inputTokens, outputTokens, latencyMs, finishReason} as a structured JSON object. Feed this to Datadog, Grafana, or your existing logging infrastructure. This is more work to set up but gives you data in the same system as your other application metrics — no additional vendor to manage.
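A sketch of that wrapper pattern, with an injected call function so it works against any provider. All field names here are a convention of ours, not a standard:

```typescript
// Wrap an LLM call so every invocation emits one structured log record.
// `callFn` stands in for your actual API call; `log` defaults to stdout JSON.
type LlmLog = {
  requestId: string;
  model: string;
  inputTokens: number;
  outputTokens: number;
  latencyMs: number;
  finishReason: string;
};

type LlmResult = {
  usage: { prompt_tokens: number; completion_tokens: number };
  finishReason: string;
};

async function loggedCall(
  requestId: string,
  model: string,
  callFn: () => Promise<LlmResult>,
  log: (record: LlmLog) => void = (r) => console.log(JSON.stringify(r)),
): Promise<LlmResult> {
  const start = Date.now();
  const result = await callFn();
  log({
    requestId,
    model,
    inputTokens: result.usage.prompt_tokens,
    outputTokens: result.usage.completion_tokens,
    latencyMs: Date.now() - start,
    finishReason: result.finishReason,
  });
  return result;
}
```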

Migrating from LangChain

If you're on LangChain and considering migrating, the path depends on which LangChain features you're using.

For simple chains (prompt template → LLM → output parser), the migration to Vercel AI SDK is usually a few hours. The patterns map directly: ChatPromptTemplate becomes a template string, StringOutputParser becomes generateText with no additional parsing needed, and RunnableSequence becomes plain async function composition. The result is less code that's easier to read and debug.
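To show how small that replacement can be: a prompt template plus model call as plain functions. The `llm` parameter stands in for the SDK's generateText call and is injected so the composition is testable; the helper names are our own:

```typescript
// A RunnableSequence of (prompt template → model → parser) collapses into
// a template string and one async function. No output parser needed — the
// model call already returns a string.
function buildPrompt(input: string): string {
  return `You are a helpful assistant.\nUser: ${input}`;
}

async function answer(
  input: string,
  llm: (prompt: string) => Promise<string>,
): Promise<string> {
  return llm(buildPrompt(input));
}
```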

For LangChain's retrieval chains and RAG features, migrate to Vercel AI SDK's embed() function combined with your vector database's native client. LangChain's retrievers are convenient but add a significant abstraction layer over vector database clients that are already developer-friendly (Pinecone, Qdrant, and Weaviate all have excellent TypeScript SDKs). Eliminating LangChain's retrieval abstraction often reveals performance issues hidden by the framework.
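The retrieval step that LangChain's retriever abstracts is, at its core, a similarity ranking over embeddings. A dependency-free sketch of that logic — in practice your vector database does this server-side, and the embeddings would come from something like the AI SDK's embed():

```typescript
// Rank documents by cosine similarity to a query embedding.
// Embeddings here are plain number arrays.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the ids of the k most similar documents.
function topK(
  query: number[],
  docs: { id: string; embedding: number[] }[],
  k: number,
): string[] {
  return [...docs]
    .sort((x, y) => cosineSimilarity(query, y.embedding) - cosineSimilarity(query, x.embedding))
    .slice(0, k)
    .map((d) => d.id);
}
```

Seeing the logic laid out this plainly is part of the argument: the abstraction being removed was never doing much.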

For LangGraph-specific features (state machines, loops, human-in-the-loop), the alternatives are less clear-cut. LangGraph has no direct equivalent in Vercel AI SDK. Consider: Mastra if you want a TypeScript-native framework; the Anthropic or OpenAI Agents SDKs if you're committed to a single provider; or custom orchestration code using plain async/await for simpler multi-step workflows. LangGraph's persistence features (checkpointing agent state to resume later) are genuinely useful for long-running agents and have no simple substitute.
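For simpler multi-step workflows, the "custom orchestration with plain async/await" option looks like this. The researcher/writer steps mirror the LangGraph example earlier; here they are injected functions standing in for model calls:

```typescript
// A two-step researcher → writer workflow in plain async/await — no graph
// framework. Each step reads the accumulated messages and says what runs next.
type Step = (state: string[]) => Promise<{ messages: string[]; next: 'writer' | 'END' }>;

async function runWorkflow(
  input: string,
  researcher: Step,
  writer: Step,
): Promise<string[]> {
  let state: string[] = [input];
  let result = await researcher(state);
  state = state.concat(result.messages);
  // Conditional edge: researcher decides whether the writer runs.
  if (result.next === 'writer') {
    result = await writer(state);
    state = state.concat(result.messages);
  }
  return state;
}
```

What this sketch does not give you is LangGraph's checkpointing; if you need to suspend and resume agent state, that is the feature worth keeping the framework for.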

Performance and Bundle Size

For client-side or edge deployments, bundle size is a practical concern.

LangChain JS is large — the full langchain package is over 1MB unpacked, though tree-shaking reduces this significantly if you're only using specific chains. The @langchain/core package is smaller but still requires careful tree-shaking discipline. For Cloudflare Workers (which have a 25MB bundle size limit including all dependencies), LangChain's transitive dependencies can become a real issue.

Vercel AI SDK is more modular: the ai package core is lightweight, and provider-specific packages (@ai-sdk/anthropic, @ai-sdk/openai) are imported separately. If you're only using one provider, you import only that provider's bundle. The React hooks (@ai-sdk/react) are client-only code and are excluded from server bundles automatically in Next.js.

Raw API calls have zero bundle impact — you're just using fetch. For Cloudflare Workers, Deno, or other edge runtimes where bundle size is tightly constrained, raw API calls or a very thin SDK wrapper is the right choice.

Cost Management Across All Three Approaches

AI API costs are unpredictable if you're not tracking them from day one. All three approaches require different strategies for cost visibility.

Vercel AI SDK: The onFinish callback on every streamText and generateText call includes usage.promptTokens and usage.completionTokens. Log these with the model name and compute cost in your callback. Over time, these logs become your cost dashboard: which features consume the most tokens, which prompts are inefficient, which user segments drive disproportionate spend. If you're using multiple providers through the SDK, normalize to a common cost metric (e.g., dollars per 1000 tokens) for fair comparison.

LangChain: LangSmith (the managed observability platform) tracks token usage automatically when tracing is enabled. Without LangSmith, token counting requires adding callbacks to each chain. The complexity of LangChain pipelines can make cost attribution difficult — a single user action might trigger multiple LLM calls across different chains, each billed separately. If cost visibility is a priority and you're evaluating LangChain, factor in LangSmith's subscription cost ($0 for the free tier, paid plans for higher volume) as part of the total.

Raw API calls: Cost tracking is straightforward — every API response includes usage in the response body. Aggregate prompt_tokens, completion_tokens, and compute cost on every call. The challenge is attribution: mapping API costs to the features or users that caused them. Build this attribution into your logging from the start: log {featureId, userId, model, promptTokens, completionTokens, costUSD} as a structured event and you can slice costs any way you need.
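The cost computation itself is simple enough to sketch. The rates below are placeholders, not real prices — look up your provider's current pricing page:

```typescript
// Compute per-call USD cost from token counts. The rates here are
// illustrative placeholders, keyed by model name.
const RATES_PER_1K_TOKENS: Record<string, { input: number; output: number }> = {
  'example-model': { input: 0.0025, output: 0.01 },  // placeholder rates
};

function costUSD(model: string, promptTokens: number, completionTokens: number): number {
  const rate = RATES_PER_1K_TOKENS[model];
  // Fail loudly on unknown models so costs never silently go untracked:
  if (!rate) throw new Error(`No rate configured for ${model}`);
  return (promptTokens / 1000) * rate.input + (completionTokens / 1000) * rate.output;
}
```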

Regardless of which approach you choose, set up a cost anomaly alert (spending > 2x daily average triggers a notification) from the first week. AI cost overruns are fast — a prompt injection attack, an infinite loop in agent code, or a buggy feature that calls the API on every keystroke can exhaust a month's budget in hours. Most providers offer budget alerts in their dashboards; set one at 50% and another at 90% of your monthly limit so you have time to react before hitting the hard cap.
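The anomaly check itself is a one-liner over your daily spend history. A minimal sketch (function name and default threshold are ours):

```typescript
// Flag a cost anomaly when today's spend exceeds a multiple of the
// trailing daily average. Default threshold of 2x matches the rule of
// thumb of alerting at twice the daily average.
function isCostAnomaly(
  todaySpend: number,
  recentDailySpends: number[],
  threshold = 2,
): boolean {
  if (recentDailySpends.length === 0) return false;  // no baseline yet
  const avg = recentDailySpends.reduce((a, b) => a + b, 0) / recentDailySpends.length;
  return todaySpend > threshold * avg;
}
```

Run this from a scheduled job against your cost logs and wire the `true` case to a pager or Slack webhook.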

Methodology

  • Vercel AI SDK v4.x is the current major version as of early 2026 and introduced the generateObject and streamObject APIs for structured outputs.
  • The LangChain JS project (langchainjs) is separate from LangChain Python (langchain) and has historically lagged the Python version in feature completeness and stability; issues fixed in Python months earlier often appear in JS later.
  • Mastra (@mastra/core) reached v1.0 in late 2025 and is the most actively maintained TypeScript-native agent framework alternative to LangChain.
  • OpenAI's Agents SDK (the TypeScript version of the Python openai-agents library) was released in early 2026 and is primarily designed for OpenAI-native workflows; it uses the Responses API rather than the Chat Completions API.
  • Bundle sizes cited are approximate and vary significantly based on tree-shaking configuration and which subpackages are imported.


Compare AI SDKs and frameworks at APIScout.

Related: Vercel AI SDK vs LangChain: Building AI Apps in 2026, Vercel AI SDK vs AWS Bedrock SDK 2026, How AI Is Transforming API Design and Documentation
