Stagehand vs Playwright: AI Browser Automation 2026

TL;DR

Use Playwright for anything deterministic, high-volume, or where you control the target site. It's free, fast, and reliable. Use Stagehand when automating sites you don't control, where selectors break, or where describing the action in English is faster than reverse-engineering the DOM. Each uncached AI action in Stagehand costs roughly $0.003–$0.01 with a GPT-4o-class model and adds 1–5 seconds of latency, but the auto-caching system means repeated workflows approach Playwright-native performance over time. The winning production pattern combines both: Playwright for known-selector flows, Stagehand for the ambiguous steps.

Key Takeaways

Playwright: deterministic, free, 60M+ weekly downloads, full browser control — the standard for e2e testing and structured scraping
Stagehand: AI-powered, built by Browserbase, adds act(), observe(), extract() to Playwright — now v3 with modular driver support beyond Playwright
Cost per AI action: ~$0.003–$0.01 (1–3K tokens per action with GPT-4o or Claude)
Auto-caching: Stagehand records successful selector paths and replays without LLM on repeat runs — cost drops toward zero after warmup
Stagehand v3: Removed hard Playwright dependency — now works with Puppeteer or any Chrome DevTools Protocol (CDP) driver
Browser Use: Full autonomous agent (different category) — Stagehand is AI-primitive layer, Browser Use is AI-agent layer

What Each Tool Does

Playwright

Playwright is Microsoft's browser automation library. It provides deterministic control over Chromium, Firefox, and WebKit via direct API calls:

import { chromium } from 'playwright';

const browser = await chromium.launch();
const page = await browser.newPage();

await page.goto('https://example.com/login');
await page.fill('#email', 'user@example.com');
await page.fill('#password', 'secret123');
await page.click('button[type="submit"]');
await page.waitForURL('**/dashboard');

const title = await page.textContent('h1.dashboard-title');
console.log(title);

await browser.close();

This works perfectly — as long as #email, #password, and button[type="submit"] exist and don't change. When they do change, your automation breaks and you need to re-engineer selectors.

Stagehand

Stagehand wraps a browser driver with AI methods that interpret natural-language instructions:

import { Stagehand } from '@browserbasehq/stagehand';

const stagehand = new Stagehand({
  env: 'LOCAL',
  modelName: 'gpt-4o',
  modelClientOptions: {
    apiKey: process.env.OPENAI_API_KEY,
  },
});

await stagehand.init();
const page = stagehand.page;

await page.goto('https://example.com/login');

// AI interprets the page and finds the right elements
await stagehand.act('Fill in the email field with user@example.com');
await stagehand.act('Fill in the password field with secret123');
await stagehand.act('Click the sign in button');

await page.waitForURL('**/dashboard');
const title = await stagehand.extract('Extract the dashboard page title');
console.log(title);

await stagehand.close();

The act() call sends the page's accessibility tree + a screenshot to an LLM, which identifies the correct element and returns an action command. Stagehand then executes that action through the underlying browser driver (Playwright in these examples).
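The round trip described above can be pictured with a small model. This is a conceptual sketch only, not Stagehand's internals: the `PageSnapshot`, `ActionCommand`, `ElementPicker`, and `Driver` types and the `resolveAndRun` function are hypothetical names invented for illustration, and the LLM is replaced by a trivial stub that matches on element names.

```typescript
// Conceptual model of an act() round trip. All names are hypothetical and
// chosen for illustration; they are not Stagehand's real types.
interface PageSnapshot {
  // Simplified stand-in for an accessibility tree: one entry per element.
  nodes: { selector: string; role: string; name: string }[];
}

interface ActionCommand {
  method: 'click' | 'fill';
  selector: string;
  value?: string;
}

// Stand-in for the LLM call: maps an instruction plus a snapshot to a command.
type ElementPicker = (
  instruction: string,
  snapshot: PageSnapshot,
) => Promise<ActionCommand>;

// Stand-in for the browser driver that performs the command.
type Driver = (command: ActionCommand) => Promise<void>;

async function resolveAndRun(
  instruction: string,
  snapshot: PageSnapshot,
  pick: ElementPicker,
  drive: Driver,
): Promise<ActionCommand> {
  const command = await pick(instruction, snapshot); // 1. the model chooses the element
  await drive(command); // 2. the driver executes the action
  return command;
}

// A trivial stub "model": picks the first element whose accessible name
// appears in the instruction text.
const stubPicker: ElementPicker = async (instruction, snapshot) => {
  const wanted = instruction.toLowerCase();
  const node = snapshot.nodes.find((n) => wanted.includes(n.name.toLowerCase()));
  if (!node) throw new Error(`No element matches: ${instruction}`);
  return { method: 'click', selector: node.selector };
};
```

Separating the "pick" step from the "drive" step is what lets the expensive model call be skipped later: once a command is known to work, only the driver step needs to run.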

Core Stagehand Methods

Three primitives cover most AI automation needs:

// act(): Perform an action on the page
await stagehand.act('Click the "New Project" button');
await stagehand.act('Select "TypeScript" from the language dropdown');
await stagehand.act('Fill the description textarea with "A SaaS project management tool"');

// observe(): Analyze the page without acting — returns observations
const observations = await stagehand.observe(
  'What actions can I take on this page?'
);
// Returns a list of candidate actions, e.g. 'Click "Create Project" button', 'Navigate to Settings', ...
// Useful for planning multi-step actions dynamically

// extract(): Extract structured data from the page
import { z } from 'zod';

const ProductSchema = z.object({
  name: z.string(),
  price: z.number(),
  inStock: z.boolean(),
  rating: z.number().optional(),
});

const product = await stagehand.extract({
  instruction: 'Extract the product details from this page',
  schema: ProductSchema,
});
// Returns: { name: 'Widget Pro', price: 49.99, inStock: true, rating: 4.2 }

The Auto-Caching System

Stagehand's caching is the feature that makes AI automation production-viable:

// First run: AI generates selector (costs LLM tokens)
await stagehand.act('Click the "Submit" button');
// → LLM call: ~1K tokens, ~5 seconds
// → Stagehand records: { action: 'click', selector: 'button#submit-order', cached: true }

// Second run: Stagehand replays cached selector (no LLM)
await stagehand.act('Click the "Submit" button');
// → Cache hit: ~50ms, $0 in AI costs

// If site updates and selector breaks:
// → Stagehand detects failure
// → Falls back to LLM to find new selector
// → Updates cache with new selector
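The replay-then-fallback behaviour sketched in the comments above can be modelled in a few lines. This is a hypothetical illustration, not Stagehand's implementation: `createCachedActor`, `Resolve`, and `Execute` are invented names, with `Resolve` standing in for the LLM call and `Execute` for the browser action.

```typescript
// Hypothetical sketch of a self-healing selector cache; not Stagehand's code.
type Resolve = (instruction: string) => Promise<string>; // stands in for the LLM call
type Execute = (selector: string) => Promise<void>; // stands in for the browser action

function createCachedActor(resolve: Resolve, execute: Execute) {
  const selectors = new Map<string, string>();
  let llmCalls = 0; // counts how often the expensive resolver ran

  async function act(instruction: string): Promise<'cache-hit' | 'llm'> {
    const cached = selectors.get(instruction);
    if (cached !== undefined) {
      try {
        await execute(cached); // replay without calling the model
        return 'cache-hit';
      } catch {
        selectors.delete(instruction); // stale selector: fall through to the model
      }
    }
    llmCalls += 1;
    const selector = await resolve(instruction);
    await execute(selector);
    selectors.set(instruction, selector); // record the selector that worked
    return 'llm';
  }

  return { act, llmCallCount: () => llmCalls };
}
```

In this model the cache key is the instruction text itself, so rewording an instruction counts as a cache miss. Whether a real cache keys on more than the instruction (URL, page state) is not something this sketch tries to answer.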

Configure caching for your deployment:

const stagehand = new Stagehand({
  env: 'LOCAL',
  modelName: 'gpt-4o',
  // Enable persistent caching
  cacheDir: '.stagehand-cache',
  // Log cache hits/misses for cost tracking
  verbose: 2,
});

// For Browserbase cloud (remote browser)
// Named differently so both configurations can live in one file
const remoteStagehand = new Stagehand({
  env: 'BROWSERBASE',
  apiKey: process.env.BROWSERBASE_API_KEY,
  projectId: process.env.BROWSERBASE_PROJECT_ID,
  // Cache is managed server-side
});

Playwright Inside Stagehand

Stagehand exposes the underlying Playwright page — use deterministic Playwright for what you know, AI for the unknown:

import { Stagehand } from '@browserbasehq/stagehand';
import { z } from 'zod';

const stagehand = new Stagehand({ env: 'LOCAL', modelName: 'claude-3-5-sonnet-20241022' });
await stagehand.init();
const { page } = stagehand;

// Use Playwright directly for navigation (deterministic, fast)
await page.goto('https://shop.example.com');
await page.waitForLoadState('networkidle');

// Use Stagehand AI for the ambiguous "add to cart" button
// (varies by product: "Add to Cart", "Buy Now", "Get It", etc.)
await stagehand.act('Add this product to the shopping cart');

// Back to Playwright for checkout (you control this flow)
// goto() needs an absolute URL unless a baseURL is configured
await page.goto('https://shop.example.com/cart');
await page.click('#checkout-btn');

// Stagehand to handle the payment form (many different layouts)
await stagehand.act('Fill in the credit card number 4242424242424242');
await stagehand.act('Fill in expiry 12/28 and CVV 123');
await stagehand.act('Submit the payment');

// Playwright to confirm result
await page.waitForSelector('.order-confirmation', { timeout: 10000 });
const orderId = await page.textContent('.order-id');
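One way to combine the two styles within a single step is to try the deterministic selector first and only pay for an AI call when it fails. The helper below is a minimal sketch of that idea. `withFallback` is a hypothetical name, not part of Playwright or Stagehand.

```typescript
// Try a fast deterministic step first; fall back to a slower AI step on failure.
// `withFallback` is a hypothetical helper written for this article.
async function withFallback<T>(
  primary: () => Promise<T>,
  fallback: () => Promise<T>,
): Promise<{ value: T; used: 'primary' | 'fallback' }> {
  try {
    return { value: await primary(), used: 'primary' };
  } catch {
    // Any failure of the primary step (missing selector, timeout) lands here.
    return { value: await fallback(), used: 'fallback' };
  }
}

// Example wiring, assuming `page` and `stagehand` from the snippet above:
//
// const result = await withFallback(
//   async () => { await page.click('#checkout-btn', { timeout: 2000 }); },
//   async () => { await stagehand.act('Click the checkout button'); },
// );
// console.log(result.used); // which path actually ran
```

Logging the `used` field over time gives a rough signal of how often known selectors are breaking, which helps decide whether a step should move permanently to one side or the other.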

Stagehand with Claude

Stagehand works with multiple model providers. Claude performs well for complex element identification:

import { Stagehand } from '@browserbasehq/stagehand';

const stagehand = new Stagehand({
  env: 'LOCAL',
  modelName: 'claude-3-5-sonnet-20241022',
  modelClientOptions: {
    apiKey: process.env.ANTHROPIC_API_KEY,
  },
});

// Claude's visual understanding handles complex layouts
await stagehand.act('Find the settings for email notifications and disable weekly digests');
// Claude reads the page + screenshot and identifies the correct nested settings

Cost comparison per action:

Model              | Cost per action (est) | Speed  | Notes
gpt-4o-mini        | $0.0003–$0.001        | Fast   | Good for simple selectors
gpt-4o             | $0.003–$0.01          | Medium | Best reliability
claude-3-5-sonnet  | $0.003–$0.009         | Medium | Strong visual reasoning
claude-3-haiku     | $0.0005–$0.002        | Fast   | Budget option
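Per-action estimates like these come from multiplying token counts by per-token prices. The sketch below shows the arithmetic. The prices in `examplePricing` and the output-token counts are placeholder assumptions for illustration, not quoted rates, so substitute your provider's current pricing and your own measured token counts.

```typescript
interface ModelPricing {
  inputPerMillion: number; // USD per 1M input tokens
  outputPerMillion: number; // USD per 1M output tokens
}

// Cost of one AI action given its token usage and the model's pricing.
function costPerAction(
  inputTokens: number,
  outputTokens: number,
  pricing: ModelPricing,
): number {
  return (
    (inputTokens / 1_000_000) * pricing.inputPerMillion +
    (outputTokens / 1_000_000) * pricing.outputPerMillion
  );
}

// Placeholder prices for illustration only; not a quote of any provider's rates.
const examplePricing: ModelPricing = { inputPerMillion: 2.5, outputPerMillion: 10 };

// 1K-3K input tokens per action, with small (assumed) output sizes.
const lowEstimate = costPerAction(1_000, 50, examplePricing);
const highEstimate = costPerAction(3_000, 150, examplePricing);
console.log(lowEstimate.toFixed(4), highEstimate.toFixed(4));
```

Because input tokens dominate, trimming what is sent to the model (a smaller accessibility tree, no screenshot where it is not needed) moves the estimate more than switching output limits does.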

Playwright: When It Wins

Playwright remains the right choice for these categories:

End-to-end test suites:

import { test, expect } from '@playwright/test';

test('user can create a project', async ({ page }) => {
  await page.goto('/dashboard');
  await page.click('[data-testid="new-project-btn"]');
  await page.fill('[data-testid="project-name"]', 'Test Project');
  await page.click('[data-testid="create-btn"]');

  await expect(page.locator('[data-testid="project-title"]'))
    .toHaveText('Test Project');
});

You own the data-testid attributes. No AI needed — Playwright handles this faster and for free.

High-volume structured scraping:

// Scraping 100K product pages — deterministic selectors, no AI needed
const products = [];
for (const url of productUrls) {
  await page.goto(url);
  products.push({
    name: await page.textContent('h1.product-title'),
    price: await page.textContent('[data-price]'),
    sku: await page.getAttribute('[data-sku]', 'data-sku'),
  });
}

At a minimum of $0.003 per action, even a single AI action per page across 100K pages costs at least $300 per run. Playwright at $0 wins for volume.

Stagehand: When It Wins

Automating third-party sites you don't control:

// Automating a competitor's checkout flow for price monitoring
// Their selectors change weekly — AI adapts, Playwright breaks
await stagehand.act('Find the "Add to Cart" button for the Pro plan');
await stagehand.act('Complete checkout with test card 4242...');

Multi-step workflows on unfamiliar UIs:

// Onboarding automation — each SaaS has different UI patterns
const platforms = ['hubspot', 'salesforce', 'pipedrive'];

for (const platform of platforms) {
  await page.goto(`https://${platform}.com/signup`);
  await stagehand.act('Fill in the sign-up form with the company details');
  await stagehand.act('Skip any onboarding tutorials');
  await stagehand.act('Navigate to the API settings or integrations page');
  const apiKey = await stagehand.extract({
    instruction: 'Get the API key or access token',
    schema: z.object({ apiKey: z.string() }),
  });
}

Resilient long-running automations: When you need an automation to run daily for months without manual maintenance, Stagehand's self-healing selector cache handles site updates automatically.

Browser Use: A Different Category

Browser Use is sometimes compared to Stagehand, but they're different:

Stagehand:     AI primitives layer — you write the workflow, AI handles element identification
Browser Use:   Full AI agent — the AI plans AND executes the entire workflow

Use Stagehand:  When you have a defined workflow and need reliable execution
Use Browser Use: When you describe the goal and want AI to figure out the steps

// Stagehand — you control the flow
await page.goto('https://app.example.com/reports');
await stagehand.act('Select last 30 days from the date filter');
await stagehand.act('Click Export CSV');

// Browser Use: AI controls the flow
// (Browser Use is primarily a Python library; TypeScript-style pseudocode for comparison)
const agent = new Agent({
  task: 'Export the last 30 days of sales data from app.example.com as CSV',
  llm: new ChatOpenAI({ modelName: 'gpt-4o' }),
});
await agent.run(); // AI navigates, finds reports, applies filter, exports

Cost Calculator

Estimating Stagehand costs for your use case:

Actions per workflow:  5
Runs per day:          100
Cache hit rate:        70% (after first week of warmup)
Model:                 gpt-4o ($0.005/action average)

Daily cost:
  Uncached actions: 100 × 5 × 0.30 = 150 actions/day
  Daily: 150 × $0.005 = $0.75/day
  Total monthly (30 days): ~$22.50

For 1,000 runs/day with same workflow:
  Uncached actions: 1,000 × 5 × 0.30 = 1,500 actions/day
  Daily: 1,500 × $0.005 = $7.50/day
  Monthly (30 days): ~$225

Cache warmup dramatically changes the economics — first week is most expensive, then costs plateau.
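The same estimate is easy to parameterize for your own workload. The following is a minimal sketch of the calculation above. The `WorkloadEstimate` shape and function names are made up for this example, and the 30-day month is an assumption.

```typescript
interface WorkloadEstimate {
  actionsPerWorkflow: number;
  runsPerDay: number;
  cacheHitRate: number; // fraction between 0 and 1
  costPerAction: number; // USD per uncached action
}

// Only uncached actions incur an LLM cost.
function dailyCost(w: WorkloadEstimate): number {
  const uncachedActions = w.actionsPerWorkflow * w.runsPerDay * (1 - w.cacheHitRate);
  return uncachedActions * w.costPerAction;
}

function monthlyCost(w: WorkloadEstimate, daysPerMonth = 30): number {
  return dailyCost(w) * daysPerMonth;
}

const baseline: WorkloadEstimate = {
  actionsPerWorkflow: 5,
  runsPerDay: 100,
  cacheHitRate: 0.7,
  costPerAction: 0.005,
};

console.log(dailyCost(baseline).toFixed(2)); // 0.75
console.log(monthlyCost(baseline).toFixed(2)); // 22.50
console.log(monthlyCost({ ...baseline, runsPerDay: 1000 }).toFixed(2)); // 225.00
```

Setting `cacheHitRate` to 0 models the warmup week: the baseline workload then costs $2.50 per day instead of $0.75.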

Comparison Table

Feature                     | Playwright                 | Stagehand
Price                       | Free (open source)         | AI costs per action
Speed per action            | ~50ms                      | 1–5 seconds (uncached) / 50ms (cached)
Selector resilience         | Breaks on site changes     | Self-healing via AI + cache
Natural language            | ❌                         | ✅ (act(), observe(), extract())
Structured extraction       | Manual selector logic      | extract() with Zod schema
Test framework              | Built-in (Playwright Test) | No built-in test framework
Visual/screenshot reasoning | ❌                         | ✅ (sends screenshots to LLM)
Self-healing                | ❌                         | ✅ (auto-caching)
Computer use                | ❌                         | Can integrate with OpenAI Computer Use
Cloud browsers              | Playwright cloud offerings | Browserbase native
Learning curve              | Medium                     | Low (English instructions)
Browser/driver support      | Chromium/Firefox/WebKit    | Any CDP driver (v3)

Debugging and Observability

Stagehand's AI actions are harder to debug than Playwright's deterministic selectors. When stagehand.act('Click the submit button') fails, the error doesn't tell you which element the AI tried to click or why it couldn't find the right element. Two debugging approaches:

The verbose: 2 log level outputs the full prompt sent to the LLM, the LLM's response (which element it identified and why), and the resulting Playwright action. Reading these logs makes it clear whether the AI correctly identified the right element but Playwright failed to click it (timing issue), or whether the AI selected the wrong element entirely (prompt ambiguity or page structure issue).

Stagehand's observe() method is also useful for debugging: call it before act() to see what the AI thinks are the available actions on the current page. If the AI doesn't list the action you want, your act() prompt needs revision or the page structure is confusing the visual reasoning.
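That pre-check can be wrapped in a small helper. The sketch below is hypothetical: `findCandidates` is an invented name, and it assumes each observed action exposes a text `description` field, so check the result type in your Stagehand version. The `observe` function is injected so the helper can be exercised without a browser.

```typescript
// Shape assumed for illustration; verify against your Stagehand version.
interface ObservedAction {
  description: string;
  selector?: string;
}

type Observe = (instruction: string) => Promise<ObservedAction[]>;

// Returns the observed actions whose description mentions every keyword.
async function findCandidates(
  observe: Observe,
  question: string,
  keywords: string[],
): Promise<ObservedAction[]> {
  const actions = await observe(question);
  const needles = keywords.map((k) => k.toLowerCase());
  return actions.filter((a) => {
    const text = a.description.toLowerCase();
    return needles.every((n) => text.includes(n));
  });
}

// Example wiring, assuming a `stagehand` instance as in the earlier snippets:
//
// const matches = await findCandidates(
//   (q) => stagehand.observe(q),
//   'What actions can I take on this page?',
//   ['submit'],
// );
// if (matches.length === 0) console.warn('AI does not see a submit action');
```

An empty result before a failing act() call points at the prompt or the page structure rather than at timing.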

For CI/CD integration, Playwright's built-in HTML report and trace viewer work with Stagehand since Stagehand uses Playwright under the hood. Enable tracing in your config to capture a full video, screenshots, and network log of every test run — when an AI action fails in CI, the trace file shows exactly what the page looked like when the action was attempted.
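A minimal tracing setup might look like the following. This is a sketch of a `playwright.config.ts`, written as a plain object so it has no imports; wrapping it in `defineConfig` from `@playwright/test` is the more common form. The specific values are one reasonable choice, not a recommendation from the Stagehand or Playwright teams.

```typescript
// playwright.config.ts (a sketch; adjust values to your suite)
const config = {
  reporter: 'html', // built-in HTML report
  retries: 1, // needed for 'on-first-retry' tracing to trigger
  use: {
    trace: 'on-first-retry', // record a trace when a failed test is retried
    video: 'retain-on-failure', // keep video only for failing tests
    screenshot: 'only-on-failure', // capture a screenshot at the point of failure
  },
};

export default config;
```

These options apply to pages that Playwright Test itself manages. Whether a browser launched by Stagehand is captured depends on how the two are wired together in your suite, which this sketch does not cover.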

CI/CD Integration

Stagehand adds latency to test runs that Playwright-only suites don't have. An uncached AI action takes 1–5 seconds, so a suite with 50 uncached AI actions spends roughly 50–250 seconds (up to about four minutes) on AI calls alone, before considering browser startup time. Two strategies for managing this in CI:

Warm the cache before CI runs. Run your Stagehand suite locally first to populate the selector cache, then commit the cache directory (.stagehand-cache) to your repository. CI reads from the committed cache, getting near-Playwright speeds for all cached actions. Only new or updated selectors pay the AI cost in CI.

Separate AI-backed tests from fast tests. Treat Stagehand tests as a separate test tier: slower, higher-level, and run less frequently. Run deterministic Playwright tests on every PR; run Stagehand integration tests nightly or pre-release. This mirrors the testing pyramid for AI-backed automation: fast, deterministic tests run frequently, and slow AI tests run as a final gate.
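To decide which tier a suite belongs in, it helps to estimate how much wall-clock time the AI actions add with a cold versus a warm cache. The sketch below uses the latency figures quoted in this article (1–5 seconds uncached, about 50ms cached). `SuiteProfile` and `estimateAiSeconds` are hypothetical names.

```typescript
interface SuiteProfile {
  aiActions: number; // total AI actions in the suite
  cacheHitRate: number; // fraction of AI actions served from cache (0 to 1)
  uncachedSeconds: number; // latency per uncached action
  cachedSeconds: number; // latency per cached action
}

// Time spent on AI actions only; browser startup and page loads are excluded.
function estimateAiSeconds(p: SuiteProfile): number {
  const cached = p.aiActions * p.cacheHitRate;
  const uncached = p.aiActions - cached;
  return uncached * p.uncachedSeconds + cached * p.cachedSeconds;
}

// Worst case from the text: 50 actions, nothing cached, 5 seconds each.
const coldSeconds = estimateAiSeconds({
  aiActions: 50,
  cacheHitRate: 0,
  uncachedSeconds: 5,
  cachedSeconds: 0.05,
});

// Same suite with a fully warmed, committed cache.
const warmSeconds = estimateAiSeconds({
  aiActions: 50,
  cacheHitRate: 1,
  uncachedSeconds: 5,
  cachedSeconds: 0.05,
});

console.log(coldSeconds, warmSeconds);
```

Under these figures the cold suite spends 250 seconds on AI actions and the warm suite about 2.5 seconds, which is the gap the committed-cache strategy is meant to close.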

Methodology

Stagehand version referenced: v3 (stable as of March 2026), which introduced the modular driver architecture removing the hard Playwright dependency. Playwright version: 1.44. Cost-per-action figures are estimates based on measured token counts (1,000-3,000 input tokens per action with a screenshot) and OpenAI GPT-4o/Anthropic pricing as of March 2026. Cache hit rate estimates based on Browserbase's published benchmarks for typical e-commerce automation workflows. Selector caching behavior and performance figures verified against the Stagehand GitHub repository's benchmark suite. Browser Use version referenced: 0.1.x (pre-1.0 as of writing). AI cost-per-action figures represent estimates for the typical page complexity of a modern SaaS application; simpler pages (fewer DOM elements, cleaner accessibility tree) produce shorter prompts and lower per-action costs, while complex pages with many interactive elements can exceed these estimates.

