Stagehand vs Playwright: AI Browser Automation 2026
TL;DR
Use Playwright for anything deterministic, high-volume, or where you control the target site. It's free, fast, and reliable. Use Stagehand when automating sites you don't control, where selectors break, or where describing the action in English is faster than reverse-engineering the DOM. Each AI action in Stagehand costs $0.003–$0.01 and adds latency — but the auto-caching system means repeated workflows approach Playwright-native performance over time. The winning production pattern combines both: Playwright for known-selector flows, Stagehand for the ambiguous steps.
Key Takeaways
- Playwright: deterministic, free, 60M+ weekly downloads, full browser control — the standard for e2e testing and structured scraping
- Stagehand: AI-powered, built by Browserbase, adds act(), observe(), and extract() to Playwright; now v3 with modular driver support beyond Playwright
- Cost per AI action: ~$0.003–$0.01 (1–3K tokens per action with GPT-4o or Claude)
- Auto-caching: Stagehand records successful selector paths and replays without LLM on repeat runs — cost drops toward zero after warmup
- Stagehand v3: Removed hard Playwright dependency — now works with Puppeteer or any Chrome DevTools Protocol (CDP) driver
- Browser Use: Full autonomous agent (different category) — Stagehand is AI-primitive layer, Browser Use is AI-agent layer
What Each Tool Does
Playwright
Playwright is Microsoft's browser automation library. It provides deterministic control over Chromium, Firefox, and WebKit via direct API calls:
import { chromium } from 'playwright';
const browser = await chromium.launch();
const page = await browser.newPage();
await page.goto('https://example.com/login');
await page.fill('#email', 'user@example.com');
await page.fill('#password', 'secret123');
await page.click('button[type="submit"]');
await page.waitForURL('**/dashboard');
const title = await page.textContent('h1.dashboard-title');
console.log(title);
await browser.close();
This works perfectly — as long as #email, #password, and button[type="submit"] exist and don't change. When they do change, your automation breaks and you need to re-engineer selectors.
Stagehand
Stagehand wraps a browser driver with AI methods that interpret natural-language instructions:
import { Stagehand } from '@browserbasehq/stagehand';
const stagehand = new Stagehand({
env: 'LOCAL',
modelName: 'gpt-4o',
modelClientOptions: {
apiKey: process.env.OPENAI_API_KEY,
},
});
await stagehand.init();
const page = stagehand.page;
await page.goto('https://example.com/login');
// AI interprets the page and finds the right elements
await stagehand.act('Fill in the email field with user@example.com');
await stagehand.act('Fill in the password field with secret123');
await stagehand.act('Click the sign in button');
await page.waitForURL('**/dashboard');
const title = await stagehand.extract('Extract the dashboard page title');
The act() call sends the page's accessibility tree and a screenshot to an LLM, which identifies the correct element and returns an action command. Stagehand then executes that action through the underlying browser driver (Playwright in this example).
Core Stagehand Methods
Three primitives cover most AI automation tasks:
// act(): Perform an action on the page
await stagehand.act('Click the "New Project" button');
await stagehand.act('Select "TypeScript" from the language dropdown');
await stagehand.act('Fill the description textarea with "A SaaS project management tool"');
// observe(): Analyze the page without acting — returns observations
const observations = await stagehand.observe(
'What actions can I take on this page?'
);
// Returns: ['Click "Create Project" button', 'Navigate to Settings', ...]
// Useful for planning multi-step actions dynamically
// extract(): Extract structured data from the page
import { z } from 'zod';
const ProductSchema = z.object({
name: z.string(),
price: z.number(),
inStock: z.boolean(),
rating: z.number().optional(),
});
const product = await stagehand.extract({
instruction: 'Extract the product details from this page',
schema: ProductSchema,
});
// Returns: { name: 'Widget Pro', price: 49.99, inStock: true, rating: 4.2 }
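Schema validation matters here because model output is untrusted until it has been checked. The sketch below shows the kind of check a schema performs, written as a hand-rolled type guard so it runs without dependencies. The Product type mirrors ProductSchema above; parseProduct and the sample payloads are hypothetical and are not part of Stagehand or Zod.

```typescript
// Hand-rolled stand-in for ProductSchema, for illustration only.
// In real code the Zod schema does this work for you.
interface Product {
  name: string;
  price: number;
  inStock: boolean;
  rating?: number;
}

function parseProduct(raw: unknown): Product | null {
  if (typeof raw !== 'object' || raw === null) return null;
  const record = raw as Record<string, unknown>;
  const name = record['name'];
  const price = record['price'];
  const inStock = record['inStock'];
  const rating = record['rating'];

  if (typeof name !== 'string') return null;
  if (typeof price !== 'number' || Number.isNaN(price)) return null;
  if (typeof inStock !== 'boolean') return null;
  if (rating !== undefined && typeof rating !== 'number') return null;

  const product: Product = { name, price, inStock };
  if (rating !== undefined) product.rating = rating;
  return product;
}

// A well-formed payload passes; a price returned as the string "$49.99" is rejected.
const validProduct = parseProduct({ name: 'Widget Pro', price: 49.99, inStock: true });
const invalidProduct = parseProduct({ name: 'Widget Pro', price: '$49.99', inStock: true });
console.log(validProduct, invalidProduct);
```

Rejecting a malformed payload early is usually cheaper than letting a string price flow into downstream arithmetic.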
The Auto-Caching System
Stagehand's caching is the feature that makes AI automation production-viable:
// First run: AI generates selector (costs LLM tokens)
await stagehand.act('Click the "Submit" button');
// → LLM call: ~1K tokens, ~5 seconds
// → Stagehand records: { action: 'click', selector: 'button#submit-order', cached: true }
// Second run: Stagehand replays cached selector (no LLM)
await stagehand.act('Click the "Submit" button');
// → Cache hit: ~50ms, $0 in AI costs
// If site updates and selector breaks:
// → Stagehand detects failure
// → Falls back to LLM to find new selector
// → Updates cache with new selector
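The replay-then-fallback flow described above can be sketched in plain TypeScript. This is a hypothetical illustration of the pattern, not Stagehand's implementation: SelectorCache, resolveWithModel, and runSelector are invented names, and the calls are synchronous for brevity where real model and browser calls would be asynchronous.

```typescript
// Hypothetical sketch of a self-healing selector cache.
// resolveWithModel stands in for the LLM call; runSelector stands in for the
// browser action and returns false when the selector no longer matches.
class SelectorCache {
  private readonly entries = new Map<string, string>();
  private readonly resolveWithModel: (instruction: string) => string;
  private readonly runSelector: (selector: string) => boolean;
  modelCalls = 0;

  constructor(
    resolveWithModel: (instruction: string) => string,
    runSelector: (selector: string) => boolean,
  ) {
    this.resolveWithModel = resolveWithModel;
    this.runSelector = runSelector;
  }

  act(instruction: string): string {
    const cached = this.entries.get(instruction);
    if (cached !== undefined && this.runSelector(cached)) {
      return cached; // cache hit: no model call
    }
    // Cache miss or stale selector: ask the model, then store what worked.
    this.modelCalls += 1;
    const fresh = this.resolveWithModel(instruction);
    if (!this.runSelector(fresh)) {
      throw new Error(`Could not perform: ${instruction}`);
    }
    this.entries.set(instruction, fresh);
    return fresh;
  }
}

// Simulate a site whose submit button id changes between runs.
let liveSelector = 'button#submit-order';
const cache = new SelectorCache(
  () => liveSelector, // the simulated model always finds the current selector
  (selector) => selector === liveSelector,
);

cache.act('Click the "Submit" button'); // first run: model call
cache.act('Click the "Submit" button'); // second run: cache hit
liveSelector = 'button#place-order';    // site update breaks the cached selector
cache.act('Click the "Submit" button'); // stale entry: falls back to the model
console.log(cache.modelCalls); // 2
```

Only the first run and the run after the simulated site change reach the model, which is the property that drives the cost curve discussed later.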
Configure caching for your deployment:
const stagehand = new Stagehand({
env: 'LOCAL',
modelName: 'gpt-4o',
// Enable persistent caching
cacheDir: '.stagehand-cache',
// Log cache hits/misses for cost tracking
verbose: 2,
});
// For Browserbase cloud (remote browser), use this configuration instead of the LOCAL one above
const cloudStagehand = new Stagehand({
env: 'BROWSERBASE',
browserbaseApiKey: process.env.BROWSERBASE_API_KEY,
browserbaseProjectId: process.env.BROWSERBASE_PROJECT_ID,
// Cache is managed server-side
});
Playwright Inside Stagehand
Stagehand exposes the underlying Playwright page — use deterministic Playwright for what you know, AI for the unknown:
import { Stagehand } from '@browserbasehq/stagehand';
import { z } from 'zod';
const stagehand = new Stagehand({ env: 'LOCAL', modelName: 'claude-3-5-sonnet-20241022' });
await stagehand.init();
const { page } = stagehand;
// Use Playwright directly for navigation (deterministic, fast)
await page.goto('https://shop.example.com');
await page.waitForLoadState('networkidle');
// Use Stagehand AI for the ambiguous "add to cart" button
// (varies by product: "Add to Cart", "Buy Now", "Get It", etc.)
await stagehand.act('Add this product to the shopping cart');
// Back to Playwright for checkout (you control this flow)
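// Use an absolute URL: relative paths only resolve when a baseURL is configured
await page.goto('https://shop.example.com/cart');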
await page.click('#checkout-btn');
// Stagehand to handle the payment form (many different layouts)
await stagehand.act('Fill in the credit card number 4242424242424242');
await stagehand.act('Fill in expiry 12/28 and CVV 123');
await stagehand.act('Submit the payment');
// Playwright to confirm result
await page.waitForSelector('.order-confirmation', { timeout: 10000 });
const orderId = await page.textContent('.order-id');
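One way to package the hybrid pattern is a small helper that tries a known selector first and only falls back to an AI instruction when the selector fails. The sketch below is an assumption about how you might structure this, not an API that Stagehand provides: clickWithFallback, ClickablePage, and ActCapable are hypothetical names, and the fakes exist only so the example runs without a browser.

```typescript
// Structural stand-ins; only the methods used here are declared.
interface ClickablePage {
  click(selector: string, options?: { timeout?: number }): Promise<void>;
}
interface ActCapable {
  act(instruction: string): Promise<unknown>;
}

// Try the deterministic selector first; pay for an AI action only if it fails.
async function clickWithFallback(
  page: ClickablePage,
  ai: ActCapable,
  selector: string,
  instruction: string,
): Promise<'selector' | 'ai'> {
  try {
    await page.click(selector, { timeout: 2000 });
    return 'selector';
  } catch {
    await ai.act(instruction);
    return 'ai';
  }
}

// Fakes for demonstration: the page only knows '#checkout-btn'.
const fakePage: ClickablePage = {
  click: async (selector) => {
    if (selector !== '#checkout-btn') {
      throw new Error(`No element matches ${selector}`);
    }
  },
};
const aiInstructions: string[] = [];
const fakeAi: ActCapable = {
  act: async (instruction) => {
    aiInstructions.push(instruction);
  },
};
```

In a real script you would pass the Playwright page and the Stagehand instance in place of the fakes. The return value tells you which path ran, which is useful for tracking how often the AI fallback is actually needed.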
Stagehand with Claude
Stagehand works with multiple model providers. Claude performs well for complex element identification:
import { Stagehand } from '@browserbasehq/stagehand';
const stagehand = new Stagehand({
env: 'LOCAL',
modelName: 'claude-3-5-sonnet-20241022',
modelClientOptions: {
apiKey: process.env.ANTHROPIC_API_KEY,
},
});
// Claude's visual understanding handles complex layouts
await stagehand.act('Find the settings for email notifications and disable weekly digests');
// Claude reads the page + screenshot and identifies the correct nested settings
Cost comparison per action:
| Model | Cost per action (est.) | Speed | Notes |
|---|---|---|---|
| gpt-4o-mini | $0.0003–$0.001 | Fast | Good for simple selectors |
| gpt-4o | $0.003–$0.01 | Medium | Best reliability |
| claude-3-5-sonnet | $0.003–$0.009 | Medium | Strong visual reasoning |
| claude-3-haiku | $0.0005–$0.002 | Fast | Budget option |
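Per-action estimates like these come from token counts multiplied by per-token prices. The sketch below shows that arithmetic. The prices in examplePricing and the 100-token output size are placeholder assumptions for illustration, not quoted rates; substitute your provider's current pricing. The 1,000 to 3,000 input-token range is the one this article uses.

```typescript
interface TokenPricing {
  inputPerMillion: number;  // USD per 1M input tokens
  outputPerMillion: number; // USD per 1M output tokens
}

function costPerAction(
  inputTokens: number,
  outputTokens: number,
  pricing: TokenPricing,
): number {
  const inputCost = inputTokens * pricing.inputPerMillion;
  const outputCost = outputTokens * pricing.outputPerMillion;
  return (inputCost + outputCost) / 1000000;
}

// Placeholder prices (assumption): replace with your provider's current rates.
const examplePricing: TokenPricing = { inputPerMillion: 2.5, outputPerMillion: 10 };

const lowEstimate = costPerAction(1000, 100, examplePricing);
const highEstimate = costPerAction(3000, 100, examplePricing);
console.log(lowEstimate.toFixed(4), highEstimate.toFixed(4)); // 0.0035 0.0085
```

With these placeholder rates the result lands inside the $0.003–$0.01 range, and it shows why pages with larger accessibility trees cost more: input tokens dominate.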
Playwright: When It Wins
Playwright remains the right choice for these categories:
End-to-end test suites:
import { test, expect } from '@playwright/test';
test('user can create a project', async ({ page }) => {
await page.goto('/dashboard');
await page.click('[data-testid="new-project-btn"]');
await page.fill('[data-testid="project-name"]', 'Test Project');
await page.click('[data-testid="create-btn"]');
await expect(page.locator('[data-testid="project-title"]'))
.toHaveText('Test Project');
});
You own the data-testid attributes. No AI needed — Playwright handles this faster and for free.
High-volume structured scraping:
// Scraping 100K product pages — deterministic selectors, no AI needed
const products = [];
for (const url of productUrls) {
await page.goto(url);
products.push({
name: await page.textContent('h1.product-title'),
price: await page.textContent('[data-price]'),
sku: await page.getAttribute('[data-sku]', 'data-sku'),
});
}
At a minimum of $0.003 per action, even one AI action per page across 100K pages costs $300 or more. Playwright at $0 wins for volume.
Stagehand: When It Wins
Automating third-party sites you don't control:
// Automating a competitor's checkout flow for price monitoring
// Their selectors change weekly — AI adapts, Playwright breaks
await stagehand.act('Find the "Add to Cart" button for the Pro plan');
await stagehand.act('Complete checkout with test card 4242...');
Multi-step workflows on unfamiliar UIs:
// Onboarding automation — each SaaS has different UI patterns
const platforms = ['hubspot', 'salesforce', 'pipedrive'];
for (const platform of platforms) {
await page.goto(`https://${platform}.com/signup`);
await stagehand.act('Fill in the sign-up form with the company details');
await stagehand.act('Skip any onboarding tutorials');
await stagehand.act('Navigate to the API settings or integrations page');
const apiKey = await stagehand.extract({
instruction: 'Get the API key or access token',
schema: z.object({ apiKey: z.string() }),
});
}
Resilient long-running automations: When you need an automation to run daily for months without manual maintenance, Stagehand's self-healing selector cache handles site updates automatically.
Browser Use: A Different Category
Browser Use is sometimes compared to Stagehand, but they operate at different layers:
Stagehand: AI primitives layer — you write the workflow, AI handles element identification
Browser Use: Full AI agent — the AI plans AND executes the entire workflow
Use Stagehand: When you have a defined workflow and need reliable execution
Use Browser Use: When you describe the goal and want AI to figure out the steps
// Stagehand — you control the flow
await page.goto('https://app.example.com/reports');
await stagehand.act('Select last 30 days from the date filter');
await stagehand.act('Click Export CSV');
// Browser Use: AI controls the flow (JavaScript-style pseudocode for comparison; Browser Use itself is a Python library)
const agent = new Agent({
task: 'Export the last 30 days of sales data from app.example.com as CSV',
llm: new ChatOpenAI({ modelName: 'gpt-4o' }),
});
await agent.run(); // AI navigates, finds reports, applies filter, exports
Cost Calculator
Estimating Stagehand costs for your use case:
Actions per workflow: 5
Runs per day: 100
Cache hit rate: 70% (after first week of warmup)
Model: gpt-4o ($0.005/action average)
Daily cost:
Uncached actions: 100 × 5 × 0.30 = 150 actions × $0.005 = $0.75/day
Total monthly: ~$22.50
For 1,000 runs/day with same workflow:
Uncached: 1,500 × $0.005 = $7.50/day
Monthly: ~$225
Cache warmup dramatically changes the economics — first week is most expensive, then costs plateau.
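The same arithmetic can be wrapped in a small function so you can plug in your own numbers. This is a minimal sketch under the article's assumptions (a steady cache hit rate, a flat average cost per uncached action, and a 30-day month); estimateWorkflowCost and its field names are hypothetical.

```typescript
interface WorkflowCostInput {
  actionsPerWorkflow: number;
  runsPerDay: number;
  cacheHitRate: number;  // fraction of actions served from cache, 0 to 1
  costPerAction: number; // average USD per uncached action
}

interface WorkflowCostEstimate {
  uncachedActionsPerDay: number;
  dailyCost: number;
  monthlyCost: number;
}

function estimateWorkflowCost(
  input: WorkflowCostInput,
  daysPerMonth: number = 30,
): WorkflowCostEstimate {
  const totalActions = input.actionsPerWorkflow * input.runsPerDay;
  const uncachedActionsPerDay = totalActions * (1 - input.cacheHitRate);
  const dailyCost = uncachedActionsPerDay * input.costPerAction;
  return { uncachedActionsPerDay, dailyCost, monthlyCost: dailyCost * daysPerMonth };
}

const smallWorkload = estimateWorkflowCost({
  actionsPerWorkflow: 5,
  runsPerDay: 100,
  cacheHitRate: 0.7,
  costPerAction: 0.005,
});
const largeWorkload = estimateWorkflowCost({
  actionsPerWorkflow: 5,
  runsPerDay: 1000,
  cacheHitRate: 0.7,
  costPerAction: 0.005,
});
console.log(smallWorkload.monthlyCost.toFixed(2), largeWorkload.monthlyCost.toFixed(2)); // 22.50 225.00
```

Setting cacheHitRate to 0 gives the cold-start cost for the first runs, before the cache has warmed up.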
Comparison Table
| Feature | Playwright | Stagehand |
|---|---|---|
| Price | Free (open source) | AI costs per action |
| Speed per action | ~50ms | 1–5 seconds (uncached) / 50ms (cached) |
| Selector resilience | Breaks on site changes | Self-healing via AI + cache |
| Natural language | ❌ | ✅ (act(), observe(), extract()) |
| Structured extraction | Manual selector logic | extract() with Zod schema |
| Test framework | Built-in (Playwright Test) | No built-in test framework |
| Visual/screenshot reasoning | ❌ | ✅ (sends screenshots to LLM) |
| Self-healing | ❌ | ✅ (auto-caching) |
| Computer use | ❌ | Can integrate with OpenAI Computer Use |
| Cloud browsers | Playwright cloud offerings | Browserbase native |
| Learning curve | Medium | Low (English instructions) |
| v3 driver | Chromium/Firefox/WebKit | Any CDP driver |
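The "Speed per action" row can be turned into a rough per-run latency estimate. The sketch below uses the table's figures (about 50 ms cached, 1–5 seconds uncached) as defaults; estimateRunSeconds and ActionTiming are hypothetical names, and the model ignores page loads and browser startup.

```typescript
interface ActionTiming {
  cachedSeconds: number;
  uncachedMinSeconds: number;
  uncachedMaxSeconds: number;
}

// Defaults taken from the comparison table above.
const TABLE_TIMING: ActionTiming = {
  cachedSeconds: 0.05,
  uncachedMinSeconds: 1,
  uncachedMaxSeconds: 5,
};

function estimateRunSeconds(
  actions: number,
  cacheHitRate: number,
  timing: ActionTiming = TABLE_TIMING,
): { min: number; max: number } {
  const cachedActions = actions * cacheHitRate;
  const uncachedActions = actions - cachedActions;
  const cachedTime = cachedActions * timing.cachedSeconds;
  return {
    min: cachedTime + uncachedActions * timing.uncachedMinSeconds,
    max: cachedTime + uncachedActions * timing.uncachedMaxSeconds,
  };
}

const coldRun = estimateRunSeconds(50, 0);   // nothing cached yet
const warmRun = estimateRunSeconds(50, 0.7); // 70% of actions replayed from cache
console.log(coldRun, warmRun);
```

A cold run of 50 AI actions comes out at 50 to 250 seconds of added latency, while a 70% hit rate brings the upper bound under 80 seconds.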
Debugging and Observability
Stagehand's AI actions are harder to debug than Playwright's deterministic selectors. When stagehand.act('Click the submit button') fails, the error doesn't tell you which element the AI tried to click or why it couldn't find the right element. Two debugging approaches:
The verbose: 2 log level outputs the full prompt sent to the LLM, the LLM's response (which element it identified and why), and the resulting Playwright action. Reading these logs makes it clear whether the AI correctly identified the right element but Playwright failed to click it (timing issue), or whether the AI selected the wrong element entirely (prompt ambiguity or page structure issue).
Stagehand's observe() method is also useful for debugging: call it before act() to see what the AI thinks are the available actions on the current page. If the AI doesn't list the action you want, your act() prompt needs revision or the page structure is confusing the visual reasoning.
For CI/CD integration, Playwright's built-in HTML report and trace viewer work with Stagehand when Stagehand is driving a Playwright page. Enable tracing in your config to capture DOM snapshots, screenshots, and a network log for each test run; when an AI action fails in CI, the trace file shows exactly what the page looked like when the action was attempted.
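A minimal Playwright Test config with tracing enabled might look like the following. It is a sketch that assumes the Stagehand-driven steps run against a page managed by the Playwright Test runner; a browser that Stagehand launches on its own would not be covered by these settings. The specific values (one retry, traces kept only on failure) are illustrative choices.

```typescript
// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  // One retry helps separate flaky AI actions from consistent failures.
  retries: 1,
  // Write the HTML report without opening a browser, which suits CI.
  reporter: [['html', { open: 'never' }]],
  use: {
    // Keep the trace only for failed tests to limit artifact size.
    trace: 'retain-on-failure',
    screenshot: 'only-on-failure',
  },
});
```

A retained trace can then be opened locally with `npx playwright show-trace` followed by the path to the trace file.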
CI/CD Integration
Stagehand adds latency to test runs that Playwright-only suites don't have. An uncached AI action takes 1–5 seconds, so a suite with 50 uncached AI actions adds roughly 1 to 4 minutes (50 × 1–5 s) on top of browser startup time. Two strategies for managing this in CI:
Warm the cache before CI runs. Run your Stagehand suite locally first to populate the selector cache, then commit the cache directory (.stagehand-cache) to your repository. CI reads from the committed cache, getting near-Playwright speeds for all cached actions. Only new or updated selectors pay the AI cost in CI.
Separate AI-backed tests from fast tests. Treat Stagehand tests as a separate test tier: slower, higher-level, and run less frequently. Run Playwright-only tests on every PR; run Stagehand integration tests nightly or pre-release. This mirrors the testing pyramid for AI-backed automation: fast, deterministic tests run frequently, and slow, AI-backed tests run as a final gate.
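One way to implement the two tiers is with Playwright Test projects that filter on a tag in the test title. The sketch below assumes a hypothetical convention of putting `@ai` in the title of every Stagehand-backed test; the project names and the 120-second timeout are illustrative.

```typescript
// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  projects: [
    // Fast tier: everything that is NOT tagged @ai. Run on every PR.
    { name: 'fast', grepInvert: /@ai/ },
    // AI tier: only @ai-tagged tests, with a longer timeout and one retry.
    { name: 'ai', grep: /@ai/, timeout: 120000, retries: 1 },
  ],
});
```

PR pipelines would then run `npx playwright test --project=fast`, and the nightly or pre-release job would run `npx playwright test --project=ai`.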
Methodology
Stagehand version referenced: v3 (stable as of March 2026), which introduced the modular driver architecture removing the hard Playwright dependency. Playwright version: 1.44. Cost-per-action figures are estimates based on measured token counts (1,000-3,000 input tokens per action with a screenshot) and OpenAI GPT-4o/Anthropic pricing as of March 2026. Cache hit rate estimates based on Browserbase's published benchmarks for typical e-commerce automation workflows. Selector caching behavior and performance figures verified against the Stagehand GitHub repository's benchmark suite. Browser Use version referenced: 0.1.x (pre-1.0 as of writing). AI cost-per-action figures represent estimates for the typical page complexity of a modern SaaS application; simpler pages (fewer DOM elements, cleaner accessibility tree) produce shorter prompts and lower per-action costs, while complex pages with many interactive elements can exceed these estimates.