How to Test API Integrations Without Hitting Production

Testing API integrations against live APIs is slow, expensive, flaky, and can produce real side effects. Send a test payment through Stripe's live API and you charge a real card. Hit OpenAI's API 10,000 times in CI and you spend real money. The solution: test without touching production.

The five strategies below together cover the full range of API integration scenarios. They are not mutually exclusive: a well-tested integration uses all five at different layers, because each has a different cost/coverage tradeoff. Mock-based tests are fast but can drift from reality; sandbox tests reflect real API behavior but are slow and require credentials; contract tests call the real API (usually its sandbox) but verify only the response schema, not business logic. Use them together: mocks for speed in CI, contract tests weekly to catch drift, and sandbox runs for full end-to-end confidence before shipping.

The Testing Pyramid for API Integrations

           ╱╲
          ╱  ╲     End-to-End (Sandbox)
         ╱    ╲    Real API, test environment
        ╱──────╲
       ╱        ╲   Contract Tests
      ╱          ╲  Verify API contract hasn't changed
     ╱────────────╲
    ╱              ╲ Integration Tests (Mocked)
   ╱                ╲ Mock server, recorded responses
  ╱──────────────────╲
 ╱                    ╲ Unit Tests
╱                      ╲ Pure functions, no API calls

Strategy 1: Sandbox Environments

Most major APIs provide test/sandbox environments:

Provider	Sandbox	How to Use
Stripe	Test mode	Use `sk_test_` keys instead of `sk_live_`
PayPal	Sandbox	sandbox.paypal.com, separate test accounts
Twilio	Magic numbers	Use specific test phone numbers
Auth0	Dev tenant	Separate tenant for testing
Plaid	Sandbox	sandbox.plaid.com, test credentials
DocuSign	Demo	demo.docusign.net
Square	Sandbox	Use sandbox application ID

// Environment-based API configuration
const config = {
  stripe: {
    apiKey: process.env.NODE_ENV === 'production'
      ? process.env.STRIPE_LIVE_KEY
      : process.env.STRIPE_TEST_KEY,
  },
  plaid: {
    baseUrl: process.env.NODE_ENV === 'production'
      ? 'https://production.plaid.com'
      : 'https://sandbox.plaid.com',
  },
};

// Stripe test mode — use test card numbers
// 4242424242424242 → Always succeeds
// 4000000000000002 → Always declines
// 4000000000009995 → Insufficient funds

When to use sandbox: End-to-end tests, integration tests that need realistic API behavior, manual QA.
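
One way to make the environment switch above harder to get wrong is a guard that refuses to run a non-production process with a live key. The sketch below is a minimal illustration, not part of the original configuration: `assertSafeStripeKey` and the `Env` type are hypothetical, and the check relies only on the `sk_test_` / `sk_live_` prefixes listed in the table. Other key types are not handled.

```typescript
// Hypothetical guard (an assumption, not a Stripe SDK function): rejects live
// secret keys outside production by checking the sk_test_ / sk_live_ prefixes.
type Env = 'production' | 'staging' | 'development' | 'test';

function assertSafeStripeKey(key: string, env: Env): string {
  const isLive = key.startsWith('sk_live_');
  const isTest = key.startsWith('sk_test_');

  if (!isLive && !isTest) {
    throw new Error('Unrecognized Stripe key prefix');
  }
  if (env !== 'production' && isLive) {
    throw new Error(`Refusing to use a live Stripe key in "${env}"`);
  }
  return key;
}
```

Calling this once at startup, for example around the `apiKey` value in the config object, turns a misconfigured CI secret into an immediate failure instead of a real charge.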

Strategy 2: API Mocking with MSW

Mock Service Worker (MSW) intercepts network requests and returns mock responses:

// mocks/handlers.ts
import { http, HttpResponse } from 'msw';

export const handlers = [
  // Mock Stripe customer creation
  http.post('https://api.stripe.com/v1/customers', async ({ request }) => {
    const body = await request.text();
    const params = new URLSearchParams(body);

    return HttpResponse.json({
      id: 'cus_test_123',
      email: params.get('email'),
      name: params.get('name'),
      created: Math.floor(Date.now() / 1000),
    });
  }),

  // Mock Stripe payment intent
  http.post('https://api.stripe.com/v1/payment_intents', async () => {
    return HttpResponse.json({
      id: 'pi_test_456',
      status: 'succeeded',
      amount: 2000,
      currency: 'usd',
    });
  }),

  // Mock error response
  http.post('https://api.stripe.com/v1/charges', async () => {
    return HttpResponse.json(
      { error: { type: 'card_error', code: 'card_declined', message: 'Your card was declined.' } },
      { status: 402 }
    );
  }),

  // Mock Resend email
  http.post('https://api.resend.com/emails', async () => {
    return HttpResponse.json({
      id: 'email_test_789',
    });
  }),
];

// mocks/server.ts
import { setupServer } from 'msw/node';
import { handlers } from './handlers';

export const server = setupServer(...handlers);

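// tests/setup.ts
import { server } from '../mocks/server';

// Fail on any request without a handler so an unmocked call cannot reach a real API
beforeAll(() => server.listen({ onUnhandledRequest: 'error' }));
afterEach(() => server.resetHandlers());
afterAll(() => server.close());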

// tests/payment.test.ts
import { server } from '../mocks/server';
import { http, HttpResponse } from 'msw';
import { processPayment } from '../src/payment';

test('successful payment', async () => {
  const result = await processPayment({
    amount: 2000,
    currency: 'usd',
    customerId: 'cus_123',
  });

  expect(result.status).toBe('succeeded');
  expect(result.id).toBe('pi_test_456');
});

test('handles card decline', async () => {
  // Override handler for this specific test
  server.use(
    http.post('https://api.stripe.com/v1/payment_intents', () => {
      return HttpResponse.json(
        { error: { code: 'card_declined' } },
        { status: 402 }
      );
    })
  );

  await expect(processPayment({ amount: 2000, currency: 'usd' }))
    .rejects.toThrow('Card declined');
});

test('retries on server error', async () => {
  let attempts = 0;

  server.use(
    http.post('https://api.stripe.com/v1/payment_intents', () => {
      attempts++;
      if (attempts < 3) {
        return HttpResponse.json({}, { status: 500 });
      }
      return HttpResponse.json({ id: 'pi_retry', status: 'succeeded' });
    })
  );

  const result = await processPayment({ amount: 2000, currency: 'usd' });
  expect(result.status).toBe('succeeded');
  expect(attempts).toBe(3);
});
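
The retry test above assumes that `processPayment` retries on 5xx responses, but that logic is not shown. A minimal sketch of what such a retry loop might look like follows. `sendWithRetry`, `ApiResponse`, and the three-attempt default are illustrative assumptions, and the transport is passed in as a function so the example runs without a network.

```typescript
// Hypothetical response shape for this sketch (not an MSW or Stripe type).
interface ApiResponse {
  status: number;
  body: unknown;
}

// Calls `send` until it returns a non-5xx response or the attempts run out.
// 4xx responses are returned immediately: retrying a declined card will not help.
async function sendWithRetry(
  send: () => Promise<ApiResponse>,
  maxAttempts = 3,
): Promise<ApiResponse> {
  if (maxAttempts < 1) {
    throw new RangeError('maxAttempts must be at least 1');
  }

  let response = await send();
  for (let attempt = 2; attempt <= maxAttempts && response.status >= 500; attempt++) {
    // A production version would wait here (with backoff) before retrying
    response = await send();
  }
  // Either a non-5xx response or the last 5xx after all attempts were used
  return response;
}
```

Because the transport is injected, the same function can be exercised with a counter-based fake in unit tests and with MSW handlers in integration tests.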

Strategy 3: Record and Replay

Record real API responses once, replay them in tests:

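// Using Polly.js for record/replay
import { Polly } from '@pollyjs/core';
import NodeHTTPAdapter from '@pollyjs/adapter-node-http';
import FSPersister from '@pollyjs/persister-fs';
import Stripe from 'stripe';

Polly.register(NodeHTTPAdapter);
Polly.register(FSPersister);

// Test-mode client; a real key is only needed when recording (RECORD=true).
// The fallback string is a placeholder for replay runs, not a working key.
const stripe = new Stripe(process.env.STRIPE_TEST_KEY ?? 'sk_test_placeholder');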

describe('API integration', () => {
  let polly: Polly;

  beforeEach(() => {
    polly = new Polly('stripe-integration', {
      adapters: ['node-http'],
      persister: 'fs',
      persisterOptions: {
        fs: { recordingsDir: '__recordings__' },
      },
      recordIfMissing: process.env.RECORD === 'true',
      matchRequestsBy: {
        headers: false, // Don't match on auth headers
        body: true,
        url: { pathname: true, query: true },
      },
    });
  });

  afterEach(async () => {
    await polly.stop();
  });

  test('creates a customer', async () => {
    // First run with RECORD=true: hits real API, saves response
    // Subsequent runs: replays saved response
    const customer = await stripe.customers.create({
      email: 'test@example.com',
    });

    expect(customer.email).toBe('test@example.com');
  });
});

How it works:

1. Run tests with RECORD=true → hits real API, saves responses to disk
2. Run tests normally → replays saved responses (no network calls)
3. Re-record periodically to catch API changes
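
The `matchRequestsBy` option in the Polly.js example decides which parts of a request identify a recording. The idea can be illustrated with a small standalone function. This is a simplified sketch of the concept, not Polly.js's actual matching code, and `RecordedRequest` and `recordingKey` are hypothetical names.

```typescript
// Simplified illustration of request matching for record/replay.
// Not Polly.js internals: the type and function here are assumptions.
interface RecordedRequest {
  method: string;
  pathname: string;
  query: Record<string, string>;
  headers: Record<string, string>;
  body: string;
}

// Builds a lookup key from method, path, query, and body. Headers are left
// out on purpose so a rotated API key still matches the saved recording.
function recordingKey(req: RecordedRequest): string {
  const sortedQuery = Object.keys(req.query)
    .sort()
    .map((name) => `${name}=${req.query[name]}`)
    .join('&');

  return JSON.stringify([req.method.toUpperCase(), req.pathname, sortedQuery, req.body]);
}
```

Matching on the body means a test that changes its request payload will miss the recording and need a re-record, which is usually the behavior you want.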

Strategy 4: Contract Testing

Verify that the API contract hasn't changed:

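// Contract tests verify API shape, not business logic

import Stripe from 'stripe';
import { z } from 'zod';

// Test-mode client: contract tests call the real sandbox API
const stripe = new Stripe(process.env.STRIPE_TEST_KEY!);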

// Define expected API contracts
const StripeCustomerSchema = z.object({
  id: z.string().startsWith('cus_'),
  object: z.literal('customer'),
  email: z.string().email().nullable(),
  name: z.string().nullable(),
  created: z.number(),
  metadata: z.record(z.string()).optional(),
});

const StripeErrorSchema = z.object({
  error: z.object({
    type: z.string(),
    code: z.string().optional(),
    message: z.string(),
    param: z.string().optional(),
  }),
});

// Contract test — runs against sandbox
describe('Stripe API Contract', () => {
  test('customer creation returns expected shape', async () => {
    const customer = await stripe.customers.create({
      email: 'contract-test@example.com',
    });

    const result = StripeCustomerSchema.safeParse(customer);
    expect(result.success).toBe(true);

    // Cleanup
    await stripe.customers.del(customer.id);
  });

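  test('invalid request returns expected error shape', async () => {
    // Guard against a vacuous pass if the request unexpectedly succeeds
    expect.assertions(1);

    try {
      await stripe.customers.create({ email: 'not-an-email' });
    } catch (error: any) {
      // error.raw holds the error object from the API response body
      const result = StripeErrorSchema.safeParse({ error: error.raw });
      expect(result.success).toBe(true);
    }
  });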
});

When to run contract tests: In CI, weekly or on-demand. Not on every commit (they hit real APIs and are slow). A contract test failure is a signal to investigate — not necessarily a blocking failure. Triage it: did the API add a new optional field (safe to ignore), change a required field type (breaking), or remove a field your code reads (breaking)? Update your schemas and parsing code accordingly. The value of contract tests is early warning, not hard blocking.
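
The triage described above (additive versus breaking changes) can be partly automated. The following sketch compares a response against a hand-written map of the fields your code reads. `classifyDrift` and its types are hypothetical, and it checks only top-level fields with `typeof`, so treat it as a starting point rather than a replacement for schema validation.

```typescript
type FieldType = 'string' | 'number' | 'boolean' | 'object';

interface DriftReport {
  added: string[];       // fields the API returns that we do not read (usually safe)
  missing: string[];     // fields we read that are gone (breaking)
  typeChanged: string[]; // fields whose type differs from what we expect (breaking)
  breaking: boolean;
}

// Note: typeof null is 'object', so nullable fields need extra handling
// that this sketch leaves out.
function classifyDrift(
  expected: Record<string, FieldType>,
  actual: Record<string, unknown>,
): DriftReport {
  const added = Object.keys(actual)
    .filter((name) => !(name in expected))
    .sort();
  const missing: string[] = [];
  const typeChanged: string[] = [];

  for (const name of Object.keys(expected).sort()) {
    if (!(name in actual)) {
      missing.push(name);
    } else if (typeof actual[name] !== expected[name]) {
      typeChanged.push(name);
    }
  }

  return {
    added,
    missing,
    typeChanged,
    breaking: missing.length > 0 || typeChanged.length > 0,
  };
}
```

A scheduled contract job could log the `added` list as informational and fail only when `breaking` is true, which matches the "early warning, not hard blocking" stance above.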

Strategy 5: API Client Abstraction for Testing

Design your code so the API client is swappable:

import Stripe from 'stripe';

// Interface: your code depends on this
interface PaymentService {
  createCustomer(email: string): Promise<{ id: string; email: string }>;
  chargeCustomer(customerId: string, amount: number): Promise<{ id: string; status: string }>;
}

// Real implementation
class StripePaymentService implements PaymentService {
  constructor(private stripe: Stripe) {}

  async createCustomer(email: string) {
    const customer = await this.stripe.customers.create({ email });
    return { id: customer.id, email: customer.email! };
  }

  async chargeCustomer(customerId: string, amount: number) {
    const intent = await this.stripe.paymentIntents.create({
      customer: customerId,
      amount,
      currency: 'usd',
    });
    return { id: intent.id, status: intent.status };
  }
}

// Test implementation: no API calls
class MockPaymentService implements PaymentService {
  customers: Map<string, { id: string; email: string }> = new Map();
  charges: Array<{ id: string; customerId: string; amount: number }> = [];
  // Counter-based IDs are unique and deterministic; Date.now() can repeat within one millisecond
  private nextId = 1;

  async createCustomer(email: string) {
    const id = `cus_mock_${this.nextId++}`;
    const customer = { id, email };
    this.customers.set(id, customer);
    return customer;
  }

  async chargeCustomer(customerId: string, amount: number) {
    const id = `pi_mock_${this.nextId++}`;
    this.charges.push({ id, customerId, amount });
    return { id, status: 'succeeded' };
  }
}

// Tests use mock — fast, deterministic, no API calls
test('checkout flow', async () => {
  const payments = new MockPaymentService();
  const checkout = new CheckoutService(payments);

  const result = await checkout.processOrder({
    email: 'test@example.com',
    items: [{ id: 'prod_1', quantity: 1, price: 2000 }],
  });

  expect(result.status).toBe('succeeded');
  expect(payments.charges).toHaveLength(1);
  expect(payments.charges[0].amount).toBe(2000);
});
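
The checkout test above uses a `CheckoutService` that is never defined. One plausible shape is sketched below. The order format and the `processOrder` logic are assumptions inferred from the test, and the `PaymentService` interface is repeated so the block stands alone.

```typescript
interface PaymentService {
  createCustomer(email: string): Promise<{ id: string; email: string }>;
  chargeCustomer(customerId: string, amount: number): Promise<{ id: string; status: string }>;
}

// Assumed order shape; price is taken to be in the smallest currency unit
interface Order {
  email: string;
  items: Array<{ id: string; quantity: number; price: number }>;
}

// Hypothetical service: it depends only on the PaymentService interface,
// so tests can inject a mock and production can inject the Stripe-backed class.
class CheckoutService {
  private readonly payments: PaymentService;

  constructor(payments: PaymentService) {
    this.payments = payments;
  }

  async processOrder(order: Order): Promise<{ status: string; total: number }> {
    if (order.items.length === 0) {
      throw new Error('Order has no items');
    }
    const total = order.items.reduce((sum, item) => sum + item.price * item.quantity, 0);

    const customer = await this.payments.createCustomer(order.email);
    const charge = await this.payments.chargeCustomer(customer.id, total);
    return { status: charge.status, total };
  }
}
```

Nothing in this class imports an HTTP client or the Stripe SDK, which is what makes the mock-based test possible.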

Testing Strategy by Layer

Layer	What to Test	How	Speed
Unit tests	Business logic, data transformation	No mocking needed (pure functions)	Instant
Integration (mocked)	Request building, error handling, retries	MSW or dependency injection	Fast
Integration (recorded)	Real API behavior, response parsing	Record/replay (Polly.js)	Fast (replay)
Contract	API hasn't changed	Run against sandbox	Slow (real API)
E2E (sandbox)	Full flow works	Real sandbox API	Slow

Testing Webhooks

// Test webhook handler without waiting for real webhooks

test('handles payment_intent.succeeded webhook', async () => {
  const event = {
    id: 'evt_test_123',
    type: 'payment_intent.succeeded',
    data: {
      object: {
        id: 'pi_test_456',
        amount: 2000,
        status: 'succeeded',
        customer: 'cus_test_789',
      },
    },
  };

  // Generate valid test signature
  const payload = JSON.stringify(event);
  const signature = stripe.webhooks.generateTestHeaderString({
    payload,
    secret: WEBHOOK_SECRET,
  });

  const response = await app.inject({
    method: 'POST',
    url: '/webhooks/stripe',
    headers: {
      'stripe-signature': signature,
      'content-type': 'application/json',
    },
    body: payload,
  });

  expect(response.statusCode).toBe(200);

  // Verify side effects
  const order = await db.orders.findByPaymentIntent('pi_test_456');
  expect(order.status).toBe('paid');
});
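
For context on what the generated header contains: Stripe documents its signature scheme as an HMAC-SHA256 of `{timestamp}.{payload}` keyed with the endpoint secret, sent as `t=<timestamp>,v1=<hex digest>`. The sketch below reproduces that scheme with Node's `crypto` module to show why the raw body string must reach the handler unmodified. `signPayload` and `verifySignature` are hypothetical helpers for illustration only. They ignore timestamp tolerance and multiple `v1` entries, so real handlers should keep using `stripe.webhooks.constructEvent`.

```typescript
import { Buffer } from 'node:buffer';
import { createHmac, timingSafeEqual } from 'node:crypto';

// Illustrative only: builds a header in the t=...,v1=... format
function signPayload(payload: string, secret: string, timestamp: number): string {
  const digest = createHmac('sha256', secret)
    .update(`${timestamp}.${payload}`)
    .digest('hex');
  return `t=${timestamp},v1=${digest}`;
}

// Illustrative only: recomputes the digest over the exact payload string
function verifySignature(payload: string, header: string, secret: string): boolean {
  const parts = new Map<string, string>();
  for (const pair of header.split(',')) {
    const [key, value] = pair.split('=');
    if (key && value) parts.set(key, value);
  }
  const timestamp = parts.get('t');
  const received = parts.get('v1');
  if (!timestamp || !received) return false;

  const expected = createHmac('sha256', secret)
    .update(`${timestamp}.${payload}`)
    .digest('hex');

  const a = Buffer.from(received, 'hex');
  const b = Buffer.from(expected, 'hex');
  // timingSafeEqual throws on unequal lengths, so compare lengths first
  return a.length === b.length && timingSafeEqual(a, b);
}
```

Because the digest covers the exact bytes of the payload, a framework that parses and re-serializes JSON before the handler runs will break verification even when the data is identical.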

CI/CD Pipeline Configuration

# .github/workflows/test.yml
name: Tests

on:
  push:
  pull_request:
  schedule:
    - cron: '0 6 * * 1' # weekly, Mondays 06:00 UTC (example cadence)
  workflow_dispatch:

jobs:
  unit-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      # --filter stands in for your test runner's own selection flag
      - run: npm test -- --filter=unit

  integration-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npm test -- --filter=integration
      # Uses MSW mocks and recorded responses; no API keys needed

  contract-tests:
    runs-on: ubuntu-latest
    # Only run weekly or on explicit trigger
    if: github.event_name == 'schedule' || github.event_name == 'workflow_dispatch'
    env:
      STRIPE_TEST_KEY: ${{ secrets.STRIPE_TEST_KEY }}
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npm test -- --filter=contract

Designing for Testability

The biggest factor in how easy API integrations are to test isn't your test framework — it's how the integration code is written. Untestable integrations usually share a few characteristics: API calls are scattered across the codebase rather than centralized in a service layer, the HTTP client is imported and called directly in business logic, and there's no abstraction between the code that decides what to do and the code that makes the request.

The core design principle is dependency inversion: your business logic should depend on an interface, not a concrete API client. If checkoutService.processOrder() directly imports and calls stripe.paymentIntents.create(), testing it requires either mocking Stripe at the module level (fragile) or hitting the real API (slow). If instead checkoutService accepts a PaymentService interface (as shown in Strategy 5 above), you can inject a MockPaymentService in tests and the real StripePaymentService in production.

This pattern has a secondary benefit: when Stripe changes their API or you want to evaluate a competitor like Paddle or Lemon Squeezy, you swap the implementation rather than hunting through 30 call sites. For greenfield integrations, structure the service layer before writing tests — retrofitting abstraction onto tightly coupled code is significantly more expensive than designing for it upfront.

Seam points matter: identify exactly where your code crosses the boundary from "our system" to "external API" and ensure each seam is testable. For most integrations, the seams are: outbound HTTP requests (test with MSW or recorded responses), inbound webhooks (test by calling your handler directly with a test event), and scheduled jobs that poll APIs (test by injecting a fake API client into the job runner).

Maintaining Test Quality Over Time

API integration tests have a peculiar failure mode: they pass reliably for months, then suddenly fail when the API provider changes their response format. Unlike unit tests, which fail immediately when code changes, integration tests can go stale: your mocks may still reflect a response format from 18 months ago while the live API has evolved.

Refresh recorded responses periodically: If you use Polly.js or a similar record/replay library, schedule periodic re-recording. A good cadence: re-record when the provider announces API changes, when contract tests start failing, or on a quarterly schedule for stable APIs. Add a recorded-on date to recording files so you know when they were captured. Stale recordings older than 6 months for frequently-changing APIs (OpenAI, GitHub) are a reliability risk.
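
The recorded-on date suggested above is only useful if something checks it. A small helper like the following could run in CI and flag old recordings. `isRecordingStale`, the ISO 8601 date format, and the 180-day default (roughly the six months mentioned above) are assumptions for illustration.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns true when a recording is older than maxAgeDays relative to `now`.
// `recordedOn` is assumed to be an ISO 8601 timestamp stored with the recording.
function isRecordingStale(recordedOn: string, now: Date, maxAgeDays = 180): boolean {
  const recordedAt = Date.parse(recordedOn);
  if (Number.isNaN(recordedAt)) {
    throw new Error(`Invalid recorded-on date: ${recordedOn}`);
  }
  const ageDays = (now.getTime() - recordedAt) / MS_PER_DAY;
  return ageDays > maxAgeDays;
}
```

Passing `now` as a parameter rather than reading the clock inside the function keeps the helper itself deterministic and easy to test.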

Contract tests as early warning: Run contract tests on a weekly schedule in CI even if you don't run them on every commit. A contract test failure doesn't necessarily mean your integration is broken — the API may have added optional new fields that your code ignores — but it triggers a review. Investigate within 48 hours of a contract test failure: determine if the change is breaking, additive, or irrelevant, and update your mocks and parsing code accordingly.

Test the error path as rigorously as the happy path: The most common production incident pattern is: happy path works, but an error path (expired token, rate limited, service temporarily unavailable) hits code that was never tested and throws an unhandled exception. For every API call in your codebase, there should be at least one test that exercises what happens when that call returns a 4xx or 5xx. MSW makes this easy — override the default handler to return an error for that specific test.
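
A table-driven check is one compact way to cover the error cases listed above. The sketch below maps HTTP status codes to the handling category the calling code should apply. `classifyStatus` and the category names are hypothetical, and the mapping is a common convention rather than any provider's rule.

```typescript
type StatusCategory =
  | 'ok'
  | 'auth'
  | 'rate_limited'
  | 'client_error'
  | 'retryable'
  | 'unexpected';

function classifyStatus(status: number): StatusCategory {
  if (status >= 200 && status < 300) return 'ok';
  if (status === 401 || status === 403) return 'auth'; // e.g. expired token
  if (status === 429) return 'rate_limited';
  if (status >= 400 && status < 500) return 'client_error';
  if (status >= 500 && status < 600) return 'retryable'; // e.g. temporary outage
  return 'unexpected';
}

// Each row is one error path that deserves its own test case
const cases: Array<[number, StatusCategory]> = [
  [200, 'ok'],
  [401, 'auth'],
  [402, 'client_error'],
  [429, 'rate_limited'],
  [503, 'retryable'],
];

// Rows where the classification disagrees with the expectation
const failures = cases.filter(([status, expected]) => classifyStatus(status) !== expected);
```

In a real suite each row would drive an MSW handler override returning that status, with an assertion on how the integration code reacts.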

Separate slow and fast tests explicitly: Tag integration tests that hit real APIs (sandbox runs, or record/replay runs in recording mode) differently from pure unit tests and mock-based integration tests. Your CI pipeline should run fast tests on every commit (seconds) and slow tests on a schedule or on-demand (minutes). Developers who wait 10 minutes for tests to pass locally will start skipping them; keep the fast-feedback loop under 30 seconds. Jest supports this with the --testPathPattern and --testNamePattern flags; Vitest can split test configurations explicitly with workspaces.

Methodology

MSW (Mock Service Worker) v2.x uses the http and HttpResponse API shown in the examples above; the v1.x rest and ctx API was removed in v2. The setupServer function in msw/node works for Node.js test environments (Jest, Vitest, Node); the browser equivalent, setupWorker, is used for browser-based testing. Polly.js is Netflix's open-source record/replay library; with the node-http adapter used above it intercepts requests made through Node's http and https modules, and it works with test runners such as Jest and Mocha. The Zod-based contract testing approach requires Zod v3.x; the schema definitions validate shape and type but not business-logic invariants. Stripe's webhooks.generateTestHeaderString() method is available in stripe Node.js SDK v8+. The testing pyramid ordering reflects industry convention from Mike Cohn; the distribution of tests by layer (many unit, fewer integration, fewest E2E) holds for most API integration scenarios, though RAG and AI integrations often invert this due to the difficulty of unit-testing prompt behavior.

Common Mistakes

Mistake	Impact	Fix
Testing against live API in CI	Real side effects, costs money	Use sandbox or mocks
Mocking at wrong level	Tests pass but integration breaks	Mock at HTTP level, not function level
Not testing error cases	App crashes on first API error	Mock 4xx, 5xx, timeouts
Hard-coding mock responses	Tests pass when API changes	Use recorded responses, refresh periodically
No webhook testing	Webhook bugs found in production	Generate test events with signatures
Testing only happy path	Miss edge cases	Test rate limits, timeouts, malformed responses

