The Global API Latency Map: Where Are APIs Fastest?
Your API might respond in 50ms from your office. But what about from São Paulo? Lagos? Jakarta? For global applications, latency isn't one number — it's a map. Geography determines whether your API feels instant or sluggish, and the difference between regions can be 10x.
How API Latency Works
The Physics Problem
Light travels through fiber optic cable at roughly 200,000 km/s — about two-thirds the speed of light in a vacuum. That means:
| Route | Distance | Fiber Minimum (one-way) | Realistic RTT |
|---|---|---|---|
| NYC → London | 5,500 km | 28ms | 70-80ms |
| NYC → Tokyo | 10,800 km | 54ms | 150-180ms |
| NYC → Sydney | 16,000 km | 80ms | 200-250ms |
| NYC → Mumbai | 12,500 km | 63ms | 180-220ms |
| London → Singapore | 10,800 km | 54ms | 160-200ms |
| São Paulo → Tokyo | 18,500 km | 93ms | 280-320ms |
These are theoretical floors. Real-world latency adds DNS lookup (10-50ms), the TLS handshake (30-100ms), server processing, and multiple network hops.
What Makes Up API Latency
Total latency = DNS + TCP + TLS + Network transit + Server processing + Response transfer
Example (NYC → London API call, ~80ms RTT):
DNS lookup: 10ms (cached) or 50ms (cold)
TCP handshake: 80ms (one RTT)
TLS handshake: 160ms (two RTTs for TLS 1.2; TLS 1.3 needs one)
Request transit: 40ms
Server processing: 20ms
Response transit: 40ms
─────────────────────
Total: ~390ms (cold) or ~100ms (warm, with connection reuse and cached DNS)
API Latency by Region
Cloud Region Performance (2026)
Average response times for a simple REST API (CRUD, <1KB response) from each region:
| Origin Region | To US East | To EU West | To Asia East | To Australia | To South America |
|---|---|---|---|---|---|
| US East (Virginia) | 5-15ms | 70-90ms | 150-200ms | 200-250ms | 120-160ms |
| US West (Oregon) | 60-80ms | 130-160ms | 100-140ms | 150-180ms | 150-180ms |
| EU West (Ireland) | 70-90ms | 5-15ms | 200-250ms | 280-320ms | 180-220ms |
| EU Central (Frankfurt) | 80-100ms | 10-20ms | 180-220ms | 260-300ms | 200-240ms |
| Asia East (Tokyo) | 150-200ms | 200-250ms | 5-15ms | 100-130ms | 280-320ms |
| Asia Southeast (Singapore) | 200-240ms | 160-200ms | 30-50ms | 80-120ms | 300-350ms |
| Australia (Sydney) | 200-250ms | 280-320ms | 100-130ms | 5-15ms | 300-350ms |
| South America (São Paulo) | 120-160ms | 180-220ms | 280-320ms | 300-350ms | 5-15ms |
The Multi-Region Gap
A single-region API has wildly different performance for global users:
API hosted in US East (Virginia):
User in New York: 15ms ✅
User in London: 80ms ⚠️ Noticeable
User in Tokyo: 180ms ❌ Feels slow
User in Sydney: 230ms ❌ Frustrating
User in São Paulo: 140ms ⚠️ Noticeable
Rule of thumb: Users notice latency above 100ms. Above 300ms feels broken.
How Major API Providers Perform
Global Latency Benchmarks
| Provider | Regions | Median Global Latency | P99 Latency | Edge PoPs |
|---|---|---|---|---|
| Cloudflare | 300+ cities | 15-30ms | 50-80ms | 300+ |
| AWS CloudFront | 60+ regions | 20-40ms | 60-100ms | 400+ |
| Google Cloud CDN | 40+ regions | 25-45ms | 70-120ms | 180+ |
| Fastly | 30+ regions | 15-35ms | 50-90ms | 90+ |
| Azure CDN | 120+ PoPs | 25-50ms | 80-130ms | 120+ |
API-Specific Provider Latency
| API Provider | Architecture | US Latency | EU Latency | Asia Latency | Global Avg |
|---|---|---|---|---|---|
| Stripe | Multi-region | 50-80ms | 50-80ms | 100-150ms | 80-100ms |
| Twilio | Multi-region | 60-100ms | 60-100ms | 120-180ms | 90-120ms |
| Auth0 | Multi-region | 80-120ms | 80-120ms | 150-250ms | 100-150ms |
| Algolia | Distributed search | 10-30ms | 10-30ms | 20-50ms | 15-35ms |
| OpenAI | US primary | 200-500ms | 250-600ms | 300-700ms | 300-600ms |
| Anthropic | US primary | 200-400ms | 250-500ms | 300-600ms | 250-500ms |
Note: AI API latency is dominated by inference time, not network latency. The 200-600ms is mostly compute, not geography.
Strategies for Reducing Global Latency
1. Deploy to Multiple Regions
// Multi-region deployment with geo-routing
// DNS routes users to the nearest region automatically

// vercel.json — deploys functions to the listed regions
{
  "regions": ["iad1", "cdg1", "hnd1", "syd1", "gru1"]
}

// Or with Cloudflare Workers — runs in 300+ cities
export default {
  async fetch(request: Request) {
    // This code runs in the city nearest the user;
    // getFromNearestDB is a placeholder for your data layer
    const data = await getFromNearestDB(request);
    return Response.json(data);
  },
};
2. Edge Caching
// Cache API responses at the edge — CDN serves from nearest PoP
export async function GET(request: Request) {
  const data = await fetchProducts(); // fetchProducts: your data layer
  return Response.json(data, {
    headers: {
      'Cache-Control': 'public, s-maxage=60, stale-while-revalidate=300',
      'CDN-Cache-Control': 'public, max-age=300',
      'Surrogate-Control': 'max-age=3600',
    },
  });
}
// Impact:
// Without cache: 200ms (origin in Virginia → user in Tokyo)
// With edge cache: 15ms (Tokyo PoP → user in Tokyo)
3. Edge Database Patterns
| Solution | Type | Read Latency | Write Latency | Consistency |
|---|---|---|---|---|
| Cloudflare D1 | SQLite at edge | <5ms | <10ms (local) | Eventual |
| Turso | Distributed SQLite | 5-15ms | 50-100ms | Strong |
| PlanetScale | Distributed MySQL | 10-30ms | 30-80ms | Strong |
| Neon | Serverless Postgres | 20-50ms | 30-80ms | Strong |
| DynamoDB Global Tables | Key-value | 5-10ms | 50-100ms | Eventual |
| CockroachDB | Distributed SQL | 10-30ms | 50-150ms | Strong |
4. Connection Optimization
// Reduce handshake overhead with connection reuse.
// Node's https.Agent keeps HTTP/1.1 connections alive; HTTP/2 goes
// further by multiplexing many requests over one connection.
import https from 'node:https';

const agent = new https.Agent({
  keepAlive: true,
  maxSockets: 10,
  keepAliveMsecs: 60000,
});
// Savings per request with warm connection:
// Cold: DNS(50) + TCP(80) + TLS(160) + Transit(80) = 370ms
// Warm: Transit(80) = 80ms (78% reduction)
5. Regional API Routing
// Route API calls to nearest provider region
const REGION_ENDPOINTS = {
  us: 'https://us.api.example.com',
  eu: 'https://eu.api.example.com',
  ap: 'https://ap.api.example.com',
};

function getNearestEndpoint(userRegion: string): string {
  if (['US', 'CA', 'MX'].includes(userRegion)) return REGION_ENDPOINTS.us;
  if (['GB', 'DE', 'FR', 'NL', 'IT', 'ES'].includes(userRegion)) return REGION_ENDPOINTS.eu;
  return REGION_ENDPOINTS.ap;
}
// Or use DNS-based routing (Route 53, Cloudflare Load Balancing)
// Automatically routes to nearest healthy endpoint
6. Prefetching and Preconnecting
<!-- Reduce perceived latency with resource hints -->
<!-- DNS prefetch — saves 50ms on first request -->
<link rel="dns-prefetch" href="https://api.example.com" />
<!-- Preconnect — saves 200ms+ (DNS + TCP + TLS) -->
<link rel="preconnect" href="https://api.example.com" />
<!-- Prefetch data — fetches before user needs it -->
<link rel="prefetch" href="https://api.example.com/v1/config" />
// Client-side: preconnect to the API on page load
if (typeof window !== 'undefined') {
  const link = document.createElement('link');
  link.rel = 'preconnect';
  link.href = 'https://api.example.com';
  document.head.appendChild(link);
}
Measuring API Latency
Tools
| Tool | What It Measures | Coverage |
|---|---|---|
| Pingdom | HTTP response time | 100+ locations |
| Uptrends | Full waterfall timing | 230+ checkpoints |
| Checkly | API monitoring from multiple regions | 20+ regions |
| Catchpoint | Network-level latency | 2,800+ nodes |
| ThousandEyes | Path analysis and latency | ISP-level visibility |
| Cloudflare Radar | Internet latency trends | Global |
DIY Global Latency Test
// Test your API from multiple regions using Cloudflare Workers
export default {
  async fetch(request: Request) {
    const endpoints = [
      { region: 'US East', url: 'https://us-east.api.example.com/health' },
      { region: 'EU West', url: 'https://eu-west.api.example.com/health' },
      { region: 'Asia', url: 'https://asia.api.example.com/health' },
    ];
    const results = await Promise.all(
      endpoints.map(async ({ region, url }) => {
        const start = performance.now();
        const response = await fetch(url);
        const latency = performance.now() - start;
        return { region, latency: Math.round(latency), status: response.status };
      })
    );
    // request.cf is populated by the Workers runtime
    return Response.json({
      measured_from: request.cf?.city,
      results,
    });
  },
};
Regional Optimization Playbook
| Your Users Are In | Deploy To | Cache At | Database |
|---|---|---|---|
| US only | us-east-1 | CloudFront US | RDS / PlanetScale |
| US + EU | us-east-1 + eu-west-1 | CloudFront | PlanetScale (multi-region) |
| Global | Edge (Workers/Vercel) | Every PoP | Turso / D1 / DynamoDB Global |
| Asia-focused | ap-northeast-1 + ap-southeast-1 | CloudFront AP | DynamoDB AP |
| Latency-critical | Every major region | Aggressive edge cache | Read replicas everywhere |
The Business Case for Latency Investment
Latency optimization is frequently framed as a technical concern, but the business case is quantitative and well-documented across industries. Amazon's internal research — widely cited in web performance literature — found that every 100ms of additional latency reduced revenue by approximately 1%. Google documented that a 500ms increase in search latency reduced searches by 20%. Akamai's research found 1-second delays reduce customer satisfaction by 16% and conversions by 7%.
For API-heavy applications, the relationship between latency and business outcomes is particularly direct. An API powering a checkout flow, a search experience, or a real-time collaborative tool creates a measurable line between API response time and revenue impact. A checkout API that adds 300ms of latency from button click to confirmation degrades conversion even when users don't consciously notice the delay.
The ROI calculation for latency investment is straightforward: measure current conversion rate by latency bucket, quantify how many users are experiencing elevated latency due to geographic distance, project the revenue impact of moving them from high-latency to low-latency, then compare against the infrastructure cost of a CDN layer or a second region. For most applications with meaningful global traffic, this math favors multi-region or edge deployment long before traffic volumes feel large enough to justify the operational complexity. A $200/month additional region that improves conversion for 20% of users typically pays for itself within a few weeks.
The counterargument is real: distributed infrastructure adds complexity, and complexity introduces failure modes. The right decision is region-by-region, driven by where your actual users are, not by the assumption that global distribution is always better.
Edge vs Multi-Region: Choosing the Right Architecture
The choice between edge deployment (Cloudflare Workers, Vercel Edge Functions, Fastly Compute) and traditional multi-region deployment (4-6 cloud regions with load balancing) involves tradeoffs that aren't immediately obvious from the latency numbers alone.
Edge deployment places compute in 100-300+ cities globally, achieving sub-30ms response times from almost anywhere on Earth. The constraint: edge runtimes are sandboxed environments with meaningful restrictions. Cloudflare Workers, for example, cap memory at 128MB, limit CPU time per invocation (as little as 10ms on the free tier), and restrict certain Node.js APIs. Full-featured database queries, session state, and CPU-intensive operations don't run at the edge — edge functions can only read from fast nearby data sources (KV stores, edge caches, or replicas with sub-10ms read latency from the edge node).
Multi-region deployment across 4-6 traditional cloud regions runs full application logic with no runtime sandbox restrictions, at the cost of moderately higher latency (10-50ms to the nearest region rather than 5-15ms to the nearest edge node) and more infrastructure complexity. This is the right architecture when requests require database writes, session management, or compute that exceeds edge runtime limits.
The architecture most teams end up with: edge functions for static assets, configuration fetches, and read-heavy endpoints with aggressive cache headers; regional deployment for authenticated routes, mutations, and any logic that requires full runtime access. This hybrid captures most of the latency benefit for the majority of requests while avoiding the edge sandbox constraints that would otherwise require significant application restructuring.
Common Mistakes
| Mistake | Impact | Fix |
|---|---|---|
| Single-region deployment for global users | 200-300ms for distant users | Multi-region or edge deployment |
| No CDN for API responses | Every request hits origin | Add Cache-Control headers + CDN |
| Cold DNS lookups | +50ms per new domain | DNS prefetch + keep-alive |
| No connection reuse | +200ms TLS handshake per request | HTTP/2 + keep-alive |
| Measuring latency only from your office | Miss real-world experience | Monitor from multiple global locations |
| Ignoring P99 latency | 1% of users have terrible experience | Optimize tail latency, not just median |
Compare API providers by regional performance on APIScout — latency benchmarks, uptime data, and global coverage maps.
Related: Best CDN APIs for Developers in 2026, Building Multi-Region APIs 2026, Cloudflare Workers vs AWS Lambda@Edge vs Fastly Compute