APIScout Team

The Global API Latency Map: Where Are APIs Fastest?

Your API might respond in 50ms from your office. But what about from São Paulo? Lagos? Jakarta? For global applications, latency isn't one number — it's a map. Geography determines whether your API feels instant or sluggish, and the difference between regions can be 10x.

How API Latency Works

The Physics Problem

Light travels through fiber optic cable at roughly 200,000 km/s — about two-thirds the speed of light in a vacuum. That means:

| Route | Distance | One-Way Minimum (in fiber) | Realistic RTT |
| --- | --- | --- | --- |
| NYC → London | 5,500 km | 28ms | 70-80ms |
| NYC → Tokyo | 10,800 km | 54ms | 150-180ms |
| NYC → Sydney | 16,000 km | 80ms | 200-250ms |
| NYC → Mumbai | 12,500 km | 63ms | 180-220ms |
| London → Singapore | 10,800 km | 54ms | 160-200ms |
| São Paulo → Tokyo | 18,500 km | 93ms | 280-320ms |

These are physical minimums. Real-world latency adds DNS lookup (10-50ms), a TLS handshake (30-100ms), server processing, and multiple network hops.
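The floor in the table falls out of one constant: light in fiber covers about 200 km per millisecond. A minimal sketch of the propagation math (the function names are ours, not from any library):

```typescript
// Light in fiber covers roughly 200 km per millisecond (~200,000 km/s).
const FIBER_KM_PER_MS = 200;

// One-way propagation minimum, ignoring routing detours, hops, and queuing.
function fiberOneWayMs(distanceKm: number): number {
  return distanceKm / FIBER_KM_PER_MS;
}

// The round-trip floor is simply double the one-way figure.
function fiberMinRttMs(distanceKm: number): number {
  return 2 * fiberOneWayMs(distanceKm);
}

// NYC → London (~5,500 km): 27.5ms one way, 55ms minimum RTT.
// Observed RTTs of 70-80ms show how close real routes get to the physical floor.
```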

What Makes Up API Latency

Total latency = DNS + TCP handshake + TLS handshake + network transit + server processing + response transfer

Example (NYC → Singapore API call, RTT ≈ 160ms):
  DNS lookup:         10ms (cached) or 50ms (cold)
  TCP handshake:     160ms (one RTT)
  TLS handshake:     160ms (one RTT with TLS 1.3; two RTTs with TLS 1.2)
  Request transit:    80ms (one way)
  Server processing:  20ms
  Response transit:   80ms
  ─────────────────────
  Total:            ~550ms (cold) or ~180ms (warm, with connection reuse)
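The budget above reduces to a small calculation. This sketch assumes a round trip of ~160ms for NYC → Singapore and a TLS 1.3 handshake (one round trip); the interface and helper names are ours:

```typescript
// Illustrative latency budget, not a measurement.
interface Budget {
  dnsMs: number;    // DNS lookup (0 when cached and the connection is warm)
  rttMs: number;    // one full round trip between client and server
  serverMs: number; // server-side processing time
}

// Cold call: DNS + TCP handshake (1 RTT) + TLS 1.3 handshake (1 RTT)
// + request/response transit (1 RTT) + server processing.
function coldCallMs({ dnsMs, rttMs, serverMs }: Budget): number {
  return dnsMs + 3 * rttMs + serverMs;
}

// Warm call: connection reused, so only transit and server time remain.
function warmCallMs({ rttMs, serverMs }: Budget): number {
  return rttMs + serverMs;
}

const nycToSingapore: Budget = { dnsMs: 50, rttMs: 160, serverMs: 20 };
// coldCallMs(nycToSingapore) → 550, warmCallMs(nycToSingapore) → 180
```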

API Latency by Region

Cloud Region Performance (2026)

Average response times for a simple REST API (CRUD, <1KB response) from each region:

| Origin Region | To US East | To EU West | To Asia East | To Australia | To South America |
| --- | --- | --- | --- | --- | --- |
| US East (Virginia) | 5-15ms | 70-90ms | 150-200ms | 200-250ms | 120-160ms |
| US West (Oregon) | 60-80ms | 130-160ms | 100-140ms | 150-180ms | 150-180ms |
| EU West (Ireland) | 70-90ms | 5-15ms | 200-250ms | 280-320ms | 180-220ms |
| EU Central (Frankfurt) | 80-100ms | 10-20ms | 180-220ms | 260-300ms | 200-240ms |
| Asia East (Tokyo) | 150-200ms | 200-250ms | 5-15ms | 100-130ms | 280-320ms |
| Asia Southeast (Singapore) | 200-240ms | 160-200ms | 30-50ms | 80-120ms | 300-350ms |
| Australia (Sydney) | 200-250ms | 280-320ms | 100-130ms | 5-15ms | 300-350ms |
| South America (São Paulo) | 120-160ms | 180-220ms | 280-320ms | 300-350ms | 5-15ms |

The Multi-Region Gap

A single-region API has wildly different performance for global users:

API hosted in US East (Virginia):
  User in New York:     15ms ✅
  User in London:       80ms ⚠️ Noticeable
  User in Tokyo:       180ms ❌ Feels slow
  User in Sydney:      230ms ❌ Frustrating
  User in São Paulo:   140ms ⚠️ Noticeable

Rule of thumb: Users notice latency above 100ms. Above 300ms feels broken.
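That rule of thumb is easy to encode when bucketing monitoring data; a minimal sketch using the thresholds above (names are ours):

```typescript
// Perceived quality buckets from the rule of thumb:
// under ~100ms feels instant, 100-300ms is noticeable, above 300ms feels broken.
type Feel = 'instant' | 'noticeable' | 'broken';

function perceivedFeel(latencyMs: number): Feel {
  if (latencyMs <= 100) return 'instant';
  if (latencyMs <= 300) return 'noticeable';
  return 'broken';
}

// perceivedFeel(15)  → 'instant'    (New York user of a Virginia-hosted API)
// perceivedFeel(230) → 'noticeable' (Sydney user)
// perceivedFeel(350) → 'broken'
```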

How Major API Providers Perform

Global Latency Benchmarks

| Provider | Regions | Median Global Latency | P99 Latency | Edge PoPs |
| --- | --- | --- | --- | --- |
| Cloudflare | 300+ cities | 15-30ms | 50-80ms | 300+ |
| AWS CloudFront | 60+ regions | 20-40ms | 60-100ms | 400+ |
| Google Cloud CDN | 40+ regions | 25-45ms | 70-120ms | 180+ |
| Fastly | 30+ regions | 15-35ms | 50-90ms | 90+ |
| Azure CDN | 120+ PoPs | 25-50ms | 80-130ms | 120+ |

API-Specific Provider Latency

| API Provider | Architecture | US Latency | EU Latency | Asia Latency | Global Avg |
| --- | --- | --- | --- | --- | --- |
| Stripe | Multi-region | 50-80ms | 50-80ms | 100-150ms | 80-100ms |
| Twilio | Multi-region | 60-100ms | 60-100ms | 120-180ms | 90-120ms |
| Auth0 | Multi-region | 80-120ms | 80-120ms | 150-250ms | 100-150ms |
| Algolia | Distributed search | 10-30ms | 10-30ms | 20-50ms | 15-35ms |
| OpenAI | US primary | 200-500ms | 250-600ms | 300-700ms | 300-600ms |
| Anthropic | US primary | 200-400ms | 250-500ms | 300-600ms | 250-500ms |

Note: AI API latency is dominated by inference time, not network latency. The 200-600ms is mostly compute, not geography.

Strategies for Reducing Global Latency

1. Deploy to Multiple Regions

// Multi-region deployment with geo-routing
// DNS routes users to nearest region automatically

// vercel.json — auto-deploys to all edge regions
{
  "regions": ["iad1", "cdg1", "hnd1", "syd1", "gru1"]
}

// Or with Cloudflare Workers — runs in 300+ cities
export default {
  async fetch(request: Request) {
    // This code runs in the city nearest the user
    const data = await getFromNearestDB(request);
    return Response.json(data);
  }
};

2. Edge Caching

// Cache API responses at the edge — CDN serves from nearest PoP
export async function GET(request: Request) {
  const data = await fetchProducts();

  return Response.json(data, {
    headers: {
      'Cache-Control': 'public, s-maxage=60, stale-while-revalidate=300',
      'CDN-Cache-Control': 'public, max-age=300',
      'Surrogate-Control': 'max-age=3600',
    },
  });
}

// Impact:
// Without cache: 200ms (origin in Virginia → user in Tokyo)
// With edge cache: 15ms (Tokyo PoP → user in Tokyo)

3. Edge Database Patterns

| Solution | Type | Read Latency | Write Latency | Consistency |
| --- | --- | --- | --- | --- |
| Cloudflare D1 | SQLite at edge | <5ms | <10ms (local) | Eventual |
| Turso | Distributed SQLite | 5-15ms | 50-100ms | Strong |
| PlanetScale | Distributed MySQL | 10-30ms | 30-80ms | Strong |
| Neon | Serverless Postgres | 20-50ms | 30-80ms | Strong |
| DynamoDB Global Tables | Key-value | 5-10ms | 50-100ms | Eventual |
| CockroachDB | Distributed SQL | 10-30ms | 50-150ms | Strong |

4. Connection Optimization

// Reduce handshake overhead by reusing connections (HTTP/1.1 keep-alive;
// HTTP/2 goes further, multiplexing many requests over one connection)
import https from 'node:https';

const agent = new https.Agent({
  keepAlive: true,        // reuse sockets instead of opening a new one per request
  maxSockets: 10,         // cap concurrent connections per host
  keepAliveMsecs: 60000,  // keep idle sockets alive for 60s
});

// Savings per request with a warm connection (RTT ≈ 160ms):
// Cold:  DNS(50) + TCP(160) + TLS(160) + Transit(160) = 530ms
// Warm:  Transit(160) = 160ms  (~70% reduction)

5. Regional API Routing

// Route API calls to nearest provider region
const REGION_ENDPOINTS = {
  'us': 'https://us.api.example.com',
  'eu': 'https://eu.api.example.com',
  'ap': 'https://ap.api.example.com',
};

function getNearestEndpoint(userRegion: string): string {
  if (['US', 'CA', 'MX'].includes(userRegion)) return REGION_ENDPOINTS.us;
  if (['GB', 'DE', 'FR', 'NL', 'IT', 'ES'].includes(userRegion)) return REGION_ENDPOINTS.eu;
  return REGION_ENDPOINTS.ap;
}

// Or use DNS-based routing (Route 53, Cloudflare Load Balancing)
// Automatically routes to nearest healthy endpoint

6. Prefetching and Preconnecting

<!-- Reduce perceived latency with resource hints -->

<!-- DNS prefetch — saves 50ms on first request -->
<link rel="dns-prefetch" href="https://api.example.com" />

<!-- Preconnect — saves 200ms+ (DNS + TCP + TLS); add crossorigin when the API is called with CORS so the warmed connection is reused -->
<link rel="preconnect" href="https://api.example.com" crossorigin />

<!-- Prefetch data — fetches before the user needs it -->
<link rel="prefetch" href="https://api.example.com/v1/config" />

// Client-side: preconnect to the API on page load
if (typeof window !== 'undefined') {
  const link = document.createElement('link');
  link.rel = 'preconnect';
  link.href = 'https://api.example.com';
  document.head.appendChild(link);
}

Measuring API Latency

Tools

| Tool | What It Measures | Coverage |
| --- | --- | --- |
| Pingdom | HTTP response time | 100+ locations |
| Uptrends | Full waterfall timing | 230+ checkpoints |
| Checkly | API monitoring from multiple regions | 20+ regions |
| Catchpoint | Network-level latency | 2,800+ nodes |
| ThousandEyes | Path analysis and latency | ISP-level visibility |
| Cloudflare Radar | Internet latency trends | Global |

DIY Global Latency Test

// Test your API from multiple regions using Cloudflare Workers
export default {
  async fetch(request: Request) {
    const endpoints = [
      { region: 'US East', url: 'https://us-east.api.example.com/health' },
      { region: 'EU West', url: 'https://eu-west.api.example.com/health' },
      { region: 'Asia', url: 'https://asia.api.example.com/health' },
    ];

    const results = await Promise.all(
      endpoints.map(async ({ region, url }) => {
        const start = performance.now();
        const response = await fetch(url);
        const latency = performance.now() - start;
        return { region, latency: Math.round(latency), status: response.status };
      })
    );

    return Response.json({
      measured_from: request.cf?.city,
      results,
    });
  },
};

Regional Optimization Playbook

| Your Users Are In | Deploy To | Cache At | Database |
| --- | --- | --- | --- |
| US only | us-east-1 | CloudFront US | RDS / PlanetScale |
| US + EU | us-east-1 + eu-west-1 | CloudFront | PlanetScale (multi-region) |
| Global | Edge (Workers/Vercel) | Every PoP | Turso / D1 / DynamoDB Global |
| Asia-focused | ap-northeast-1 + ap-southeast-1 | CloudFront AP | DynamoDB AP |
| Latency-critical | Every major region | Aggressive edge cache | Read replicas everywhere |

The Business Case for Latency Investment

Latency optimization is frequently framed as a technical concern, but the business case is quantitative and well-documented across industries. Amazon's internal research — widely cited in web performance literature — found that every 100ms of additional latency reduced revenue by approximately 1%. Google documented that a 500ms increase in search latency reduced searches by 20%. Akamai's research found 1-second delays reduce customer satisfaction by 16% and conversions by 7%.

For API-heavy applications, the relationship between latency and business outcomes is particularly direct. An API powering a checkout flow, a search experience, or a real-time collaborative tool draws a straight, measurable line from response time to revenue. A checkout API that adds 300ms of latency between button click and confirmation degrades conversion even when users don't consciously notice the delay.

The ROI calculation for latency investment is straightforward: measure current conversion rate by latency bucket, quantify how many users are experiencing elevated latency due to geographic distance, project the revenue impact of moving them from high-latency to low-latency, then compare against the infrastructure cost of a CDN layer or a second region. For most applications with meaningful global traffic, this math favors multi-region or edge deployment long before traffic volumes feel large enough to justify the operational complexity. A $200/month additional region that improves conversion for 20% of users typically pays for itself within a few weeks.
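As a sketch of that arithmetic, here is the projection step with made-up inputs (every number and name below is an illustrative assumption, not a benchmark):

```typescript
// Net monthly benefit of adding a region: projected extra revenue minus its cost.
interface RoiInputs {
  monthlyOrders: number;      // orders from users currently on high-latency paths
  avgOrderValue: number;      // dollars per order
  conversionLiftPct: number;  // expected relative conversion lift, e.g. 1 for 1%
  regionCostPerMonth: number; // added infrastructure cost in dollars
}

function monthlyNetRoi(i: RoiInputs): number {
  const extraRevenue = (i.monthlyOrders * i.avgOrderValue * i.conversionLiftPct) / 100;
  return extraRevenue - i.regionCostPerMonth;
}

// Hypothetical: 5,000 orders/month from high-latency users, $40 average order,
// 1% conversion lift, $200/month extra region → $1,800/month net benefit.
```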

The counterargument is real: distributed infrastructure adds complexity, and complexity introduces failure modes. The right decision is region-by-region, driven by where your actual users are, not by the assumption that global distribution is always better.

Edge vs Multi-Region: Choosing the Right Architecture

The choice between edge deployment (Cloudflare Workers, Vercel Edge Functions, Fastly Compute) and traditional multi-region deployment (4-6 cloud regions with load balancing) involves tradeoffs that aren't immediately obvious from the latency numbers alone.

Edge deployment places compute in 100-300+ cities globally, achieving sub-30ms response times from almost anywhere on Earth. The constraint: edge runtimes are sandboxed environments with meaningful restrictions. Cloudflare Workers limit CPU time to 10-50ms per invocation, cap memory at 128MB, and restrict certain Node.js APIs. Full-featured database queries, session state, and CPU-intensive operations don't run at the edge — edge functions can only read from fast nearby data sources (KV stores, edge caches, or replicas with sub-10ms read latency from the edge node).

Multi-region deployment across 4-6 traditional cloud regions runs full application logic with no runtime sandbox restrictions, at the cost of moderately higher latency (10-50ms to the nearest region rather than 5-15ms to the nearest edge node) and more infrastructure complexity. This is the right architecture when requests require database writes, session management, or compute that exceeds edge runtime limits.

The architecture most teams end up with: edge functions for static assets, configuration fetches, and read-heavy endpoints with aggressive cache headers; regional deployment for authenticated routes, mutations, and any logic that requires full runtime access. This hybrid captures most of the latency benefit for the majority of requests while avoiding the edge sandbox constraints that would otherwise require significant application restructuring.
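The routing rule behind that hybrid fits in a few lines. This is a sketch of the decision, not any platform's API; the route shape is hypothetical:

```typescript
// Edge-eligible: cacheable, unauthenticated reads. Everything else — writes,
// sessions, heavy compute — runs in a full regional deployment.
interface RouteInfo {
  method: string;         // HTTP method
  authenticated: boolean; // requires a session or auth check
  cacheable: boolean;     // safe to serve from an edge cache
}

function runsAtEdge(route: RouteInfo): boolean {
  return route.method === 'GET' && !route.authenticated && route.cacheable;
}

// runsAtEdge({ method: 'GET', authenticated: false, cacheable: true })  → true  (e.g. a config fetch)
// runsAtEdge({ method: 'POST', authenticated: true, cacheable: false }) → false (e.g. a checkout mutation)
```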

Common Mistakes

| Mistake | Impact | Fix |
| --- | --- | --- |
| Single-region deployment for global users | 200-300ms for distant users | Multi-region or edge deployment |
| No CDN for API responses | Every request hits origin | Add Cache-Control headers + CDN |
| Cold DNS lookups | +50ms per new domain | DNS prefetch + keep-alive |
| No connection reuse | +200ms TLS handshake per request | HTTP/2 + keep-alive |
| Measuring latency only from your office | Misses real-world experience | Monitor from multiple global locations |
| Ignoring P99 latency | 1% of users have a terrible experience | Optimize tail latency, not just median |

Compare API providers by regional performance on APIScout — latency benchmarks, uptime data, and global coverage maps.

Related: Best CDN APIs for Developers in 2026, Building Multi-Region APIs 2026, Cloudflare Workers vs AWS Lambda@Edge vs Fastly Compute

The API Integration Checklist (Free PDF)

Step-by-step checklist: auth setup, rate limit handling, error codes, SDK evaluation, and pricing comparison for 50+ APIs. Used by 200+ developers.
