Building Multi-Tenant APIs: Architecture Patterns 2026
Multi-tenant APIs serve multiple customers (tenants) from a single deployment. Most B2B SaaS products are multi-tenant. The core challenge: keeping tenant data isolated while sharing infrastructure efficiently. Get it wrong and Customer A sees Customer B's data.
TL;DR
- Shared schema (single DB, tenant_id everywhere) is the right starting point for most SaaS: cheapest, simplest, and scales to thousands of tenants
- Row Level Security in PostgreSQL is your safety net for shared schema; enforce it at the database layer, not only the application layer
- Tenant resolution should happen in middleware before any business logic runs — subdomain, JWT claim, or API key lookup
- Noisy neighbor is your first scaling problem; PgBouncer connection pooling and per-tenant rate limits solve 80% of cases
- Separate databases are only justified for enterprise compliance tiers (HIPAA, SOC 2 Type II isolation requirements), not for typical SaaS growth
Database Isolation Strategies
1. Shared Database, Shared Schema
All tenants share one database and tables. A tenant_id column on every table separates data.
SELECT * FROM orders WHERE tenant_id = 'tenant_123' AND id = 'order_456';
Pros: Simplest. Cheapest. Easy to deploy and maintain.
Cons: Hardest to isolate. One missing WHERE tenant_id = clause leaks data. Noisy neighbor risk (one tenant's heavy query slows everyone).
Use when: Early-stage SaaS, small tenants, cost matters more than isolation.
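One way to reduce the risk of a forgotten tenant filter is to route every read through a helper that adds the filter itself. The sketch below is illustrative only: selectForTenant and assertIdentifier are hypothetical names, not part of any library, and the output is a parameterized query object in the { text, values } shape that drivers such as pg accept.

```typescript
// Hypothetical helper: every query it builds is filtered by tenant_id first.
interface ScopedQuery {
  text: string;
  values: unknown[];
}

// Table and column names cannot be bound as parameters, so validate them.
const IDENTIFIER = /^[a-z_][a-z0-9_]*$/;

function assertIdentifier(name: string): string {
  if (!IDENTIFIER.test(name)) {
    throw new Error(`unsafe SQL identifier: ${name}`);
  }
  return name;
}

function selectForTenant(
  tenantId: string,
  table: string,
  where: Record<string, unknown> = {}
): ScopedQuery {
  if (!tenantId) throw new Error("tenantId is required");
  const clauses = ["tenant_id = $1"];
  const values: unknown[] = [tenantId];
  for (const [column, value] of Object.entries(where)) {
    values.push(value);
    clauses.push(`${assertIdentifier(column)} = $${values.length}`);
  }
  return {
    text: `SELECT * FROM ${assertIdentifier(table)} WHERE ${clauses.join(" AND ")}`,
    values,
  };
}

const example = selectForTenant("tenant_123", "orders", { id: "order_456" });
console.log(example.text, example.values);
```

A helper like this narrows the places where a filter can be forgotten, but it is still application code, so it complements database-level enforcement rather than replacing it.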
2. Shared Database, Separate Schemas
Each tenant gets their own database schema (namespace). Tables are identical but isolated.
SET search_path TO tenant_123;
SELECT * FROM orders WHERE id = 'order_456';
Pros: Better isolation than shared schema. Schema-level security. Easier per-tenant migrations.
Cons: Many schemas to manage. Connection pooling across schemas is complex. Backup/restore is per-schema.
Use when: Medium-sized tenants needing data isolation without separate databases.
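Because the schema name ends up in search_path, it should never be built from unvalidated input. A minimal sketch, assuming schemas are named tenant_<slug> and that slugs are restricted to lowercase letters, digits, and underscores (both are assumptions, as are the function names); it uses PostgreSQL's set_config so the schema name travels as a bound parameter and is scoped to the current transaction.

```typescript
// Hypothetical naming rule: schema = "tenant_" + slug, slug limited to [a-z0-9_].
const SLUG = /^[a-z0-9_]{1,50}$/;

function schemaForTenant(slug: string): string {
  if (!SLUG.test(slug)) {
    throw new Error(`invalid tenant slug: ${slug}`);
  }
  return `tenant_${slug}`; // stays under PostgreSQL's 63-byte identifier limit
}

// Parameterized statement that sets search_path for the current transaction only
// (third argument true = local to the transaction).
function searchPathStatement(slug: string): { text: string; values: string[] } {
  return {
    text: "SELECT set_config('search_path', $1, true)",
    values: [schemaForTenant(slug)],
  };
}

console.log(searchPathStatement("123"));
```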
3. Separate Databases
Each tenant gets their own database instance. Complete isolation.
Pros: Strongest isolation. Per-tenant performance tuning. Simplest compliance story.
Cons: Most expensive. Connection management across databases. Cross-tenant queries impossible.
Use when: Enterprise tenants, regulated industries (healthcare, finance), tenants with specific compliance requirements.
API-Level Patterns
Tenant Resolution
How does the API know which tenant a request belongs to?
| Method | Example | Best For |
|---|---|---|
| Subdomain | acme.api.example.com | Customer-facing APIs |
| Header | X-Tenant-Id: acme | Internal/B2B APIs |
| API key | Key maps to tenant | Developer APIs |
| JWT claim | tenant_id in token | OAuth-based APIs |
| URL path | /api/tenants/acme/users | Admin APIs |
Tenant-Scoped Authorization
Every request must be scoped to the authenticated tenant. Implement as middleware that runs before any business logic.
Per-Tenant Rate Limiting
Different tenants get different rate limits based on their plan:
Free tenant: 100 requests/hour
Pro tenant: 10,000 requests/hour
Enterprise tenant: 100,000 requests/hour
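The plan limits above could be enforced with a fixed-window counter keyed by tenant. This is a single-process, in-memory sketch under stated assumptions: the class name and the injectable clock are hypothetical, and a real deployment with several API instances would need a shared store for the counters.

```typescript
type Plan = "free" | "pro" | "enterprise";

// Hourly limits taken from the plan tiers listed above.
const HOURLY_LIMITS: Record<Plan, number> = {
  free: 100,
  pro: 10000,
  enterprise: 100000,
};

const WINDOW_MS = 60 * 60 * 1000;

class TenantRateLimiter {
  private readonly windows = new Map<string, { windowStart: number; count: number }>();
  private readonly now: () => number;

  // The clock is injectable so the limiter can be tested deterministically.
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  // Returns true if the request is allowed, false if the tenant is over its limit.
  allow(tenantId: string, plan: Plan): boolean {
    const windowStart = Math.floor(this.now() / WINDOW_MS) * WINDOW_MS;
    const current = this.windows.get(tenantId);
    if (!current || current.windowStart !== windowStart) {
      // First request of a new window resets the counter.
      this.windows.set(tenantId, { windowStart, count: 1 });
      return true;
    }
    if (current.count >= HOURLY_LIMITS[plan]) return false;
    current.count += 1;
    return true;
  }
}

const limiter = new TenantRateLimiter();
console.log(limiter.allow("tenant_123", "free"));
```

Fixed windows are the simplest option; they allow short bursts at window boundaries, which a sliding window or token bucket would smooth out.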
Data Isolation Enforcement
The most critical rule: never trust the application layer alone for data isolation. Use database-level enforcement (Row Level Security in PostgreSQL, separate schemas, or separate databases) as a safety net.
Row Level Security in PostgreSQL
Row Level Security (RLS) is PostgreSQL's built-in mechanism for enforcing data access at the database level. Even if your application code has a bug and forgets to filter by tenant_id, RLS keeps the query from returning another tenant's rows.
Enabling RLS
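-- Enable RLS on the table
ALTER TABLE orders ENABLE ROW LEVEL SECURITY;
-- Apply policies to the table owner too (owners bypass RLS by default)
ALTER TABLE orders FORCE ROW LEVEL SECURITY;
-- Create a policy: a session can only see rows for its current tenant.
-- The second argument (missing_ok = true) makes an unset tenant yield NULL
-- instead of raising an error, so the comparison matches no rows.
-- Add a ::uuid cast only if tenant_id is a uuid column.
CREATE POLICY tenant_isolation ON orders
USING (tenant_id = current_setting('app.current_tenant_id', true));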
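Once RLS is enabled, rows that fail the policy are filtered out silently: a query for another tenant's order returns zero rows rather than an error, which avoids leaking whether the row exists. Note that superusers and roles with BYPASSRLS always skip policies, and the table owner skips them unless the table is set to FORCE ROW LEVEL SECURITY, so the application should connect as a non-superuser role that is subject to the policy.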
Setting Tenant Context Per Connection
The challenge is telling PostgreSQL which tenant the current request is for. You set a configuration parameter at the start of each request:
BEGIN;
-- Set before executing queries; SET LOCAL only takes effect inside a transaction
SET LOCAL app.current_tenant_id = 'tenant_123';
-- All subsequent queries in this transaction are filtered
SELECT * FROM orders; -- Returns only tenant_123's orders
COMMIT;
In Node.js with pg, wrap each request in a transaction:
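import type { PoolClient } from 'pg';

// Runs fn inside a transaction with the tenant setting applied.
// fn must issue its queries on the same client, or the setting is not visible.
async function withTenantContext<T>(
  client: PoolClient,
  tenantId: string,
  fn: (client: PoolClient) => Promise<T>
): Promise<T> {
  await client.query('BEGIN');
  try {
    // Inside the try block so a failure here also rolls the transaction back
    await client.query(
      "SELECT set_config('app.current_tenant_id', $1, true)",
      [tenantId]
    );
    const result = await fn(client);
    await client.query('COMMIT');
    return result;
  } catch (err) {
    await client.query('ROLLBACK');
    throw err;
  }
}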
The true third argument to set_config scopes the setting to the transaction — it resets when the transaction ends, preventing cross-request leakage in connection pools.
Performance Considerations
RLS adds overhead because every row evaluation runs the policy expression. For high-traffic tables, this matters. Key optimizations:
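Index on tenant_id + primary key. The policy's tenant_id predicate is added to every query on the table, so a composite index lets PostgreSQL satisfy it with an index scan instead of evaluating the policy against every row:
CREATE INDEX CONCURRENTLY orders_tenant_id_idx ON orders (tenant_id, id);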
Avoid expensive functions in policies. current_setting() is fast; avoid JOINs or subqueries in RLS policies. If you need role-based access within a tenant, handle it in the application layer rather than adding complexity to RLS.
Consider partial indexes. If most queries target active records, a partial index can improve both RLS and query performance:
CREATE INDEX orders_active_tenant_idx ON orders(tenant_id, created_at)
WHERE status != 'archived';
The overhead is real but manageable — typically 5-15% for simple policies on well-indexed tables. That's a worthwhile tradeoff for the safety guarantee.
Tenant Resolution in Practice
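Tenant resolution should run as early as possible in the middleware chain, before rate limiting and before any business logic. When the tenant comes from a JWT claim or an API key, that means immediately after authentication; when it comes from the hostname, check it against the authenticated identity before trusting it. If you can't identify the tenant, reject the request.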
Express Middleware
import { Request, Response, NextFunction } from 'express';

// db (the data-access layer) and the Tenant type are assumed to be defined elsewhere.
interface TenantRequest extends Request {
  auth?: { tenantId?: string }; // populated by the auth middleware, if it ran
  tenantId?: string;
  tenant?: Tenant;
}

async function resolveTenant(
  req: TenantRequest,
  res: Response,
  next: NextFunction
) {
  try {
    let tenantId: string | null = null;
    // Strategy 1: Subdomain resolution
    const hostname = req.hostname; // acme.api.example.com
    const subdomain = hostname.split('.')[0];
    if (subdomain && subdomain !== 'api' && subdomain !== 'www') {
      const tenant = await db.tenants.findBySlug(subdomain);
      tenantId = tenant?.id ?? null;
    }
    // Strategy 2: JWT claim (fallback)
    if (!tenantId && req.auth?.tenantId) {
      tenantId = req.auth.tenantId;
    }
    // Strategy 3: API key lookup
    if (!tenantId) {
      const apiKey = req.get('x-api-key');
      if (apiKey) {
        const key = await db.apiKeys.findByKey(apiKey);
        tenantId = key?.tenantId ?? null;
      }
    }
    if (!tenantId) {
      return res.status(400).json({ error: 'tenant_not_found' });
    }
    // The hostname is caller-controlled: it must never override the authenticated tenant
    if (req.auth?.tenantId && req.auth.tenantId !== tenantId) {
      return res.status(403).json({ error: 'tenant_mismatch' });
    }
    req.tenantId = tenantId;
    next();
  } catch (err) {
    next(err); // Express 4 does not catch rejections from async middleware
  }
}
Hono Middleware
import { createMiddleware } from 'hono/factory';
const tenantMiddleware = createMiddleware(async (c, next) => {
// Extract from JWT payload (set by auth middleware before this)
const payload = c.get('jwtPayload');
const tenantId = payload?.tenant_id;
if (!tenantId) {
return c.json({ error: 'tenant_required' }, 400);
}
c.set('tenantId', tenantId);
await next();
});
// Use in routes
app.use('/api/*', authMiddleware, tenantMiddleware);
JWT-Based Tenant Extraction
When using JWTs, embed tenant context directly in the token at login time. This avoids a database lookup on every request:
{
"sub": "user_abc",
"tenant_id": "tenant_123",
"plan": "pro",
"iat": 1709900000,
"exp": 1709903600
}
The downside: if a user is moved between tenants or their tenant is suspended, the token remains valid until expiry. Keep JWT expiry short (15-60 minutes) for multi-tenant systems and use refresh token rotation to pick up tenant changes.
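The expiry rule can also be checked on the server side, so that a long-lived token is rejected even if the issuer was misconfigured. The sketch below assumes the token signature has already been verified by the auth layer; tenantFromClaims, the 60-minute cap, and the in-memory set of suspended tenants are all illustrative assumptions rather than part of any JWT library.

```typescript
interface TenantClaims {
  sub: string;
  tenant_id: string;
  plan: string;
  iat: number; // issued-at, seconds since epoch
  exp: number; // expiry, seconds since epoch
}

// Assumed upper bound, matching the 15-60 minute guidance above.
const MAX_TOKEN_LIFETIME_SECONDS = 60 * 60;

// Returns the tenant ID if the (already signature-verified) claims are usable, else null.
function tenantFromClaims(
  claims: TenantClaims,
  nowSeconds: number,
  suspendedTenants: ReadonlySet<string>
): string | null {
  if (claims.exp <= nowSeconds) return null; // expired
  if (claims.exp - claims.iat > MAX_TOKEN_LIFETIME_SECONDS) return null; // lifetime too long
  if (suspendedTenants.has(claims.tenant_id)) return null; // tenant suspended after issuance
  return claims.tenant_id;
}

const claims: TenantClaims = {
  sub: "user_abc",
  tenant_id: "tenant_123",
  plan: "pro",
  iat: 1709900000,
  exp: 1709901800, // 30 minutes after iat
};
console.log(tenantFromClaims(claims, 1709900600, new Set<string>()));
```

Keeping a small suspended-tenant set in memory (refreshed periodically) is one way to close the gap between suspension and token expiry without a database lookup on every request.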
Per-Tenant Configuration
Multi-tenant SaaS often needs different behavior per tenant — feature flags, custom branding, plan-specific limits. Keep configuration lightweight and cache aggressively.
Feature Flags Per Tenant
interface TenantConfig {
tenantId: string;
plan: 'free' | 'pro' | 'enterprise';
features: {
advancedAnalytics: boolean;
apiAccess: boolean;
customDomain: boolean;
ssoEnabled: boolean;
webhooks: boolean;
};
limits: {
apiRequestsPerHour: number;
storageGb: number;
maxUsers: number;
};
}
// Cache in Redis with 5-minute TTL
async function getTenantConfig(tenantId: string): Promise<TenantConfig> {
const cached = await redis.get(`tenant:config:${tenantId}`);
if (cached) return JSON.parse(cached);
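const config = await db.tenantConfigs.findByTenantId(tenantId);
// Don't cache a missing config: the string "null" would later be served as a valid hit
if (!config) throw new Error(`no config for tenant ${tenantId}`);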
await redis.setex(`tenant:config:${tenantId}`, 300, JSON.stringify(config));
return config;
}
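Once the configuration is loaded, route handlers need a uniform way to act on it. A minimal sketch of two guards follows; requireFeature and canAddUser are hypothetical helpers, and GateConfig simply mirrors the relevant fields of the TenantConfig shape above so the example stands alone.

```typescript
type Feature =
  | "advancedAnalytics"
  | "apiAccess"
  | "customDomain"
  | "ssoEnabled"
  | "webhooks";

// Mirrors the fields of TenantConfig that the guards need.
interface GateConfig {
  plan: "free" | "pro" | "enterprise";
  features: Record<Feature, boolean>;
  limits: { apiRequestsPerHour: number; storageGb: number; maxUsers: number };
}

// Throws when the tenant's plan does not include the feature.
function requireFeature(config: GateConfig, feature: Feature): void {
  if (!config.features[feature]) {
    throw new Error(`feature "${feature}" is not enabled on the ${config.plan} plan`);
  }
}

// True if the tenant can add one more user without exceeding its limit.
function canAddUser(config: GateConfig, currentUsers: number): boolean {
  return currentUsers < config.limits.maxUsers;
}

const freeTenant: GateConfig = {
  plan: "free",
  features: {
    advancedAnalytics: false,
    apiAccess: true,
    customDomain: false,
    ssoEnabled: false,
    webhooks: false,
  },
  limits: { apiRequestsPerHour: 100, storageGb: 1, maxUsers: 3 },
};

requireFeature(freeTenant, "apiAccess"); // passes silently
console.log(canAddUser(freeTenant, 3));
```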
Custom Domains Per Tenant
Enterprise tenants often want api.theirdomain.com instead of theirdomain.yourproduct.com. This requires DNS configuration (CNAME pointing to your infrastructure) plus mapping incoming hostnames to tenant IDs:
// Wildcard DNS: *.yourproduct.com → your servers
// Custom domain: api.acme-corp.com → your servers (via CNAME)
async function resolveTenantByHostname(hostname: string): Promise<string | null> {
  // Check if it's a subdomain of your product domain
  const suffix = '.yourproduct.com';
  if (hostname.endsWith(suffix)) {
    const slug = hostname.slice(0, -suffix.length);
    const tenant = await db.tenants.findBySlug(slug);
    return tenant?.id ?? null; // return the tenant ID, not the tenant record
  }
  // Check custom domain mapping (cached)
  const tenantId = await redis.get(`custom_domain:${hostname}`);
  if (tenantId) return tenantId;
  // Database fallback
  const tenant = await db.tenants.findByCustomDomain(hostname);
  if (tenant) {
    await redis.setex(`custom_domain:${hostname}`, 3600, tenant.id);
    return tenant.id;
  }
  return null;
}
Custom domains require TLS certificate provisioning too. Cloudflare for SaaS or Let's Encrypt with cert-manager on Kubernetes are the typical approaches.
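Hostnames arrive in several spellings (mixed case, explicit port, trailing dot), so it helps to normalize them before they are used as a cache key or lookup value. The following is a sketch under stated assumptions: normalizeHostname and slugFromProductHost are hypothetical helpers, and tenant slugs are assumed to be a single DNS label.

```typescript
// One or more dot-separated labels of letters, digits, and inner hyphens.
const HOSTNAME = /^[a-z0-9]([a-z0-9-]*[a-z0-9])?(\.[a-z0-9]([a-z0-9-]*[a-z0-9])?)+$/;

// Lowercases the host, strips an explicit port and trailing dot, and validates the result.
function normalizeHostname(raw: string): string | null {
  let host = raw.trim().toLowerCase();
  host = host.replace(/:\d+$/, ""); // drop an explicit port
  host = host.replace(/\.$/, ""); // drop the trailing dot of a fully qualified name
  if (host.length === 0 || host.length > 253) return null;
  return HOSTNAME.test(host) ? host : null;
}

// Returns the tenant slug only for direct subdomains of the product domain.
function slugFromProductHost(host: string, productDomain: string): string | null {
  const suffix = `.${productDomain}`;
  if (!host.endsWith(suffix)) return null;
  const slug = host.slice(0, -suffix.length);
  return slug.length > 0 && !slug.includes(".") ? slug : null;
}

const host = normalizeHostname("ACME.YourProduct.com:443");
console.log(host, host ? slugFromProductHost(host, "yourproduct.com") : null);
```

Matching on the dotted suffix rather than the bare domain matters: a hostname like evil-yourproduct.com ends with "yourproduct.com" but is not a subdomain of it.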
Common Mistakes
| Mistake | Impact | Fix |
|---|---|---|
| Missing tenant_id in query | Data leak across tenants | RLS, middleware enforcement |
| Global admin without tenant scope | Access to all tenant data | Explicit tenant context even for admins |
| Shared cache without tenant key | Cache poisoning across tenants | Include tenant_id in all cache keys |
| No per-tenant rate limiting | One tenant impacts others | Tenant-scoped rate limits |
| Shared background jobs | Job failures affect all tenants | Tenant-isolated job queues |
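For the shared-cache row in the table above, a small key-building helper makes it hard to write a cache entry without a tenant prefix. This is an illustrative sketch: tenantCacheKey is a hypothetical function and the tenant:<id>:<parts> layout is just one possible convention.

```typescript
// Builds a cache key that always starts with the tenant ID.
// Each segment is percent-encoded so a value containing ":" cannot
// spill into another segment and collide with a different tenant's key.
function tenantCacheKey(tenantId: string, ...parts: string[]): string {
  if (!tenantId) throw new Error("tenantId is required for cache keys");
  const segments = [tenantId, ...parts].map((part) => encodeURIComponent(part));
  return `tenant:${segments.join(":")}`;
}

console.log(tenantCacheKey("tenant_123", "config"));
console.log(tenantCacheKey("tenant_123", "orders", "order_456"));
```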
Testing Multi-Tenant APIs
Testing multi-tenancy requires explicit isolation tests. The single most important test: verify that Tenant A cannot read, modify, or delete Tenant B's data.
describe('Tenant Isolation', () => {
let tenantA: Tenant;
let tenantB: Tenant;
let tokenA: string;
let tokenB: string;
beforeAll(async () => {
tenantA = await createTestTenant({ slug: 'tenant-a' });
tenantB = await createTestTenant({ slug: 'tenant-b' });
tokenA = await createTestToken({ tenantId: tenantA.id });
tokenB = await createTestToken({ tenantId: tenantB.id });
// Create a resource owned by tenant B
await db.orders.create({ tenantId: tenantB.id, id: 'order-b-1' });
});
it('tenant A cannot read tenant B orders', async () => {
const res = await request(app)
.get('/api/orders/order-b-1')
.set('Authorization', `Bearer ${tokenA}`);
expect(res.status).toBe(404); // Not 403 — don't leak existence
});
it('tenant A cannot list tenant B orders', async () => {
const res = await request(app)
.get('/api/orders')
.set('Authorization', `Bearer ${tokenA}`);
expect(res.status).toBe(200);
const orderIds = res.body.data.map((o: any) => o.id);
expect(orderIds).not.toContain('order-b-1');
});
it('tenant A cannot update tenant B orders', async () => {
const res = await request(app)
.patch('/api/orders/order-b-1')
.set('Authorization', `Bearer ${tokenA}`)
.send({ status: 'cancelled' });
expect(res.status).toBe(404);
});
});
Return 404 (not 403) when a tenant tries to access another tenant's resource. A 403 confirms the resource exists — information that should be hidden.
Also test the RLS layer directly, bypassing the application:
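it('RLS blocks direct database access without tenant context', async () => {
// Assumes db connects as a non-owner, non-superuser role (owners and superusers
// bypass RLS) and that the policy treats an unset tenant as matching no rows.
// Without setting tenant context, query should return zero rows
const result = await db.query('SELECT * FROM orders WHERE id = $1', ['order-b-1']);
expect(result.rows).toHaveLength(0);
});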
Scaling Multi-Tenant Architectures
The Noisy Neighbor Problem
In shared schema architecture, one tenant running expensive queries slows down all other tenants. This is the #1 scaling complaint for early-stage multi-tenant SaaS. Solutions:
Query timeouts per tenant: Set statement timeouts based on plan tier:
-- Free tier: 5-second query timeout
SET LOCAL statement_timeout = '5s';
-- Enterprise tier: 30-second timeout
SET LOCAL statement_timeout = '30s';
Read replicas for heavy tenants: Route reporting and analytics queries for large tenants to read replicas, keeping the primary free for transactional workloads.
Work queues: Move expensive operations (report generation, bulk exports, complex calculations) to background jobs with per-tenant concurrency limits.
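The per-tenant concurrency limit mentioned for work queues can be sketched as a simple counter per tenant. This is an in-memory illustration with hypothetical names; a job system running on several workers would need to keep these counters in a shared store.

```typescript
// Caps how many background jobs a single tenant may run at once.
class TenantConcurrencyLimiter {
  private readonly running = new Map<string, number>();
  private readonly maxPerTenant: number;

  constructor(maxPerTenant: number) {
    this.maxPerTenant = maxPerTenant;
  }

  // Reserves a slot for the tenant; returns false if it is already at its cap.
  tryAcquire(tenantId: string): boolean {
    const current = this.running.get(tenantId) ?? 0;
    if (current >= this.maxPerTenant) return false;
    this.running.set(tenantId, current + 1);
    return true;
  }

  // Frees a slot; call this in a finally block once the job ends.
  release(tenantId: string): void {
    const current = this.running.get(tenantId) ?? 0;
    if (current <= 1) {
      this.running.delete(tenantId);
    } else {
      this.running.set(tenantId, current - 1);
    }
  }
}

const jobs = new TenantConcurrencyLimiter(2);
console.log(jobs.tryAcquire("tenant_123"));
```

A job that fails to acquire a slot would typically be re-queued with a delay rather than dropped, so a busy tenant waits behind its own work instead of everyone else's.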
Connection Pooling with PgBouncer
Multi-tenant apps open many short-lived connections: one per request, potentially thousands per second. PostgreSQL dedicates a backend process to each connection and ships with max_connections = 100; beyond a few hundred simultaneous connections (around 500 is a common rule of thumb), performance degrades. PgBouncer solves this with connection pooling.
Transaction mode pooling (the right choice for multi-tenant) recycles connections between transactions. This is compatible with SET LOCAL for RLS context because the local setting is scoped to the transaction:
# PgBouncer config
[databases]
myapp = host=localhost port=5432 dbname=myapp
[pgbouncer]
pool_mode = transaction
max_client_conn = 10000
default_pool_size = 25
With this configuration, PgBouncer accepts up to 10,000 client connections and multiplexes them onto a pool of 25 PostgreSQL connections per database/user pair. For Prisma users, Prisma Accelerate includes connection pooling built-in, which is the simplest path.
When to Shard by Tenant
Schema-per-tenant and database-per-tenant approaches become necessary when:
- A single tenant generates more than ~20% of your total database load
- Enterprise contracts require data residency in specific regions/clouds
- A tenant's data volume exceeds what's manageable in a shared database (hundreds of millions of rows)
- Compliance requirements mandate physical separation
Sharding by tenant typically happens at the "enterprise tier" of your pricing — it's a cost you can pass through. Start with shared schema, add RLS and per-tenant rate limits early, and migrate individual tenants to dedicated infrastructure when they grow large enough to justify it.
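The criteria above can be turned into a periodic check over per-tenant metrics. The sketch below is illustrative: the field names and reasonsToIsolate are hypothetical, the 20% load threshold comes from the list above, and the 200 million row cutoff is an assumed stand-in for "hundreds of millions of rows".

```typescript
interface TenantMetrics {
  loadShare: number; // fraction of total database load, between 0 and 1
  rowCount: number;
  needsDataResidency: boolean;
  needsPhysicalSeparation: boolean;
}

interface IsolationThresholds {
  maxLoadShare: number;
  maxRowCount: number;
}

// Assumed defaults; tune them to your own workload.
const DEFAULT_THRESHOLDS: IsolationThresholds = {
  maxLoadShare: 0.2,
  maxRowCount: 200000000,
};

// Returns the reasons a tenant qualifies for dedicated infrastructure (empty if none).
function reasonsToIsolate(
  metrics: TenantMetrics,
  thresholds: IsolationThresholds = DEFAULT_THRESHOLDS
): string[] {
  const reasons: string[] = [];
  if (metrics.loadShare > thresholds.maxLoadShare) reasons.push("load");
  if (metrics.rowCount > thresholds.maxRowCount) reasons.push("data_volume");
  if (metrics.needsDataResidency) reasons.push("data_residency");
  if (metrics.needsPhysicalSeparation) reasons.push("compliance");
  return reasons;
}

console.log(
  reasonsToIsolate({
    loadShare: 0.25,
    rowCount: 1000000,
    needsDataResidency: false,
    needsPhysicalSeparation: false,
  })
);
```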
For related patterns, see API rate limiting best practices and building APIs with TypeScript for type-safe implementations of tenant resolution middleware. If you're handling sensitive multi-tenant data, the API security checklist covers additional hardening steps.
Conclusion
Multi-tenancy is fundamentally a data isolation problem dressed up as an architecture problem. The database layer — not the application layer — is where isolation guarantees come from. Start with shared schema and PostgreSQL RLS, enforce tenant context in middleware before anything else runs, and design your testing suite to explicitly verify isolation. Scale to separate schemas or databases only when specific tenants justify the operational overhead.
The common failure mode isn't choosing the wrong isolation strategy — it's trusting the application to filter data correctly without a database-level safety net. One forgotten WHERE tenant_id = clause in a code review is all it takes.
The investment in database-level isolation pays compound dividends: it reduces the worst-case outcome of a code bug from a cross-tenant data breach to a query result error, and it makes your data handling auditable when enterprise security teams review your architecture during procurement. Most enterprise procurement checklists specifically ask whether your data isolation controls are enforced at the database layer or only in application code — the answer matters.