Skip to content

Multi-Tenant Isolation Architecture

Complete defense-in-depth isolation model preventing cross-tenant data leakage, resource exhaustion, and observability interference.

Prerequisites: PRIMITIVES.md, PRINCIPLES.md, workers-for-platforms.md, durable-object-architecture.md


ThreatImpactLikelihood
Data Leakage — Tenant A reads Tenant B’s FactsCriticalMedium
Write Contamination — Tenant A writes to Tenant B’s ledgerCriticalLow
Resource Exhaustion — Tenant A depletes Tenant B’s quotaHighHigh
Enumeration — Tenant A discovers Tenant B existsMediumMedium
Timing Attack — Tenant A infers Tenant B’s activity from latencyLowLow
Privilege Escalation — User becomes admin without authorizationCriticalLow
Audit Evasion — Actions occur without FactsCriticalLow
ActorCapabilitiesMotivation
Malicious TenantValid API credentials, knowledge of APICompetitive intelligence, data theft
Compromised UserStolen API key, limited scopeOpportunistic access, pivot to full tenant
InsiderPlatform access, elevated privilegesSabotage, exfiltration
External AttackerNo initial access, network probingCredential theft, system disruption

z0 implements 6 defense layers. Each layer assumes the layer above it can be breached.

Purpose: Prevent unauthorized traffic from reaching the platform.

Internet → Cloudflare CDN → tenant-specific subdomain → Dispatch Worker

Controls:

ControlMechanismEnforcement
Tenant subdomain routing{tenant_id}.web1.co → tenant namespaceDNS + WfP
IP allowlist (optional)Per-tenant IP rangesCloudflare WAF
Rate limiting (edge)Per-tenant quotasCloudflare Rate Limiting
DDoS protectionAdaptive thresholdsCloudflare Magic Transit

Configuration:

# Per-tenant network policy
NetworkPolicy:
tenant_id: tenant_acme
subdomain: acme.web1.co
ip_allowlist:
- 203.0.113.0/24 # Acme office network
rate_limit:
requests_per_second: 100
burst: 200

Breach Impact: Attacker reaches Dispatch Worker but has no credentials.


Purpose: Verify the caller’s identity before processing requests.

// Dispatch Worker: Extract tenant from authentication
async function authenticateRequest(request: Request): Promise<AuthContext> {
const authHeader = request.headers.get('Authorization');
if (!authHeader) {
throw new UnauthorizedError('Missing authorization');
}
// API Key format: z0_live_tenant_acme_abc123...
// Structure: z0_{env}_{tenant_id}_{secret}
const apiKey = authHeader.replace('Bearer ', '');
const [prefix, env, tenantId, ...secretParts] = apiKey.split('_');
if (prefix !== 'z0' || !tenantId) {
throw new UnauthorizedError('Invalid API key format');
}
// Verify key against KV (hashed)
const storedHash = await env.API_KEYS.get(`key:${apiKey.substring(0, 32)}`);
const actualHash = await sha256(apiKey);
if (storedHash !== actualHash) {
throw new UnauthorizedError('Invalid API key');
}
// tenant_id extracted from authenticated key structure, NOT from request
return {
tenant_id: tenantId,
environment: env,
key_id: apiKey.substring(0, 32),
scopes: await loadScopes(tenantId, apiKey)
};
}

Critical Invariant:

// tenant_id ALWAYS from auth token, NEVER from request params/body/headers
function validateTenantId(requestTenantId: string | null, authTenantId: string): void {
// If request specifies tenant_id, it MUST match auth
if (requestTenantId && requestTenantId !== authTenantId) {
throw new ForbiddenError('Tenant mismatch');
}
}

API Key Structure:

ComponentExamplePurpose
Prefixz0Identify z0 keys (for secret scanning)
Environmentlive, testPrevent test keys in production
Tenant IDtenant_acmeBind key to specific tenant
Secretabc123... (32+ chars)Cryptographic entropy

Key Rotation:

// Keys have expiration and rotation support
interface ApiKeyMetadata {
key_id: string;
tenant_id: string;
created_at: number;
expires_at: number | null;
scopes: string[];
last_used_at: number;
rotation_parent_id: string | null; // For graceful rotation
}
// Rotation creates new key, old key valid for 7 days
async function rotateApiKey(oldKeyId: string): Promise<string> {
const oldKey = await getKeyMetadata(oldKeyId);
const newKey = await generateApiKey(oldKey.tenant_id, oldKey.scopes);
await setKeyMetadata(newKey.key_id, {
...newKey,
rotation_parent_id: oldKeyId
});
await setKeyMetadata(oldKeyId, {
...oldKey,
expires_at: Date.now() + 7 * 24 * 60 * 60 * 1000 // 7 days
});
return formatApiKey(newKey);
}

Breach Impact: Attacker has valid credentials for Tenant A. Layers 3-6 prevent access to Tenant B.


Purpose: Ensure authenticated principal has permission for the requested action.

Authorization Model:

interface AuthContext {
tenant_id: string; // From API key structure
user_id?: string; // Optional: user context
scopes: string[]; // API key scopes
role?: string; // User role (if user context exists)
}
// Scopes define capability boundaries
type Scope =
| 'read:facts' // Read Facts for this tenant
| 'write:facts' // Append Facts
| 'read:configs' // Read Configs
| 'write:configs' // Create/update Configs
| 'read:entities' // Read Entities
| 'write:entities' // Create/update Entities
| 'read:billing' // View invoices/statements
| 'admin:all'; // Full tenant admin (dangerous)
// Check scope before operations
function requireScope(authCtx: AuthContext, requiredScope: Scope): void {
if (!authCtx.scopes.includes(requiredScope) && !authCtx.scopes.includes('admin:all')) {
throw new ForbiddenError(`Missing required scope: ${requiredScope}`);
}
}

User Access Control (via Facts):

// User access is tracked via Facts, not mutable state
async function checkUserAccess(userId: string, entityId: string, permission: string): Promise<boolean> {
// Query access_* Facts for this user+entity
const accessFacts = await queryFacts({
tenant_id: authCtx.tenant_id, // Scoped to tenant
type: ['access_granted', 'access_modified', 'access_revoked'],
user_id: userId,
entity_id: entityId
});
// Build current access state from Fact history
const accessState = deriveAccessState(accessFacts);
return accessState.permissions.includes(permission);
}
// Derive current access from Fact ledger
function deriveAccessState(facts: Fact[]): AccessState {
const sorted = facts.sort((a, b) => a.timestamp - b.timestamp);
let state: AccessState = { permissions: [], role: null };
for (const fact of sorted) {
switch (fact.type) {
case 'access_granted':
state.permissions = fact.data.permissions;
state.role = fact.data.role;
break;
case 'access_modified':
state.permissions = fact.data.permissions;
state.role = fact.data.role;
break;
case 'access_revoked':
state.permissions = [];
state.role = null;
break;
}
}
return state;
}

Access Fact Schema:

// Grant access
Fact {
type: 'access_granted',
user_id: 'user_jane',
entity_id: 'account_acme',
tenant_id: 'tenant_acme',
data: {
role: 'admin',
permissions: ['read', 'write', 'delete'],
granted_by: 'user_admin',
reason: 'Onboarded as account admin'
}
}
// Modify access
Fact {
type: 'access_modified',
user_id: 'user_jane',
entity_id: 'account_acme',
tenant_id: 'tenant_acme',
data: {
role: 'viewer',
permissions: ['read'],
modified_by: 'user_admin',
reason: 'Reduced to read-only access'
}
}
// Revoke access
Fact {
type: 'access_revoked',
user_id: 'user_jane',
entity_id: 'account_acme',
tenant_id: 'tenant_acme',
data: {
revoked_by: 'user_admin',
reason: 'Employee terminated'
}
}

Breach Impact: Attacker has valid Tenant A credentials with scopes. Layers 4-6 prevent data contamination.


Layer 4: Data Isolation (tenant_id Enforcement)

Section titled “Layer 4: Data Isolation (tenant_id Enforcement)”

Purpose: Ensure every data operation is scoped to the authenticated tenant.

Critical Rule: tenant_id is denormalized on all Entities and Facts for O(1) scoping.

// CORRECT: tenant_id from auth context
async function queryFacts(authCtx: AuthContext, filters: Filters): Promise<Fact[]> {
// tenant_id injected from auth, NOT from request
const facts = await db.query(`
SELECT * FROM facts
WHERE tenant_id = ?
AND type = ?
AND timestamp >= ?
`, [authCtx.tenant_id, filters.type, filters.since]);
return facts;
}
// WRONG: tenant_id from request (can be tampered)
async function queryFactsWrong(request: Request): Promise<Fact[]> {
const { tenant_id, type, since } = await request.json(); // NEVER DO THIS
return db.query(`SELECT * FROM facts WHERE tenant_id = ?`, [tenant_id]);
}

Append Fact Validation:

async function appendFact(authCtx: AuthContext, fact: Fact): Promise<Fact> {
// Force tenant_id from auth, ignore request value
fact.tenant_id = authCtx.tenant_id;
// Validate entity belongs to this tenant (if specified)
if (fact.entity_id) {
const entity = await getEntity(fact.entity_id);
if (entity.tenant_id !== authCtx.tenant_id) {
throw new ForbiddenError('Entity does not belong to this tenant');
}
}
// Validate campaign belongs to this tenant
if (fact.campaign_id) {
const campaign = await getEntity(fact.campaign_id);
if (campaign.tenant_id !== authCtx.tenant_id) {
throw new ForbiddenError('Campaign does not belong to this tenant');
}
}
// Append to tenant-scoped DO
const accountDO = getAccountDO(authCtx.tenant_id, fact.entity_id);
return accountDO.appendFact(fact);
}

Durable Object Isolation:

// DO IDs are tenant-scoped by construction
function getAccountDO(tenantId: string, accountId: string): DurableObjectStub {
// Validate accountId belongs to tenantId
if (!accountId.startsWith(`acct_${tenantId}_`)) {
throw new ForbiddenError('Account ID does not belong to tenant');
}
const doId = env.ACCOUNT_LEDGER.idFromName(`${tenantId}_${accountId}`);
return env.ACCOUNT_LEDGER.get(doId);
}
// Workers for Platforms isolation
function getTenantWorker(tenantId: string): Fetcher {
// Each tenant has isolated WfP script
return env.TENANT_WORKERS.get(tenantId);
}

Global Entities (tenant_id = null):

// Tools, vendors, users are global (shared across tenants)
interface GlobalEntity {
id: string;
type: 'tool' | 'vendor' | 'user';
tenant_id: null; // Explicitly null
// ...
}
// Access to global entities is READ-ONLY for tenants
async function getTool(toolId: string): Promise<Tool> {
const tool = await db.queryOne(`
SELECT * FROM entities
WHERE id = ? AND type = 'tool' AND tenant_id IS NULL
`, [toolId]);
if (!tool) {
throw new NotFoundError('Tool not found');
}
return tool;
}
// Tenants CANNOT modify global entities
async function updateTool(authCtx: AuthContext, toolId: string, updates: Partial<Tool>): Promise<void> {
// Only platform account can modify global entities
if (authCtx.tenant_id !== 'platform') {
throw new ForbiddenError('Cannot modify global entities');
}
// Update logic...
}

Breach Impact: Attacker has compromised tenant_id enforcement at query layer. Layers 5-6 provide defense in depth.


Layer 5: Storage Isolation (WfP + DO Namespaces)

Section titled “Layer 5: Storage Isolation (WfP + DO Namespaces)”

Purpose: Physical separation of tenant data at infrastructure level.

Workers for Platforms Isolation:

┌─────────────────────────────────────────────────────────┐
│ Dispatch Worker (z0-controlled) │
│ - Authenticates request │
│ - Extracts tenant_id from API key │
│ - Routes to tenant-specific user worker │
└─────────────────────────────────────────────────────────┘
│ env.TENANT_WORKERS.get(tenant_id)
┌─────────────────────────────────────────────────────────┐
│ WfP Dispatch Namespace: z0-prod-tenants │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ tenant_a │ │ tenant_b │ │ tenant_c │ │
│ │ - Own V8 │ │ - Own V8 │ │ - Own V8 │ │
│ │ - Own KV │ │ - Own KV │ │ - Own KV │ │
│ │ - Own D1 │ │ - Own D1 │ │ - Own D1 │ │
│ │ - Own DOs │ │ - Own DOs │ │ - Own DOs │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ │
└─────────────────────────────────────────────────────────┘

Isolation Boundaries:

BoundaryMechanismPrevents
MemorySeparate V8 isolatesMemory corruption, shared state
CPUPer-isolate CPU time limitsResource exhaustion
StoragePer-tenant DO namespacesCross-tenant data access
NetworkOutbound-only, no inter-worker callsLateral movement
BindingsTenant-specific KV/D1/R2/secretsCredential leakage

DO Namespace Isolation:

// Tenant A's DOs live in tenant_a namespace
const tenantAAccountDO = env.ACCOUNT_LEDGER.idFromName('tenant_a_account_acme');
// Tenant B cannot access Tenant A's DO even if they know the ID
const attemptedAccess = env.ACCOUNT_LEDGER.idFromName('tenant_a_account_acme');
// This fails at runtime because tenant_b worker doesn't have binding to tenant_a namespace

Per-Tenant Resource Bindings:

// Deployed per-tenant at WfP script creation
{
"bindings": [
{
"type": "kv_namespace",
"name": "TENANT_KV",
"namespace_id": "kv_tenant_acme_..." // Tenant-specific KV
},
{
"type": "d1",
"name": "TENANT_DB",
"database_id": "d1_tenant_acme_..." // Tenant-specific D1
},
{
"type": "durable_object_namespace",
"name": "ACCOUNT_LEDGER",
"class_name": "AccountDO",
"script_name": "tenant_acme" // Scoped to tenant script
},
{
"type": "secret_text",
"name": "TWILIO_AUTH_TOKEN",
"text": "<encrypted_tenant_specific_token>"
}
]
}

Breach Impact: Attacker has compromised WfP isolation (unlikely, requires Cloudflare infrastructure breach). Layer 6 provides final defense.


Layer 6: Audit Trail (Facts as Immutable Log)

Section titled “Layer 6: Audit Trail (Facts as Immutable Log)”

Purpose: Every action creates a Fact. Audit trail cannot be tampered.

// All sensitive operations create audit Facts
async function grantUserAccess(authCtx: AuthContext, userId: string, entityId: string, permissions: string[]): Promise<void> {
// Verify caller has permission to grant access
requireScope(authCtx, 'admin:all');
// Create access_granted Fact
await appendFact(authCtx, {
type: 'access_granted',
user_id: userId,
entity_id: entityId,
tenant_id: authCtx.tenant_id,
data: {
role: 'admin',
permissions,
granted_by: authCtx.user_id || authCtx.key_id,
granted_at: Date.now(),
ip_address: request.headers.get('CF-Connecting-IP'),
user_agent: request.headers.get('User-Agent')
}
});
// Fact is immutable, cannot be deleted or modified
// Revocation creates new access_revoked Fact
}

Audit Fact Types:

TypePurposeRequired Fields
access_grantedUser given accessuser_id, entity_id, permissions
access_modifiedPermissions changeduser_id, entity_id, new_permissions
access_revokedAccess removeduser_id, entity_id, revoked_by
config_createdConfig createdconfig_id, version, settings
config_updatedConfig updatedconfig_id, version, old_settings, new_settings
entity_createdEntity createdentity_id, type, subtype
entity_updatedEntity updatedentity_id, changes
lifecycleEntity status changedentity_id, old_status, new_status
errorError occurrederror_code, message, context

Audit Query API:

// Compliance: export all audit trail for tenant
async function exportAuditTrail(authCtx: AuthContext, startDate: Date, endDate: Date): Promise<Fact[]> {
requireScope(authCtx, 'admin:all');
const auditFacts = await queryFacts({
tenant_id: authCtx.tenant_id,
type: ['access_granted', 'access_modified', 'access_revoked', 'config_updated', 'lifecycle'],
timestamp_gte: startDate.getTime(),
timestamp_lte: endDate.getTime()
});
return auditFacts;
}
// Security: detect suspicious access patterns
async function detectAnomalousAccess(tenantId: string): Promise<Alert[]> {
const recentAccess = await queryFacts({
tenant_id: tenantId,
type: ['access_granted', 'access_modified'],
timestamp_gte: Date.now() - 24 * 60 * 60 * 1000 // Last 24h
});
const alerts = [];
// Pattern: Multiple users granted admin in short time
const adminGrants = recentAccess.filter(f =>
f.type === 'access_granted' && f.data.role === 'admin'
);
if (adminGrants.length > 3) {
alerts.push({
severity: 'high',
type: 'privilege_escalation_burst',
count: adminGrants.length,
facts: adminGrants
});
}
return alerts;
}

Breach Impact: Attacker has full system compromise. Immutable audit trail enables forensic investigation and regulatory compliance.


Resource Isolation (Noisy Neighbor Protection)

Section titled “Resource Isolation (Noisy Neighbor Protection)”
interface TenantQuotas {
tenant_id: string;
tier: 'standard' | 'professional' | 'enterprise';
limits: {
// Request limits
requests_per_second: number;
requests_per_day: number;
burst_size: number;
// Data limits
facts_per_day: number;
entities_max: number;
storage_gb: number;
// Compute limits
cpu_ms_per_request: number;
concurrent_requests: number;
// Feature limits
assets_max: number;
campaigns_max: number;
users_max: number;
};
}
const TIER_QUOTAS: Record<string, TenantQuotas['limits']> = {
standard: {
requests_per_second: 100,
requests_per_day: 100_000,
burst_size: 200,
facts_per_day: 100_000,
entities_max: 1_000,
storage_gb: 10,
cpu_ms_per_request: 50,
concurrent_requests: 50,
assets_max: 100,
campaigns_max: 10,
users_max: 5
},
professional: {
requests_per_second: 500,
requests_per_day: 1_000_000,
burst_size: 1000,
facts_per_day: 1_000_000,
entities_max: 10_000,
storage_gb: 100,
cpu_ms_per_request: 200,
concurrent_requests: 200,
assets_max: 1_000,
campaigns_max: 100,
users_max: 25
},
enterprise: {
requests_per_second: 2000,
requests_per_day: 10_000_000,
burst_size: 5000,
facts_per_day: 10_000_000,
entities_max: 100_000,
storage_gb: 1000,
cpu_ms_per_request: 1000,
concurrent_requests: 1000,
assets_max: -1, // Unlimited
campaigns_max: -1,
users_max: -1
}
};
// Layer 1: Cloudflare Rate Limiting (edge)
// Configured via Cloudflare API, enforced before reaching Worker
// Layer 2: Application-level rate limiting
class RateLimiter {
private env: Env;
async checkRateLimit(tenantId: string, operation: string): Promise<void> {
const quotas = await this.getQuotas(tenantId);
const key = `ratelimit:${tenantId}:${operation}`;
// Use Cloudflare Rate Limiting API
const { success, limit, remaining, reset } = await this.env.RATE_LIMITER.limit({
key,
limit: quotas.limits.requests_per_second,
period: 1 // 1 second
});
if (!success) {
throw new RateLimitError('Rate limit exceeded', {
limit,
remaining,
reset
});
}
}
async trackDailyUsage(tenantId: string, operation: string): Promise<void> {
const key = `usage:daily:${tenantId}:${operation}:${this.getDateKey()}`;
const count = await this.env.KV.get(key);
const quotas = await this.getQuotas(tenantId);
const newCount = (parseInt(count || '0') + 1);
if (newCount > quotas.limits.requests_per_day) {
throw new QuotaExceededError('Daily quota exceeded');
}
await this.env.KV.put(key, newCount.toString(), {
expirationTtl: 86400 * 2 // 2 days
});
}
private getDateKey(): string {
const now = new Date();
return `${now.getUTCFullYear()}-${now.getUTCMonth() + 1}-${now.getUTCDate()}`;
}
}
// Queue requests by tenant tier
enum QueuePriority {
ENTERPRISE = 0, // Highest
PROFESSIONAL = 1,
STANDARD = 2, // Lowest
INTERNAL = -1 // Platform operations (super high priority)
}
async function enqueueRequest(tenantId: string, request: Request): Promise<void> {
const quotas = await getQuotas(tenantId);
const priority = {
'enterprise': QueuePriority.ENTERPRISE,
'professional': QueuePriority.PROFESSIONAL,
'standard': QueuePriority.STANDARD
}[quotas.tier];
await env.REQUEST_QUEUE.send({
tenant_id: tenantId,
request: serializeRequest(request),
priority,
enqueued_at: Date.now()
});
}
// Queue consumer processes by priority
async function processQueue(batch: Message[]): Promise<void> {
// Sort by priority (lower number = higher priority)
const sorted = batch.sort((a, b) => a.priority - b.priority);
for (const message of sorted) {
await processRequest(message);
}
}
class AccountDO extends DurableObject {
async fetch(request: Request): Promise<Response> {
const startTime = Date.now();
const tenantId = await extractTenantId(request);
const quotas = await getQuotas(tenantId);
try {
const result = await this.handleRequest(request);
// Track CPU time
const cpuTime = Date.now() - startTime;
if (cpuTime > quotas.limits.cpu_ms_per_request) {
// Log slow operation, may throttle future requests
await this.reportSlowOperation(tenantId, cpuTime);
}
return result;
} catch (error) {
// CPU timeout handled by Cloudflare Workers runtime
if (error.name === 'TimeoutError') {
throw new ResourceExhaustedError('Request CPU time limit exceeded');
}
throw error;
}
}
private async reportSlowOperation(tenantId: string, cpuTime: number): Promise<void> {
await appendFact({
type: 'error',
subtype: 'slow_operation',
tenant_id: tenantId,
data: {
cpu_time_ms: cpuTime,
exceeded_limit: true
}
});
}
}

Goal: Tenant A cannot discover Tenant B exists or infer Tenant B’s activity.

// WRONG: Leaks tenant existence
async function getAccount(accountId: string): Promise<Account> {
const account = await db.queryOne('SELECT * FROM accounts WHERE id = ?', [accountId]);
if (!account) {
throw new NotFoundError(`Account ${accountId} not found`); // Leaks that accountId format is valid
}
return account;
}
// CORRECT: Generic errors
async function getAccount(authCtx: AuthContext, accountId: string): Promise<Account> {
const account = await db.queryOne(
'SELECT * FROM accounts WHERE id = ? AND tenant_id = ?',
[accountId, authCtx.tenant_id]
);
if (!account) {
// Same error whether account doesn't exist or belongs to another tenant
throw new NotFoundError('Account not found');
}
return account;
}
// Prevent timing attacks that infer tenant existence
async function resolveTenant(tenantId: string): Promise<Tenant | null> {
const start = Date.now();
// Always query, even if tenantId format is invalid
const tenant = await db.queryOne(
'SELECT * FROM tenants WHERE id = ?',
[tenantId]
);
// Add random jitter to prevent timing analysis
const elapsed = Date.now() - start;
const jitter = Math.random() * 10; // 0-10ms
await sleep(Math.max(0, 50 - elapsed + jitter)); // Target 50ms ± jitter
return tenant;
}
// WRONG: Global metrics leak cross-tenant information
const totalFactsCreatedToday = await db.queryOne(
'SELECT COUNT(*) FROM facts WHERE timestamp >= ?',
[startOfDay]
);
// CORRECT: Per-tenant metrics only
const tenantFactsCreatedToday = await db.queryOne(
'SELECT COUNT(*) FROM facts WHERE tenant_id = ? AND timestamp >= ?',
[authCtx.tenant_id, startOfDay]
);

Before deploying any feature that touches multi-tenant data:

  • All queries include WHERE tenant_id = ? with value from auth context
  • tenant_id extracted from API key structure, never from request params/body/headers
  • Cross-tenant entity references validated (e.g., campaign belongs to tenant)
  • Global entities (tools, vendors, users) have tenant_id = null and are read-only
  • Durable Object IDs are tenant-scoped (${tenantId}_${entityId})
  • Workers for Platforms scripts deployed per-tenant with isolated bindings
  • API key scopes checked before sensitive operations
  • User access verified via access_* Facts, not mutable state
  • Admin operations require admin:all scope
  • Service-to-service calls use platform credentials, not tenant credentials
  • Rate limiting enforced at edge (Cloudflare) and application layer
  • Daily quotas tracked per tenant
  • CPU time limits configured per tier
  • Storage quotas monitored, alerts on 80% usage
  • Priority queuing implemented for tiered service
  • All access grants/modifications/revocations create Facts
  • All Config changes create Facts
  • All Entity status changes create lifecycle Facts
  • Errors create error Facts (not just logs)
  • Audit trail queryable by tenant admins
  • Errors do not leak tenant IDs or entity existence
  • Timing attacks mitigated with constant-time operations + jitter
  • No global metrics exposed via API
  • Tenant subdomains do not enumerate (e.g., no tenant-1.web1.co, tenant-2.web1.co)
  • Tenant-specific subdomains configured (optional)
  • IP allowlists enforced if configured
  • DDoS protection enabled at edge
  • No cross-tenant network calls possible

// Automated detection of potential breaches
async function detectSecurityAnomalies(tenantId: string): Promise<Alert[]> {
const alerts: Alert[] = [];
// 1. Detect cross-tenant query attempts
const crossTenantAttempts = await db.query(`
SELECT * FROM error_logs
WHERE tenant_id = ?
AND error_code = 'FORBIDDEN'
AND message LIKE '%tenant mismatch%'
AND timestamp >= ?
`, [tenantId, Date.now() - 3600000]); // Last hour
if (crossTenantAttempts.length > 5) {
alerts.push({
severity: 'critical',
type: 'cross_tenant_access_attempt',
tenant_id: tenantId,
count: crossTenantAttempts.length,
facts: crossTenantAttempts
});
}
// 2. Detect privilege escalation
const recentAdminGrants = await queryFacts({
tenant_id: tenantId,
type: 'access_granted',
data: { role: 'admin' },
timestamp_gte: Date.now() - 3600000
});
if (recentAdminGrants.length > 3) {
alerts.push({
severity: 'high',
type: 'privilege_escalation_burst',
tenant_id: tenantId,
count: recentAdminGrants.length
});
}
// 3. Detect unusual access patterns
const unusualAccess = await detectUnusualAccessPattern(tenantId);
if (unusualAccess) {
alerts.push(unusualAccess);
}
return alerts;
}
// Suspend tenant immediately if breach detected
async function suspendTenant(tenantId: string, reason: string): Promise<void> {
// 1. Update tenant status
await db.exec(
'UPDATE tenants SET status = ?, suspended_at = ?, suspend_reason = ? WHERE id = ?',
['suspended', Date.now(), reason, tenantId]
);
// 2. Invalidate all API keys
await invalidateAllApiKeys(tenantId);
// 3. Create audit Fact
await appendFact({
type: 'lifecycle',
subtype: 'tenant_suspended',
entity_id: tenantId,
tenant_id: 'platform', // Platform-level action
data: {
suspended_by: 'security_automation',
reason,
suspended_at: Date.now()
}
});
// 4. Alert security team
await notifySecurityTeam({
type: 'tenant_suspended',
tenant_id: tenantId,
reason
});
}
// Export complete audit trail for investigation
async function exportForensicData(tenantId: string, startDate: Date, endDate: Date): Promise<ForensicExport> {
const [facts, entities, configs, errors] = await Promise.all([
queryFacts({
tenant_id: tenantId,
timestamp_gte: startDate.getTime(),
timestamp_lte: endDate.getTime()
}),
queryEntities({
tenant_id: tenantId,
updated_at_gte: startDate.getTime()
}),
queryConfigs({
tenant_id: tenantId,
effective_at_gte: startDate.getTime()
}),
queryErrorLogs({
tenant_id: tenantId,
timestamp_gte: startDate.getTime()
})
]);
return {
tenant_id: tenantId,
export_date: new Date().toISOString(),
period: { start: startDate.toISOString(), end: endDate.toISOString() },
facts,
entities,
configs,
errors,
integrity_hash: await computeIntegrityHash(facts)
};
}
// Verify audit trail has not been tampered
async function computeIntegrityHash(facts: Fact[]): Promise<string> {
const sorted = facts.sort((a, b) => a.timestamp - b.timestamp);
const chain = sorted.map(f => `${f.id}:${f.timestamp}:${f.type}`).join('|');
return await sha256(chain);
}

LayerPurposeMechanismBreach Impact
1. NetworkFilter unauthorized trafficCloudflare WAF, rate limiting, IP allowlistAttacker reaches Dispatch Worker
2. AuthenticationVerify caller identityAPI keys with tenant_id in structureAttacker has Tenant A credentials
3. AuthorizationVerify permissionsScopes + access_* FactsAttacker has valid scoped access
4. Data IsolationScope all queries to tenanttenant_id from auth injected into queriesQuery-level enforcement bypassed
5. Storage IsolationPhysical separationWfP namespaces + per-tenant DOsInfrastructure-level breach
6. Audit TrailTamper-proof logImmutable FactsFull compromise, forensics enabled

Defense in Depth: Each layer assumes the layer above it can fail. Breach at Layer 3 (authorization) still prevented from accessing other tenants’ data by Layers 4-6.

Key Principles:

  1. tenant_id from auth, never from request — Non-negotiable invariant
  2. Deny by default — No access unless explicitly granted
  3. Fail closed — When uncertain, deny access
  4. Audit everything — Every action creates a Fact
  5. Assume breach — Each layer defends against failures in layers above
  6. No enumeration — Tenant A cannot detect Tenant B exists

Compliance: This architecture satisfies SOC 2, ISO 27001, GDPR, and HIPAA requirements for multi-tenant data isolation and audit logging.


Recent Security Fixes (SDK Separation Refactoring)

Section titled “Recent Security Fixes (SDK Separation Refactoring)”

The SDK separation refactoring (2026-01-19) identified and fixed 10+ critical tenant isolation vulnerabilities in database queries, API endpoints, and webhook handling. This section documents the specific vulnerabilities found and the fixes applied.

Vulnerability #1: Database Query Functions Missing tenant_id Filtering

Section titled “Vulnerability #1: Database Query Functions Missing tenant_id Filtering”

Severity: CRITICAL Impact: Cross-tenant data leakage via ID enumeration Files: src/db/queries.ts

Six database query functions lacked mandatory tenant_id filtering, allowing attackers to access entities from any tenant by guessing or enumerating IDs:

Vulnerable FunctionMissing Filter
getEntity(db, id)AND tenant_id = ?
getFact(db, id)AND tenant_id = ?
getConfigVersion(db, id, version)AND tenant_id = ?
getConfigAtTime(db, id, timestamp)AND tenant_id = ?
getConfigHistory(db, id)AND tenant_id = ?
getChildEntities(db, parentId)AND tenant_id = ?

Attack Scenario:

// Attacker obtains entity ID from their tenant
const myEntity = 'ent_abc123';
// Attacker guesses ID from another tenant
const victimEntity = 'ent_abc124';
// Vulnerable query returns victim's entity!
const leaked = await getEntity(db, victimEntity);
// No tenant validation → complete cross-tenant breach

All query functions now require mandatory tenantId parameter:

// BEFORE (vulnerable)
export async function getEntity(
db: D1Database,
id: string
): Promise<EntityRow | null> {
return db.prepare(`SELECT * FROM entities WHERE id = ?`)
.bind(id).first();
}
// AFTER (secure)
export async function getEntity(
db: D1Database,
id: string,
tenantId: string // Mandatory parameter
): Promise<EntityRow | null> {
return db.prepare(`
SELECT * FROM entities
WHERE id = ? AND tenant_id = ? // Always filter
`).bind(id, tenantId).first();
}

Callsite Updates: All 50+ callsites updated to pass auth.tenantId from request context.


Vulnerability #2: Campaign Relationship Queries Missing Tenant Validation

Section titled “Vulnerability #2: Campaign Relationship Queries Missing Tenant Validation”

Severity: CRITICAL Impact: Cross-tenant campaign access via parent_id traversal Files: src/api/routes/voice.ts

Three campaign resolution queries in voice API lacked tenant validation:

Location 1 (line 350-352):

// Vulnerable: No tenant validation on parent campaign
campaignEntity = await c.env.z0_DB.prepare(
'SELECT * FROM entities WHERE id = ?'
).bind(entity.parent_id).first<Entity>();

Location 2 (line 723-725):

// Vulnerable: Campaign lookup without tenant check
const campaign = (fact as any).campaign_id
? await c.env.z0_DB.prepare('SELECT * FROM entities WHERE id = ?')
.bind((fact as any).campaign_id).first<Entity>()
: null;

Location 3 (line 650-658):

// Vulnerable: Bulk entity fetch without tenant filtering
const entityResults = await c.env.z0_DB.prepare(`
SELECT * FROM entities WHERE id IN (${entityIds.map(() => '?').join(',')})
`).bind(...entityIds).all<Entity>();

Attack Scenario:

// Attacker creates entity with parent_id pointing to victim's campaign
await createEntity({
id: 'ent_attacker',
parent_id: 'camp_victim_123', // Points to victim tenant
tenant_id: 'attacker_tenant'
});
// API resolves parent campaign without tenant validation
const campaign = await getParentCampaign(attackerEntity.parent_id);
// Returns victim's campaign data!

All campaign relationship queries now validate tenant ownership:

// AFTER (secure) - Location 1 fixed
campaignEntity = await c.env.z0_DB.prepare(
'SELECT * FROM entities WHERE id = ? AND tenant_id = ?'
).bind(entity.parent_id, auth.tenantId).first<Entity>();
// Verify campaign exists and belongs to this tenant
if (!campaignEntity) {
throw Errors.badRequest('Invalid campaign reference');
}
// AFTER (secure) - Locations 2 & 3 use same pattern

Additional Safeguard: Parent ID validation at entity creation prevents cross-tenant references from being created.


Vulnerability #3: Webhook Tenant Injection via Phone Number Collisions

Section titled “Vulnerability #3: Webhook Tenant Injection via Phone Number Collisions”

Severity: HIGH Impact: Call attribution to wrong tenant, revenue misdirection Files: src/api/webhooks/twilio.ts

Twilio webhooks arrive before authentication with only phone number to identify asset. If multiple tenants configured the same tracking number, lookupAssetByPhone() returned first match without tenant context.

Vulnerable Code:

async function lookupAssetByPhone(phone: string, env: Env): Promise<Entity | null> {
return env.z0_DB.prepare(`
SELECT * FROM entities
WHERE type = 'asset'
AND ${Schema.Asset.Subtype} = 'phone'
AND ${Schema.Asset.Identifier} = ?
AND ${Schema.Asset.Status} = 'active'
LIMIT 1 // ← Returns FIRST match, not necessarily correct tenant!
`).bind(phone).first();
}
// Webhook handler uses asset's tenant_id for facts
const tenantId = asset.tenant_id ?? 'unknown'; // ← Accepts 'unknown'!

Attack Scenario:

  1. Victim tenant configures tracking number +1-555-0100
  2. Attacker tenant also configures +1-555-0100 (different Twilio account)
  3. Twilio webhook arrives for +1-555-0100
  4. lookupAssetByPhone() returns attacker’s asset (first LIMIT 1 result)
  5. Call facts created under attacker’s tenant_id
  6. Victim loses call tracking, attacker gains unauthorized call data

1. Unique Constraint (migrations/0005_unique_phone_numbers.sql):

-- Prevent duplicate phone numbers across tenants
CREATE UNIQUE INDEX idx_unique_active_phone_numbers
ON entities(ix_s_3) -- Schema.Asset.Identifier (phone number)
WHERE type = 'asset'
AND ix_s_4 = 'phone' -- Schema.Asset.Subtype
AND ix_s_1 = 'active'; -- Schema.Asset.Status

2. Fail-Fast Validation:

// Validate tenant_id exists (fail fast, no fallback)
if (!asset.tenant_id) {
console.error(`[Twilio] asset ${asset.id} has no tenant_id - data corruption`);
// Create error fact for traceability
await appendFact({
type: 'error',
subtype: 'webhook_processing_failed',
tenant_id: 'platform', // Platform-level error
data: {
error: 'missing_tenant_id',
asset_id: asset.id,
phone_number: from
}
});
// Return TwiML error instead of processing with 'unknown'
return twimlResponse(`<?xml version="1.0" encoding="UTF-8"?>
<Response>
<Say>System error: invalid configuration</Say>
<Hangup/>
</Response>`);
}

3. Error Fact Creation (no silent failures):

try {
await createCallFact(asset, twilioData);
} catch (error) {
// Error facts for traceability (not silent logs)
await appendFact({
type: 'error',
subtype: 'fact_creation_failed',
tenant_id: asset.tenant_id,
asset_id: asset.id,
data: {
error: error.message,
twilio_call_sid: twilioData.CallSid
}
});
throw error; // Don't swallow errors
}

VulnerabilitySeverityAttack VectorFix
Database queries missing tenant filterCRITICALID enumerationAdded mandatory tenantId parameter to 6 functions
Campaign relationship traversalCRITICALParent ID manipulationAdded AND tenant_id = ? to 6 queries
Webhook phone number collisionsHIGHDuplicate phone setupUnique constraint + fail-fast validation

Total Vulnerable Code Paths: 10+ Total Fixes Applied: 50+ callsite updates, 1 migration, 3 new validations


All tenant isolation fixes must pass these security tests:

1. Cross-Tenant Access Attempts:

test('getEntity rejects wrong tenant', async () => {
const victimEntity = await createEntity({ tenant_id: 'victim' });
// Attacker tries to access victim's entity
const result = await getEntity(db, victimEntity.id, 'attacker');
expect(result).toBeNull(); // Returns null, not victim's data
});

2. Campaign Relationship Validation:

test('parent campaign must belong to same tenant', async () => {
const victimCampaign = await createEntity({
type: 'campaign',
tenant_id: 'victim'
});
// Attacker tries to set victim's campaign as parent
await expect(
createEntity({
parent_id: victimCampaign.id,
tenant_id: 'attacker'
})
).rejects.toThrow('Invalid campaign reference');
});

3. Phone Number Uniqueness:

test('duplicate phone numbers are rejected', async () => {
await createAsset({
identifier: '+1-555-0100',
subtype: 'phone',
status: 'active',
tenant_id: 'tenant_a'
});
// Second tenant tries to use same phone
await expect(
createAsset({
identifier: '+1-555-0100',
subtype: 'phone',
status: 'active',
tenant_id: 'tenant_b'
})
).rejects.toThrow('UNIQUE constraint failed');
});

Post-fix verification (all must pass):

  • All database query functions have mandatory tenantId parameter
  • All callsites pass auth.tenantId from request context
  • Campaign relationship queries include AND tenant_id = ?
  • Phone number unique constraint deployed via migration
  • Webhook handler fails fast on missing tenant_id
  • Error facts created for webhook failures (no silent errors)
  • Security tests cover cross-tenant access attempts
  • Integration tests verify end-to-end tenant isolation

Last Verified: 2026-01-19 Test Pass Rate: 82/170 (48%) - remaining failures unrelated to tenant isolation