Multi-Tenant Isolation Architecture
Complete defense-in-depth isolation model preventing cross-tenant data leakage, resource exhaustion, and observability interference.
Prerequisites: PRIMITIVES.md, PRINCIPLES.md, workers-for-platforms.md, durable-object-architecture.md
Threat Model
Attack Scenarios
| Threat | Impact | Likelihood |
|---|---|---|
| Data Leakage — Tenant A reads Tenant B’s Facts | Critical | Medium |
| Write Contamination — Tenant A writes to Tenant B’s ledger | Critical | Low |
| Resource Exhaustion — Tenant A depletes Tenant B’s quota | High | High |
| Enumeration — Tenant A discovers Tenant B exists | Medium | Medium |
| Timing Attack — Tenant A infers Tenant B’s activity from latency | Low | Low |
| Privilege Escalation — User becomes admin without authorization | Critical | Low |
| Audit Evasion — Actions occur without Facts | Critical | Low |
Threat Actors
| Actor | Capabilities | Motivation |
|---|---|---|
| Malicious Tenant | Valid API credentials, knowledge of API | Competitive intelligence, data theft |
| Compromised User | Stolen API key, limited scope | Opportunistic access, pivot to full tenant |
| Insider | Platform access, elevated privileges | Sabotage, exfiltration |
| External Attacker | No initial access, network probing | Credential theft, system disruption |
Isolation Layers
z0 implements 6 defense layers. Each layer assumes the layer above it can be breached.
Layer 1: Network Isolation
Purpose: Prevent unauthorized traffic from reaching the platform.

    Internet → Cloudflare CDN → tenant-specific subdomain → Dispatch Worker

Controls:
| Control | Mechanism | Enforcement |
|---|---|---|
| Tenant subdomain routing | {tenant_id}.web1.co → tenant namespace | DNS + WfP |
| IP allowlist (optional) | Per-tenant IP ranges | Cloudflare WAF |
| Rate limiting (edge) | Per-tenant quotas | Cloudflare Rate Limiting |
| DDoS protection | Adaptive thresholds | Cloudflare Magic Transit |
Configuration:

    # Per-tenant network policy
    NetworkPolicy:
      tenant_id: tenant_acme
      subdomain: acme.web1.co
      ip_allowlist:
        - 203.0.113.0/24  # Acme office network
      rate_limit:
        requests_per_second: 100
        burst: 200

Breach Impact: Attacker reaches Dispatch Worker but has no credentials.
Layer 2: Authentication
Purpose: Verify the caller’s identity before processing requests.

    // Dispatch Worker: Extract tenant from authentication
    async function authenticateRequest(request: Request): Promise<AuthContext> {
      const authHeader = request.headers.get('Authorization');
      if (!authHeader) {
        throw new UnauthorizedError('Missing authorization');
      }

      // API Key format: z0_live_tenant_acme_abc123...
      // Structure: z0_{env}_{tenant_id}_{secret}
      // tenant_id may itself contain underscores (tenant_acme), so a plain
      // split('_') would truncate it. This pattern assumes the secret
      // contains no underscores.
      const apiKey = authHeader.replace('Bearer ', '');
      const match = /^z0_(live|test)_(.+)_([^_]+)$/.exec(apiKey);
      if (!match) {
        throw new UnauthorizedError('Invalid API key format');
      }
      // Named `environment` so it does not shadow the `env` bindings object
      const [, environment, tenantId] = match;

      // Verify key against KV (hashed)
      const keyId = apiKey.substring(0, 32);
      const storedHash = await env.API_KEYS.get(`key:${keyId}`);
      const actualHash = await sha256(apiKey);

      if (storedHash !== actualHash) {
        throw new UnauthorizedError('Invalid API key');
      }

      // tenant_id extracted from authenticated key structure, NOT from request
      return {
        tenant_id: tenantId,
        environment,
        key_id: keyId,
        scopes: await loadScopes(tenantId, apiKey)
      };
    }

Critical Invariant:

    // tenant_id ALWAYS from auth token, NEVER from request params/body/headers
    function validateTenantId(requestTenantId: string | null, authTenantId: string): void {
      // If request specifies tenant_id, it MUST match auth
      if (requestTenantId && requestTenantId !== authTenantId) {
        throw new ForbiddenError('Tenant mismatch');
      }
    }

API Key Structure:
| Component | Example | Purpose |
|---|---|---|
| Prefix | z0 | Identify z0 keys (for secret scanning) |
| Environment | live, test | Prevent test keys in production |
| Tenant ID | tenant_acme | Bind key to specific tenant |
| Secret | abc123... (32+ chars) | Cryptographic entropy |
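The key verification in authenticateRequest compares the stored and computed hashes with `!==`, which may stop at the first differing character. A minimal sketch of a comparison whose running time does not depend on where the inputs differ, assuming both are fixed-length hex digests. `constantTimeEqual` is a hypothetical helper, not part of the z0 codebase.

```typescript
// Hypothetical helper (not from the z0 codebase): compares two strings
// without returning early at the first mismatching character.
function constantTimeEqual(a: string, b: string): boolean {
  // Digest length is not secret, so an early exit on length is acceptable.
  if (a.length !== b.length) {
    return false;
  }
  let diff = 0;
  for (let i = 0; i < a.length; i++) {
    // Accumulate differences instead of branching on them.
    diff |= a.charCodeAt(i) ^ b.charCodeAt(i);
  }
  return diff === 0;
}
```

A caller would still need to reject a missing stored hash (a null KV result) before comparing.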
Key Rotation:

    // Keys have expiration and rotation support
    interface ApiKeyMetadata {
      key_id: string;
      tenant_id: string;
      created_at: number;
      expires_at: number | null;
      scopes: string[];
      last_used_at: number;
      rotation_parent_id: string | null;  // For graceful rotation
    }

    // Rotation creates new key, old key valid for 7 days
    async function rotateApiKey(oldKeyId: string): Promise<string> {
      const oldKey = await getKeyMetadata(oldKeyId);
      const newKey = await generateApiKey(oldKey.tenant_id, oldKey.scopes);

      await setKeyMetadata(newKey.key_id, {
        ...newKey,
        rotation_parent_id: oldKeyId
      });

      await setKeyMetadata(oldKeyId, {
        ...oldKey,
        expires_at: Date.now() + 7 * 24 * 60 * 60 * 1000  // 7 days
      });

      return formatApiKey(newKey);
    }

Breach Impact: Attacker has valid credentials for Tenant A. Layers 3-6 prevent access to Tenant B.
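The 7-day grace window only takes effect if expires_at is checked when a key is presented. A minimal sketch of that check, assuming expires_at is in epoch milliseconds as in ApiKeyMetadata. `isKeyActive` and `KeyExpiry` are hypothetical names used for illustration.

```typescript
// Subset of the ApiKeyMetadata fields needed for the expiry check.
interface KeyExpiry {
  expires_at: number | null; // epoch milliseconds, or null for no expiry
}

// Hypothetical helper: true while the key may still authenticate.
function isKeyActive(meta: KeyExpiry, nowMs: number): boolean {
  return meta.expires_at === null || nowMs < meta.expires_at;
}

const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000;
const rotatedAt = 1_700_000_000_000; // arbitrary example timestamp

// After rotation the old key expires in 7 days; the new key has no expiry.
const oldKey: KeyExpiry = { expires_at: rotatedAt + SEVEN_DAYS_MS };
const newKey: KeyExpiry = { expires_at: null };
```

Both keys authenticate during the grace window; once it ends only the new key does.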
Layer 3: Authorization
Purpose: Ensure authenticated principal has permission for the requested action.
Authorization Model:

    interface AuthContext {
      tenant_id: string;       // From API key structure
      environment: string;     // From API key structure (live or test)
      key_id: string;          // Identifier of the authenticating key
      user_id?: string;        // Optional: user context
      scopes: string[];        // API key scopes
      role?: string;           // User role (if user context exists)
    }

    // Scopes define capability boundaries
    type Scope =
      | 'read:facts'       // Read Facts for this tenant
      | 'write:facts'      // Append Facts
      | 'read:configs'     // Read Configs
      | 'write:configs'    // Create/update Configs
      | 'read:entities'    // Read Entities
      | 'write:entities'   // Create/update Entities
      | 'read:billing'     // View invoices/statements
      | 'admin:all';       // Full tenant admin (dangerous)

    // Check scope before operations
    function requireScope(authCtx: AuthContext, requiredScope: Scope): void {
      if (!authCtx.scopes.includes(requiredScope) && !authCtx.scopes.includes('admin:all')) {
        throw new ForbiddenError(`Missing required scope: ${requiredScope}`);
      }
    }

User Access Control (via Facts):

    // User access is tracked via Facts, not mutable state
    async function checkUserAccess(authCtx: AuthContext, userId: string, entityId: string, permission: string): Promise<boolean> {
      // Query access_* Facts for this user+entity
      const accessFacts = await queryFacts({
        tenant_id: authCtx.tenant_id,  // Scoped to tenant
        type: ['access_granted', 'access_modified', 'access_revoked'],
        user_id: userId,
        entity_id: entityId
      });

      // Build current access state from Fact history
      const accessState = deriveAccessState(accessFacts);

      return accessState.permissions.includes(permission);
    }

    // Derive current access from Fact ledger
    function deriveAccessState(facts: Fact[]): AccessState {
      // Copy before sorting so the caller's array is not reordered
      const sorted = [...facts].sort((a, b) => a.timestamp - b.timestamp);
      let state: AccessState = { permissions: [], role: null };

      for (const fact of sorted) {
        switch (fact.type) {
          case 'access_granted':
            state.permissions = fact.data.permissions;
            state.role = fact.data.role;
            break;
          case 'access_modified':
            state.permissions = fact.data.permissions;
            state.role = fact.data.role;
            break;
          case 'access_revoked':
            state.permissions = [];
            state.role = null;
            break;
        }
      }

      return state;
    }

Access Fact Schema:

    // Grant access
    Fact {
      type: 'access_granted',
      user_id: 'user_jane',
      entity_id: 'account_acme',
      tenant_id: 'tenant_acme',
      data: {
        role: 'admin',
        permissions: ['read', 'write', 'delete'],
        granted_by: 'user_admin',
        reason: 'Onboarded as account admin'
      }
    }

    // Modify access
    Fact {
      type: 'access_modified',
      user_id: 'user_jane',
      entity_id: 'account_acme',
      tenant_id: 'tenant_acme',
      data: {
        role: 'viewer',
        permissions: ['read'],
        modified_by: 'user_admin',
        reason: 'Reduced to read-only access'
      }
    }

    // Revoke access
    Fact {
      type: 'access_revoked',
      user_id: 'user_jane',
      entity_id: 'account_acme',
      tenant_id: 'tenant_acme',
      data: {
        revoked_by: 'user_admin',
        reason: 'Employee terminated'
      }
    }

Breach Impact: Attacker has valid Tenant A credentials with scopes. Layers 4-6 prevent data contamination.
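To make the replay behaviour concrete, here is a small self-contained sketch of deriving access from a grant, a modification, and a revocation. The `AccessFact` and `AccessState` types are simplified stand-ins assumed for illustration (the real Fact type carries more fields), and `replayAccess` is a hypothetical name for the same idea as deriveAccessState.

```typescript
// Simplified stand-ins for the Fact and AccessState shapes (assumptions).
type AccessFactType = "access_granted" | "access_modified" | "access_revoked";

interface AccessFact {
  type: AccessFactType;
  timestamp: number;
  data: { role?: string; permissions?: string[] };
}

interface AccessState {
  permissions: string[];
  role: string | null;
}

// Replays access Facts in timestamp order; the latest Fact wins.
function replayAccess(facts: AccessFact[]): AccessState {
  // Copy before sorting so the input array keeps its order.
  const sorted = [...facts].sort((a, b) => a.timestamp - b.timestamp);
  let state: AccessState = { permissions: [], role: null };
  for (const fact of sorted) {
    if (fact.type === "access_revoked") {
      state = { permissions: [], role: null };
    } else {
      state = {
        permissions: fact.data.permissions ?? [],
        role: fact.data.role ?? null,
      };
    }
  }
  return state;
}

// Jane is granted admin, then reduced to viewer (stored out of order on purpose).
const history: AccessFact[] = [
  { type: "access_modified", timestamp: 2, data: { role: "viewer", permissions: ["read"] } },
  { type: "access_granted", timestamp: 1, data: { role: "admin", permissions: ["read", "write", "delete"] } },
];

const afterModify = replayAccess(history);
const afterRevoke = replayAccess([
  ...history,
  { type: "access_revoked", timestamp: 3, data: {} },
]);
```

Because state is derived rather than stored, revoking access never deletes anything: the earlier grant remains in the ledger for audit.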
Layer 4: Data Isolation (tenant_id Enforcement)
Purpose: Ensure every data operation is scoped to the authenticated tenant.
Critical Rule: tenant_id is denormalized on all Entities and Facts for O(1) scoping.

    // CORRECT: tenant_id from auth context
    async function queryFacts(authCtx: AuthContext, filters: Filters): Promise<Fact[]> {
      // tenant_id injected from auth, NOT from request
      const facts = await db.query(`
        SELECT * FROM facts
        WHERE tenant_id = ?
          AND type = ?
          AND timestamp >= ?
      `, [authCtx.tenant_id, filters.type, filters.since]);

      return facts;
    }

    // WRONG: tenant_id from request (can be tampered)
    async function queryFactsWrong(request: Request): Promise<Fact[]> {
      const { tenant_id, type, since } = await request.json();
      // NEVER DO THIS
      return db.query(`SELECT * FROM facts WHERE tenant_id = ?`, [tenant_id]);
    }

Append Fact Validation:

    async function appendFact(authCtx: AuthContext, fact: Fact): Promise<Fact> {
      // Force tenant_id from auth, ignore request value
      fact.tenant_id = authCtx.tenant_id;

      // Validate entity belongs to this tenant (if specified).
      // A missing entity is rejected the same way as a foreign one.
      if (fact.entity_id) {
        const entity = await getEntity(fact.entity_id);
        if (!entity || entity.tenant_id !== authCtx.tenant_id) {
          throw new ForbiddenError('Entity does not belong to this tenant');
        }
      }

      // Validate campaign belongs to this tenant
      if (fact.campaign_id) {
        const campaign = await getEntity(fact.campaign_id);
        if (!campaign || campaign.tenant_id !== authCtx.tenant_id) {
          throw new ForbiddenError('Campaign does not belong to this tenant');
        }
      }

      // Append to tenant-scoped DO
      const accountDO = getAccountDO(authCtx.tenant_id, fact.entity_id);
      return accountDO.appendFact(fact);
    }

Durable Object Isolation:

    // DO IDs are tenant-scoped by construction
    function getAccountDO(tenantId: string, accountId: string): DurableObjectStub {
      // Validate accountId belongs to tenantId
      if (!accountId.startsWith(`acct_${tenantId}_`)) {
        throw new ForbiddenError('Account ID does not belong to tenant');
      }

      const doId = env.ACCOUNT_LEDGER.idFromName(`${tenantId}_${accountId}`);
      return env.ACCOUNT_LEDGER.get(doId);
    }

    // Workers for Platforms isolation
    function getTenantWorker(tenantId: string): Fetcher {
      // Each tenant has isolated WfP script
      return env.TENANT_WORKERS.get(tenantId);
    }

Global Entities (tenant_id = null):

    // Tools, vendors, users are global (shared across tenants)
    interface GlobalEntity {
      id: string;
      type: 'tool' | 'vendor' | 'user';
      tenant_id: null;  // Explicitly null
      // ...
    }

    // Access to global entities is READ-ONLY for tenants
    async function getTool(toolId: string): Promise<Tool> {
      const tool = await db.queryOne(`
        SELECT * FROM entities
        WHERE id = ? AND type = 'tool' AND tenant_id IS NULL
      `, [toolId]);

      if (!tool) {
        throw new NotFoundError('Tool not found');
      }

      return tool;
    }

    // Tenants CANNOT modify global entities
    async function updateTool(authCtx: AuthContext, toolId: string, updates: Partial<Tool>): Promise<void> {
      // Only platform account can modify global entities
      if (authCtx.tenant_id !== 'platform') {
        throw new ForbiddenError('Cannot modify global entities');
      }

      // Update logic...
    }

Breach Impact: Attacker has compromised tenant_id enforcement at query layer. Layers 5-6 provide defense in depth.
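getAccountDO names each object by joining tenantId and accountId with an underscore. Tenant IDs such as tenant_acme contain underscores themselves, so an underscore-joined name is unambiguous only while the acct_ prefix check holds. A minimal sketch of a length-prefixed name that stays unambiguous on its own. `ledgerObjectName` and `naiveName` are hypothetical helpers for illustration, not the z0 implementation.

```typescript
// Naive joining, shown for comparison only.
function naiveName(tenantId: string, accountId: string): string {
  return `${tenantId}_${accountId}`;
}

// Hypothetical helper: the length prefix fixes where tenantId ends, so two
// different (tenantId, accountId) pairs can never produce the same name.
function ledgerObjectName(tenantId: string, accountId: string): string {
  if (tenantId.length === 0 || accountId.length === 0) {
    throw new Error("tenantId and accountId are required");
  }
  return `${tenantId.length}:${tenantId}:${accountId}`;
}

// Two different pairs that collide under naive joining.
const naiveA = naiveName("tenant_a", "b_ledger");
const naiveB = naiveName("tenant_a_b", "ledger");

const safeA = ledgerObjectName("tenant_a", "b_ledger");
const safeB = ledgerObjectName("tenant_a_b", "ledger");
```

The resulting string could be passed to idFromName in place of the underscore-joined name.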
Layer 5: Storage Isolation (WfP + DO Namespaces)
Purpose: Physical separation of tenant data at infrastructure level.
Workers for Platforms Isolation:

    ┌─────────────────────────────────────────────────────────┐
    │ Dispatch Worker (z0-controlled)                         │
    │ - Authenticates request                                 │
    │ - Extracts tenant_id from API key                       │
    │ - Routes to tenant-specific user worker                 │
    └─────────────────────────────────────────────────────────┘
                                │
                                │ env.TENANT_WORKERS.get(tenant_id)
                                ▼
    ┌─────────────────────────────────────────────────────────┐
    │ WfP Dispatch Namespace: z0-prod-tenants                 │
    │ ┌─────────────┐  ┌─────────────┐  ┌─────────────┐       │
    │ │ tenant_a    │  │ tenant_b    │  │ tenant_c    │       │
    │ │ - Own V8    │  │ - Own V8    │  │ - Own V8    │       │
    │ │ - Own KV    │  │ - Own KV    │  │ - Own KV    │       │
    │ │ - Own D1    │  │ - Own D1    │  │ - Own D1    │       │
    │ │ - Own DOs   │  │ - Own DOs   │  │ - Own DOs   │       │
    │ └─────────────┘  └─────────────┘  └─────────────┘       │
    └─────────────────────────────────────────────────────────┘

Isolation Boundaries:
| Boundary | Mechanism | Prevents |
|---|---|---|
| Memory | Separate V8 isolates | Memory corruption, shared state |
| CPU | Per-isolate CPU time limits | Resource exhaustion |
| Storage | Per-tenant DO namespaces | Cross-tenant data access |
| Network | Outbound-only, no inter-worker calls | Lateral movement |
| Bindings | Tenant-specific KV/D1/R2/secrets | Credential leakage |
DO Namespace Isolation:

    // Tenant A's DOs live in tenant_a namespace
    const tenantAAccountDO = env.ACCOUNT_LEDGER.idFromName('tenant_a_account_acme');

    // Tenant B cannot access Tenant A's DO even if they know the name
    const attemptedAccess = env.ACCOUNT_LEDGER.idFromName('tenant_a_account_acme');
    // Inside the tenant_b worker, ACCOUNT_LEDGER is bound to tenant_b's own
    // namespace, so the same name resolves to a different object there.
    // The tenant_b worker has no binding to the tenant_a namespace.

Per-Tenant Resource Bindings:

    // Deployed per-tenant at WfP script creation
    {
      "bindings": [
        {
          "type": "kv_namespace",
          "name": "TENANT_KV",
          "namespace_id": "kv_tenant_acme_..."  // Tenant-specific KV
        },
        {
          "type": "d1",
          "name": "TENANT_DB",
          "database_id": "d1_tenant_acme_..."  // Tenant-specific D1
        },
        {
          "type": "durable_object_namespace",
          "name": "ACCOUNT_LEDGER",
          "class_name": "AccountDO",
          "script_name": "tenant_acme"  // Scoped to tenant script
        },
        {
          "type": "secret_text",
          "name": "TWILIO_AUTH_TOKEN",
          "text": "<encrypted_tenant_specific_token>"
        }
      ]
    }

Breach Impact: Attacker has compromised WfP isolation (unlikely, requires Cloudflare infrastructure breach). Layer 6 provides final defense.
Layer 6: Audit Trail (Facts as Immutable Log)
Purpose: Every action creates a Fact. The audit trail cannot be tampered with.

    // All sensitive operations create audit Facts
    // (request is passed in so the caller's IP and user agent can be recorded)
    async function grantUserAccess(authCtx: AuthContext, request: Request, userId: string, entityId: string, permissions: string[]): Promise<void> {
      // Verify caller has permission to grant access
      requireScope(authCtx, 'admin:all');

      // Create access_granted Fact
      await appendFact(authCtx, {
        type: 'access_granted',
        user_id: userId,
        entity_id: entityId,
        tenant_id: authCtx.tenant_id,
        data: {
          role: 'admin',
          permissions,
          granted_by: authCtx.user_id || authCtx.key_id,
          granted_at: Date.now(),
          ip_address: request.headers.get('CF-Connecting-IP'),
          user_agent: request.headers.get('User-Agent')
        }
      });

      // Fact is immutable, cannot be deleted or modified
      // Revocation creates new access_revoked Fact
    }

Audit Fact Types:
| Type | Purpose | Required Fields |
|---|---|---|
| access_granted | User given access | user_id, entity_id, permissions |
| access_modified | Permissions changed | user_id, entity_id, new_permissions |
| access_revoked | Access removed | user_id, entity_id, revoked_by |
| config_created | Config created | config_id, version, settings |
| config_updated | Config updated | config_id, version, old_settings, new_settings |
| entity_created | Entity created | entity_id, type, subtype |
| entity_updated | Entity updated | entity_id, changes |
| lifecycle | Entity status changed | entity_id, old_status, new_status |
| error | Error occurred | error_code, message, context |
Audit Query API:

    // Compliance: export all audit trail for tenant
    async function exportAuditTrail(authCtx: AuthContext, startDate: Date, endDate: Date): Promise<Fact[]> {
      requireScope(authCtx, 'admin:all');

      const auditFacts = await queryFacts({
        tenant_id: authCtx.tenant_id,
        type: ['access_granted', 'access_modified', 'access_revoked', 'config_updated', 'lifecycle'],
        timestamp_gte: startDate.getTime(),
        timestamp_lte: endDate.getTime()
      });

      return auditFacts;
    }

    // Security: detect suspicious access patterns
    async function detectAnomalousAccess(tenantId: string): Promise<Alert[]> {
      const recentAccess = await queryFacts({
        tenant_id: tenantId,
        type: ['access_granted', 'access_modified'],
        timestamp_gte: Date.now() - 24 * 60 * 60 * 1000  // Last 24h
      });

      const alerts: Alert[] = [];

      // Pattern: Multiple users granted admin in short time
      const adminGrants = recentAccess.filter(f =>
        f.type === 'access_granted' && f.data.role === 'admin'
      );

      if (adminGrants.length > 3) {
        alerts.push({
          severity: 'high',
          type: 'privilege_escalation_burst',
          count: adminGrants.length,
          facts: adminGrants
        });
      }

      return alerts;
    }

Breach Impact: Attacker has full system compromise. Immutable audit trail enables forensic investigation and regulatory compliance.
Resource Isolation (Noisy Neighbor Protection)
Per-Tenant Quotas

    interface TenantQuotas {
      tenant_id: string;
      tier: 'standard' | 'professional' | 'enterprise';
      limits: {
        // Request limits
        requests_per_second: number;
        requests_per_day: number;
        burst_size: number;

        // Data limits
        facts_per_day: number;
        entities_max: number;
        storage_gb: number;

        // Compute limits
        cpu_ms_per_request: number;
        concurrent_requests: number;

        // Feature limits
        assets_max: number;
        campaigns_max: number;
        users_max: number;
      };
    }

    const TIER_QUOTAS: Record<string, TenantQuotas['limits']> = {
      standard: {
        requests_per_second: 100,
        requests_per_day: 100_000,
        burst_size: 200,
        facts_per_day: 100_000,
        entities_max: 1_000,
        storage_gb: 10,
        cpu_ms_per_request: 50,
        concurrent_requests: 50,
        assets_max: 100,
        campaigns_max: 10,
        users_max: 5
      },
      professional: {
        requests_per_second: 500,
        requests_per_day: 1_000_000,
        burst_size: 1000,
        facts_per_day: 1_000_000,
        entities_max: 10_000,
        storage_gb: 100,
        cpu_ms_per_request: 200,
        concurrent_requests: 200,
        assets_max: 1_000,
        campaigns_max: 100,
        users_max: 25
      },
      enterprise: {
        requests_per_second: 2000,
        requests_per_day: 10_000_000,
        burst_size: 5000,
        facts_per_day: 10_000_000,
        entities_max: 100_000,
        storage_gb: 1000,
        cpu_ms_per_request: 1000,
        concurrent_requests: 1000,
        assets_max: -1,  // Unlimited
        campaigns_max: -1,
        users_max: -1
      }
    };

Rate Limiting (Edge + Application)

    // Layer 1: Cloudflare Rate Limiting (edge)
    // Configured via Cloudflare API, enforced before reaching Worker

    // Layer 2: Application-level rate limiting
    class RateLimiter {
      private env: Env;

      constructor(env: Env) {
        this.env = env;
      }

      async checkRateLimit(tenantId: string, operation: string): Promise<void> {
        const quotas = await getQuotas(tenantId);
        const key = `ratelimit:${tenantId}:${operation}`;

        // Use the Workers Rate Limiting binding. limit() takes only a key; the
        // limit and period are fixed in the binding's configuration, so this
        // assumes one binding per tier (RATE_LIMITERS is an assumed lookup).
        const limiter = this.env.RATE_LIMITERS[quotas.tier];
        const { success } = await limiter.limit({ key });

        if (!success) {
          throw new RateLimitError('Rate limit exceeded', {
            limit: quotas.limits.requests_per_second
          });
        }
      }

      async trackDailyUsage(tenantId: string, operation: string): Promise<void> {
        const key = `usage:daily:${tenantId}:${operation}:${this.getDateKey()}`;
        const count = await this.env.KV.get(key);

        const quotas = await getQuotas(tenantId);
        const newCount = parseInt(count || '0', 10) + 1;

        if (newCount > quotas.limits.requests_per_day) {
          throw new QuotaExceededError('Daily quota exceeded');
        }

        // KV read-modify-write is not atomic and KV is eventually consistent,
        // so concurrent requests can undercount; treat this count as approximate.
        await this.env.KV.put(key, newCount.toString(), {
          expirationTtl: 86400 * 2  // 2 days
        });
      }

      private getDateKey(): string {
        const now = new Date();
        return `${now.getUTCFullYear()}-${now.getUTCMonth() + 1}-${now.getUTCDate()}`;
      }
    }

Priority Queuing by Tier

    // Queue requests by tenant tier
    enum QueuePriority {
      ENTERPRISE = 0,    // Highest tenant priority
      PROFESSIONAL = 1,
      STANDARD = 2,      // Lowest
      INTERNAL = -1      // Platform operations (sorts ahead of all tenants)
    }

    async function enqueueRequest(tenantId: string, request: Request): Promise<void> {
      const quotas = await getQuotas(tenantId);
      const priority = {
        'enterprise': QueuePriority.ENTERPRISE,
        'professional': QueuePriority.PROFESSIONAL,
        'standard': QueuePriority.STANDARD
      }[quotas.tier];

      await env.REQUEST_QUEUE.send({
        tenant_id: tenantId,
        request: serializeRequest(request),
        priority,
        enqueued_at: Date.now()
      });
    }

    // Queue consumer processes by priority (ordering applies within one batch)
    async function processQueue(batch: Message[]): Promise<void> {
      // Sort by priority (lower number = higher priority)
      const sorted = batch.sort((a, b) => a.priority - b.priority);

      for (const message of sorted) {
        await processRequest(message);
      }
    }

CPU Time Limits (per DO, per request)

    class AccountDO extends DurableObject {
      async fetch(request: Request): Promise<Response> {
        const startTime = Date.now();
        const tenantId = await extractTenantId(request);
        const quotas = await getQuotas(tenantId);

        try {
          const result = await this.handleRequest(request);

          // Track elapsed time. Date.now() measures wall-clock time, including
          // time spent waiting on I/O, so this only approximates CPU time.
          const elapsedMs = Date.now() - startTime;
          if (elapsedMs > quotas.limits.cpu_ms_per_request) {
            // Log slow operation, may throttle future requests
            await this.reportSlowOperation(tenantId, elapsedMs);
          }

          return result;
        } catch (error) {
          // CPU timeout handled by Cloudflare Workers runtime
          if (error.name === 'TimeoutError') {
            throw new ResourceExhaustedError('Request CPU time limit exceeded');
          }
          throw error;
        }
      }

      private async reportSlowOperation(tenantId: string, cpuTime: number): Promise<void> {
        await appendFact({
          type: 'error',
          subtype: 'slow_operation',
          tenant_id: tenantId,
          data: {
            cpu_time_ms: cpuTime,
            exceeded_limit: true
          }
        });
      }
    }

Enumeration Prevention
Goal: Tenant A cannot discover Tenant B exists or infer Tenant B’s activity.
No Tenant ID Leakage in Errors

    // WRONG: Leaks tenant existence
    async function getAccount(accountId: string): Promise<Account> {
      const account = await db.queryOne('SELECT * FROM accounts WHERE id = ?', [accountId]);
      if (!account) {
        throw new NotFoundError(`Account ${accountId} not found`);  // Leaks that accountId format is valid
      }
      return account;
    }

    // CORRECT: Generic errors
    async function getAccount(authCtx: AuthContext, accountId: string): Promise<Account> {
      const account = await db.queryOne(
        'SELECT * FROM accounts WHERE id = ? AND tenant_id = ?',
        [accountId, authCtx.tenant_id]
      );

      if (!account) {
        // Same error whether account doesn't exist or belongs to another tenant
        throw new NotFoundError('Account not found');
      }

      return account;
    }

Constant-Time Tenant Resolution

    // Prevent timing attacks that infer tenant existence
    async function resolveTenant(tenantId: string): Promise<Tenant | null> {
      const start = Date.now();

      // Always query, even if tenantId format is invalid
      const tenant = await db.queryOne(
        'SELECT * FROM tenants WHERE id = ?',
        [tenantId]
      );

      // Add random jitter to prevent timing analysis
      const elapsed = Date.now() - start;
      const jitter = Math.random() * 10;  // 0-10ms
      await sleep(Math.max(0, 50 - elapsed + jitter));  // Target 50ms plus jitter

      return tenant;
    }

No Global Metrics

    // WRONG: Global metrics leak cross-tenant information
    const totalFactsCreatedToday = await db.queryOne(
      'SELECT COUNT(*) FROM facts WHERE timestamp >= ?',
      [startOfDay]
    );

    // CORRECT: Per-tenant metrics only
    const tenantFactsCreatedToday = await db.queryOne(
      'SELECT COUNT(*) FROM facts WHERE tenant_id = ? AND timestamp >= ?',
      [authCtx.tenant_id, startOfDay]
    );

Validation Checklist
Before deploying any feature that touches multi-tenant data:
Data Isolation
- All queries include WHERE tenant_id = ? with value from auth context
- tenant_id extracted from API key structure, never from request params/body/headers
- Cross-tenant entity references validated (e.g., campaign belongs to tenant)
- Global entities (tools, vendors, users) have tenant_id = null and are read-only
- Durable Object IDs are tenant-scoped (${tenantId}_${entityId})
- Workers for Platforms scripts deployed per-tenant with isolated bindings
Authorization
- API key scopes checked before sensitive operations
- User access verified via access_* Facts, not mutable state
- Admin operations require admin:all scope
- Service-to-service calls use platform credentials, not tenant credentials
Resource Limits
- Rate limiting enforced at edge (Cloudflare) and application layer
- Daily quotas tracked per tenant
- CPU time limits configured per tier
- Storage quotas monitored, alerts on 80% usage
- Priority queuing implemented for tiered service
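The quota items above depend on comparisons that treat -1 as unlimited, as the enterprise tier in TIER_QUOTAS does. A minimal sketch of those comparisons. `withinLimit` and `storageAlertDue` are hypothetical helpers, and the 80% threshold is the one named in the checklist.

```typescript
// Hypothetical helper: decides whether one more unit of usage fits a limit.
// Follows the TIER_QUOTAS convention that -1 means unlimited.
function withinLimit(limit: number, used: number): boolean {
  if (limit === -1) {
    return true;
  }
  return used < limit;
}

// Hypothetical helper for the storage alert: true once usage reaches the
// given fraction of the quota. Unlimited quotas never alert.
function storageAlertDue(limitGb: number, usedGb: number, threshold = 0.8): boolean {
  if (limitGb === -1) {
    return false;
  }
  return usedGb >= limitGb * threshold;
}
```

Keeping the -1 convention in one place avoids a comparison such as `used < -1` silently denying every request for an unlimited tier.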
Audit Trail
- All access grants/modifications/revocations create Facts
- All Config changes create Facts
- All Entity status changes create lifecycle Facts
- Errors create error Facts (not just logs)
- Audit trail queryable by tenant admins
Enumeration Prevention
- Errors do not leak tenant IDs or entity existence
- Timing attacks mitigated with constant-time operations + jitter
- No global metrics exposed via API
- Tenant subdomains do not enumerate (e.g., no tenant-1.web1.co, tenant-2.web1.co)
Network Isolation
- Tenant-specific subdomains configured (optional)
- IP allowlists enforced if configured
- DDoS protection enabled at edge
- No cross-tenant network calls possible
Incident Response
Breach Detection

    // Automated detection of potential breaches
    async function detectSecurityAnomalies(tenantId: string): Promise<Alert[]> {
      const alerts: Alert[] = [];

      // 1. Detect cross-tenant query attempts
      const crossTenantAttempts = await db.query(`
        SELECT * FROM error_logs
        WHERE tenant_id = ?
          AND error_code = 'FORBIDDEN'
          AND message LIKE '%tenant mismatch%'
          AND timestamp >= ?
      `, [tenantId, Date.now() - 3600000]);  // Last hour

      if (crossTenantAttempts.length > 5) {
        alerts.push({
          severity: 'critical',
          type: 'cross_tenant_access_attempt',
          tenant_id: tenantId,
          count: crossTenantAttempts.length,
          facts: crossTenantAttempts
        });
      }

      // 2. Detect privilege escalation
      const recentAdminGrants = await queryFacts({
        tenant_id: tenantId,
        type: 'access_granted',
        data: { role: 'admin' },
        timestamp_gte: Date.now() - 3600000
      });

      if (recentAdminGrants.length > 3) {
        alerts.push({
          severity: 'high',
          type: 'privilege_escalation_burst',
          tenant_id: tenantId,
          count: recentAdminGrants.length
        });
      }

      // 3. Detect unusual access patterns
      const unusualAccess = await detectUnusualAccessPattern(tenantId);
      if (unusualAccess) {
        alerts.push(unusualAccess);
      }

      return alerts;
    }

Breach Containment

    // Suspend tenant immediately if breach detected
    async function suspendTenant(tenantId: string, reason: string): Promise<void> {
      // 1. Update tenant status
      await db.exec(
        'UPDATE tenants SET status = ?, suspended_at = ?, suspend_reason = ? WHERE id = ?',
        ['suspended', Date.now(), reason, tenantId]
      );

      // 2. Invalidate all API keys
      await invalidateAllApiKeys(tenantId);

      // 3. Create audit Fact
      await appendFact({
        type: 'lifecycle',
        subtype: 'tenant_suspended',
        entity_id: tenantId,
        tenant_id: 'platform',  // Platform-level action
        data: {
          suspended_by: 'security_automation',
          reason,
          suspended_at: Date.now()
        }
      });

      // 4. Alert security team
      await notifySecurityTeam({
        type: 'tenant_suspended',
        tenant_id: tenantId,
        reason
      });
    }

Forensic Analysis

    // Export complete audit trail for investigation
    async function exportForensicData(tenantId: string, startDate: Date, endDate: Date): Promise<ForensicExport> {
      const [facts, entities, configs, errors] = await Promise.all([
        queryFacts({ tenant_id: tenantId, timestamp_gte: startDate.getTime(), timestamp_lte: endDate.getTime() }),
        queryEntities({ tenant_id: tenantId, updated_at_gte: startDate.getTime() }),
        queryConfigs({ tenant_id: tenantId, effective_at_gte: startDate.getTime() }),
        queryErrorLogs({ tenant_id: tenantId, timestamp_gte: startDate.getTime() })
      ]);

      return {
        tenant_id: tenantId,
        export_date: new Date().toISOString(),
        period: { start: startDate.toISOString(), end: endDate.toISOString() },
        facts,
        entities,
        configs,
        errors,
        integrity_hash: await computeIntegrityHash(facts)
      };
    }

    // Verify audit trail has not been tampered with.
    // Note: the hash covers only id, timestamp, and type; changes to a
    // Fact's data payload are not detected by this check.
    async function computeIntegrityHash(facts: Fact[]): Promise<string> {
      // Copy before sorting so the exported array keeps its order
      const sorted = [...facts].sort((a, b) => a.timestamp - b.timestamp);
      const chain = sorted.map(f => `${f.id}:${f.timestamp}:${f.type}`).join('|');
      return await sha256(chain);
    }

Summary
| Layer | Purpose | Mechanism | Breach Impact |
|---|---|---|---|
| 1. Network | Filter unauthorized traffic | Cloudflare WAF, rate limiting, IP allowlist | Attacker reaches Dispatch Worker |
| 2. Authentication | Verify caller identity | API keys with tenant_id in structure | Attacker has Tenant A credentials |
| 3. Authorization | Verify permissions | Scopes + access_* Facts | Attacker has valid scoped access |
| 4. Data Isolation | Scope all queries to tenant | tenant_id from auth injected into queries | Query-level enforcement bypassed |
| 5. Storage Isolation | Physical separation | WfP namespaces + per-tenant DOs | Infrastructure-level breach |
| 6. Audit Trail | Tamper-proof log | Immutable Facts | Full compromise, forensics enabled |
Defense in Depth: Each layer assumes the layer above it can fail. An attacker who breaches Layer 3 (authorization) is still prevented from accessing other tenants’ data by Layers 4-6.
Key Principles:
- tenant_id from auth, never from request — Non-negotiable invariant
- Deny by default — No access unless explicitly granted
- Fail closed — When uncertain, deny access
- Audit everything — Every action creates a Fact
- Assume breach — Each layer defends against failures in layers above
- No enumeration — Tenant A cannot detect Tenant B exists
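The "deny by default" and "fail closed" principles can be illustrated with a small wrapper around any access check. This is a sketch under the assumption that checks are synchronous predicates; `failClosed` is a hypothetical helper, not part of the z0 codebase.

```typescript
// Hypothetical wrapper: access is allowed only when the check explicitly
// returns true. Any error during the check is treated as a denial.
function failClosed(check: () => boolean): boolean {
  try {
    return check() === true;
  } catch {
    return false;
  }
}

const allowed = failClosed(() => true);
const denied = failClosed(() => false);
const deniedOnError = failClosed(() => {
  throw new Error("scope lookup failed");
});
```

The same shape applies to asynchronous checks, with the wrapper awaiting the predicate inside the try block.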
Compliance: This architecture is designed to support SOC 2, ISO 27001, GDPR, and HIPAA requirements for multi-tenant data isolation and audit logging.
Recent Security Fixes (SDK Separation Refactoring)
The SDK separation refactoring (2026-01-19) identified and fixed 10+ critical tenant isolation vulnerabilities in database queries, API endpoints, and webhook handling. This section documents the specific vulnerabilities found and the fixes applied.
Vulnerability #1: Database Query Functions Missing tenant_id Filtering
Severity: CRITICAL
Impact: Cross-tenant data leakage via ID enumeration
Files: src/db/queries.ts
Problem
Six database query functions lacked mandatory tenant_id filtering, allowing attackers to access entities from any tenant by guessing or enumerating IDs:
| Vulnerable Function | Missing Filter |
|---|---|
| getEntity(db, id) | AND tenant_id = ? |
| getFact(db, id) | AND tenant_id = ? |
| getConfigVersion(db, id, version) | AND tenant_id = ? |
| getConfigAtTime(db, id, timestamp) | AND tenant_id = ? |
| getConfigHistory(db, id) | AND tenant_id = ? |
| getChildEntities(db, parentId) | AND tenant_id = ? |
Attack Scenario:

    // Attacker obtains entity ID from their tenant
    const myEntity = 'ent_abc123';

    // Attacker guesses ID from another tenant
    const victimEntity = 'ent_abc124';

    // Vulnerable query returns victim's entity!
    const leaked = await getEntity(db, victimEntity);
    // No tenant validation → complete cross-tenant breach

Fix Applied
All query functions now require a mandatory tenantId parameter:

    // BEFORE (vulnerable)
    export async function getEntity(
      db: D1Database,
      id: string
    ): Promise<EntityRow | null> {
      return db.prepare(`SELECT * FROM entities WHERE id = ?`)
        .bind(id).first();
    }

    // AFTER (secure)
    export async function getEntity(
      db: D1Database,
      id: string,
      tenantId: string  // Mandatory parameter
    ): Promise<EntityRow | null> {
      // Always filter by tenant_id (comment kept outside the SQL string,
      // where `//` would not be valid SQL)
      return db.prepare(`
        SELECT * FROM entities
        WHERE id = ? AND tenant_id = ?
      `).bind(id, tenantId).first();
    }

Callsite Updates: All 50+ callsites updated to pass auth.tenantId from request context.
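A mandatory parameter fixes each function, but every one of the 50+ call sites still has to pass the right value. One possible way to narrow that surface is to bind the tenant once per request and expose only scoped lookups. The sketch below uses an in-memory array instead of D1 so it runs on its own. `TenantScopedStore` and the reduced `EntityRow` shape are hypothetical, not part of src/db/queries.ts.

```typescript
// Hypothetical row shape; the real EntityRow has more columns.
interface EntityRow {
  id: string;
  tenant_id: string;
  name: string;
}

// Hypothetical wrapper: the tenant is fixed at construction, so individual
// lookups cannot omit or override the tenant filter.
class TenantScopedStore {
  private readonly rows: ReadonlyArray<EntityRow>;
  private readonly tenantId: string;

  constructor(rows: ReadonlyArray<EntityRow>, tenantId: string) {
    this.rows = rows;
    this.tenantId = tenantId;
  }

  // Mirrors `WHERE id = ? AND tenant_id = ?`.
  getEntity(id: string): EntityRow | null {
    const row = this.rows.find((r) => r.id === id && r.tenant_id === this.tenantId);
    return row ?? null;
  }
}

const rows: EntityRow[] = [
  { id: "ent_abc123", tenant_id: "tenant_a", name: "Attacker's entity" },
  { id: "ent_abc124", tenant_id: "tenant_b", name: "Victim's entity" },
];

// Replaying the attack scenario: tenant_a guesses tenant_b's entity ID.
const storeForTenantA = new TenantScopedStore(rows, "tenant_a");
const own = storeForTenantA.getEntity("ent_abc123");
const guessed = storeForTenantA.getEntity("ent_abc124");
```

The guessed ID resolves to null, the same result as an ID that does not exist, which also matches the generic-error guidance under Enumeration Prevention.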
Vulnerability #2: Campaign Relationship Queries Missing Tenant Validation
Severity: CRITICAL
Impact: Cross-tenant campaign access via parent_id traversal
Files: src/api/routes/voice.ts
Problem
Three campaign resolution queries in the voice API lacked tenant validation:
Location 1 (lines 350-352):
```typescript
// Vulnerable: No tenant validation on parent campaign
campaignEntity = await c.env.z0_DB.prepare(
  'SELECT * FROM entities WHERE id = ?'
).bind(entity.parent_id).first<Entity>();
```
Location 2 (lines 723-725):
```typescript
// Vulnerable: Campaign lookup without tenant check
const campaign = (fact as any).campaign_id
  ? await c.env.z0_DB.prepare('SELECT * FROM entities WHERE id = ?')
      .bind((fact as any).campaign_id).first<Entity>()
  : null;
```
Location 3 (lines 650-658):
```typescript
// Vulnerable: Bulk entity fetch without tenant filtering
const entityResults = await c.env.z0_DB.prepare(`
  SELECT * FROM entities
  WHERE id IN (${entityIds.map(() => '?').join(',')})
`).bind(...entityIds).all<Entity>();
```
Attack Scenario:
```typescript
// Attacker creates entity with parent_id pointing to victim's campaign
await createEntity({
  id: 'ent_attacker',
  parent_id: 'camp_victim_123', // Points to victim tenant
  tenant_id: 'attacker_tenant'
});

// API resolves parent campaign without tenant validation
const campaign = await getParentCampaign(attackerEntity.parent_id);
// Returns victim's campaign data!
```
Fix Applied
All campaign relationship queries now validate tenant ownership:
```typescript
// AFTER (secure) - Location 1 fixed
campaignEntity = await c.env.z0_DB.prepare(
  'SELECT * FROM entities WHERE id = ? AND tenant_id = ?'
).bind(entity.parent_id, auth.tenantId).first<Entity>();

// Verify campaign exists and belongs to this tenant
if (!campaignEntity) {
  throw Errors.badRequest('Invalid campaign reference');
}

// AFTER (secure) - Locations 2 & 3 use same pattern
```
Additional Safeguard: Parent ID validation at entity creation prevents cross-tenant references from being created.
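The fixed form of the Location 3 bulk lookup is not shown above. As a rough sketch of how the same pattern might apply to an IN clause, the helper below builds the SQL and its bound values together. The name buildTenantScopedBulkQuery and the BoundQuery type are hypothetical, introduced only for illustration.

```typescript
interface BoundQuery {
  sql: string;
  params: string[];
}

// Hypothetical helper: builds a bulk entity lookup that is always tenant-filtered.
// Returns null for an empty ID list, because "IN ()" is not valid SQL and the
// caller can simply skip the query.
function buildTenantScopedBulkQuery(
  entityIds: string[],
  tenantId: string
): BoundQuery | null {
  if (!tenantId) {
    throw new Error("tenantId is required");
  }
  if (entityIds.length === 0) {
    return null;
  }
  // One "?" placeholder per ID; values are bound, never interpolated.
  const placeholders = entityIds.map(() => "?").join(",");
  return {
    sql: `SELECT * FROM entities WHERE tenant_id = ? AND id IN (${placeholders})`,
    // The tenant ID is bound first to match placeholder order.
    params: [tenantId, ...entityIds],
  };
}
```

The result would be passed along as prepare(q.sql).bind(...q.params). D1 limits the number of bound parameters per query, so a long ID list may need to be split into chunks.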
Vulnerability #3: Webhook Tenant Injection via Phone Number Collisions
Severity: HIGH
Impact: Call attribution to wrong tenant, revenue misdirection
Files: src/api/webhooks/twilio.ts
Problem
Twilio webhooks arrive before authentication, with only a phone number to identify the asset. If multiple tenants configured the same tracking number, lookupAssetByPhone() returned the first match without tenant context.
Vulnerable Code:
```typescript
async function lookupAssetByPhone(phone: string, env: Env): Promise<Entity | null> {
  return env.z0_DB.prepare(`
    SELECT * FROM entities
    WHERE type = 'asset'
      AND ${Schema.Asset.Subtype} = 'phone'
      AND ${Schema.Asset.Identifier} = ?
      AND ${Schema.Asset.Status} = 'active'
    LIMIT 1 -- Returns FIRST match, not necessarily correct tenant!
  `).bind(phone).first();
}

// Webhook handler uses asset's tenant_id for facts
const tenantId = asset.tenant_id ?? 'unknown'; // ← Accepts 'unknown'!
```
Attack Scenario:
- Victim tenant configures tracking number +1-555-0100
- Attacker tenant also configures +1-555-0100 (different Twilio account)
- Twilio webhook arrives for +1-555-0100
- lookupAssetByPhone() returns attacker's asset (first LIMIT 1 result)
- Call facts created under attacker's tenant_id
- Victim loses call tracking, attacker gains unauthorized call data
Fix Applied
1. Unique Constraint (migrations/0005_unique_phone_numbers.sql):
```sql
-- Prevent duplicate phone numbers across tenants
CREATE UNIQUE INDEX idx_unique_active_phone_numbers
  ON entities(ix_s_3)        -- Schema.Asset.Identifier (phone number)
  WHERE type = 'asset'
    AND ix_s_4 = 'phone'     -- Schema.Asset.Subtype
    AND ix_s_1 = 'active';   -- Schema.Asset.Status
```
2. Fail-Fast Validation:
```typescript
// Validate tenant_id exists (fail fast, no fallback)
if (!asset.tenant_id) {
  console.error(`[Twilio] asset ${asset.id} has no tenant_id - data corruption`);

  // Create error fact for traceability
  await appendFact({
    type: 'error',
    subtype: 'webhook_processing_failed',
    tenant_id: 'platform', // Platform-level error
    data: {
      error: 'missing_tenant_id',
      asset_id: asset.id,
      phone_number: from
    }
  });

  // Return TwiML error instead of processing with 'unknown'
  return twimlResponse(`<?xml version="1.0" encoding="UTF-8"?>
    <Response>
      <Say>System error: invalid configuration</Say>
      <Hangup/>
    </Response>`);
}
```
3. Error Fact Creation (no silent failures):
```typescript
try {
  await createCallFact(asset, twilioData);
} catch (error) {
  // Error facts for traceability (not silent logs)
  await appendFact({
    type: 'error',
    subtype: 'fact_creation_failed',
    tenant_id: asset.tenant_id,
    asset_id: asset.id,
    data: {
      // A caught value is not guaranteed to be an Error instance
      error: error instanceof Error ? error.message : String(error),
      twilio_call_sid: twilioData.CallSid
    }
  });

  throw error; // Don't swallow errors
}
```
Vulnerability Summary
| Vulnerability | Severity | Attack Vector | Fix |
|---|---|---|---|
| Database queries missing tenant filter | CRITICAL | ID enumeration | Added mandatory tenantId parameter to 6 functions |
| Campaign relationship traversal | CRITICAL | Parent ID manipulation | Added AND tenant_id = ? to 3 queries |
| Webhook phone number collisions | HIGH | Duplicate phone setup | Unique constraint + fail-fast validation |
Total Vulnerable Code Paths: 10+
Total Fixes Applied: 50+ callsite updates, 1 migration, 3 new validations
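The fail-fast rule for webhook handling can also be isolated into a small guard that never falls back to a placeholder tenant. This is a minimal sketch under stated assumptions: requireTenantId, AssetLike, and MissingTenantError are hypothetical names, and rejecting the literal 'unknown' is an assumption made here to catch rows written by the old fallback.

```typescript
interface AssetLike {
  id: string;
  tenant_id?: string | null;
}

// Hypothetical error type so callers can distinguish this failure from others.
class MissingTenantError extends Error {
  readonly assetId: string;

  constructor(assetId: string) {
    super(`asset ${assetId} has no tenant_id`);
    this.name = "MissingTenantError";
    this.assetId = assetId;
  }
}

// Returns the asset's tenant ID or throws. There is deliberately no default value.
function requireTenantId(asset: AssetLike): string {
  const tenantId = asset.tenant_id?.trim();
  // Assumption: 'unknown' is treated as missing, since it was the old fallback value.
  if (!tenantId || tenantId === "unknown") {
    throw new MissingTenantError(asset.id);
  }
  return tenantId;
}
```

A webhook handler could catch MissingTenantError, record the platform-level error fact, and return the TwiML error response, keeping the fallback-free check in one place.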
Testing Requirements
All tenant isolation fixes must pass these security tests:
1. Cross-Tenant Access Attempts:
```typescript
test('getEntity rejects wrong tenant', async () => {
  const victimEntity = await createEntity({ tenant_id: 'victim' });

  // Attacker tries to access victim's entity
  const result = await getEntity(db, victimEntity.id, 'attacker');

  expect(result).toBeNull(); // Returns null, not victim's data
});
```
2. Campaign Relationship Validation:
```typescript
test('parent campaign must belong to same tenant', async () => {
  const victimCampaign = await createEntity({
    type: 'campaign',
    tenant_id: 'victim'
  });

  // Attacker tries to set victim's campaign as parent
  await expect(
    createEntity({
      parent_id: victimCampaign.id,
      tenant_id: 'attacker'
    })
  ).rejects.toThrow('Invalid campaign reference');
});
```
3. Phone Number Uniqueness:
```typescript
test('duplicate phone numbers are rejected', async () => {
  await createAsset({
    identifier: '+1-555-0100',
    subtype: 'phone',
    status: 'active',
    tenant_id: 'tenant_a'
  });

  // Second tenant tries to use same phone
  await expect(
    createAsset({
      identifier: '+1-555-0100',
      subtype: 'phone',
      status: 'active',
      tenant_id: 'tenant_b'
    })
  ).rejects.toThrow('UNIQUE constraint failed');
});
```
Verification Checklist
Post-fix verification (all must pass):
- All database query functions have mandatory tenantId parameter
- All callsites pass auth.tenantId from request context
- Campaign relationship queries include AND tenant_id = ?
- Phone number unique constraint deployed via migration
- Webhook handler fails fast on missing tenant_id
- Error facts created for webhook failures (no silent errors)
- Security tests cover cross-tenant access attempts
- Integration tests verify end-to-end tenant isolation
Last Verified: 2026-01-19
Test Pass Rate: 82/170 (48%) - remaining failures unrelated to tenant isolation
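Part of this checklist could be backed by an automated audit. The sketch below flags SQL strings that read from entities without a tenant_id predicate. It is a text heuristic, not a SQL parser, and findUnscopedQueries is a hypothetical helper that is not part of the codebase described here.

```typescript
// Hypothetical audit helper: returns the statements that select from `entities`
// but contain no `tenant_id = ?` predicate. It can miss or over-report unusual
// SQL, so it complements the security tests rather than replacing them.
function findUnscopedQueries(statements: string[]): string[] {
  return statements.filter((sql) => {
    // Collapse whitespace and lowercase so multi-line template SQL matches.
    const normalized = sql.replace(/\s+/g, " ").toLowerCase();
    const readsEntities = /\bfrom entities\b/.test(normalized);
    const hasTenantFilter = /\btenant_id\s*=\s*\?/.test(normalized);
    return readsEntities && !hasTenantFilter;
  });
}
```

Running it over the SQL strings collected from the query modules would surface any lookup that regressed to the pre-fix form.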