D1 (Distributed SQLite)
How z0 uses Cloudflare D1 for queryable projections and cross-entity reporting.
Prerequisites: PRINCIPLES.md, PRIMITIVES.md, durable-objects.md
Overview
D1 is Cloudflare’s distributed SQLite database. In z0, D1 serves as a queryable projection of data whose source of truth lives in Durable Objects.
| Role | Durable Objects | D1 |
|---|---|---|
| Source of truth | Yes | No |
| Write path | Direct | Via replication |
| Query pattern | Single-entity by ID | Cross-entity, aggregations |
| Consistency | Strong | Eventually consistent |
| Use case | Ledger operations, cached state | Reporting, search, dashboards |
Key Insight: Durable Objects answer “what is the current state of entity X?” D1 answers “which entities match these criteria?” and “what happened across all entities?”
Architectural Role
Source of Truth: Durable Objects
Per Principle 7 (Derived State Is Disposable), D1 is not authoritative. The ledger in Durable Objects is the source of truth:
```
Write Path:
  API → Worker → Durable Object (ledger) → Queue → D1 (projection)
                        ↑
                  Source of truth

Read Path (single entity):
  API → Worker → Durable Object → Response

Read Path (cross-entity):
  API → Worker → D1 → Response
```

When D1 Data Is Lost
D1 data can be rebuilt from Durable Object ledgers. This is inconvenient but not catastrophic:
- Replay Facts from DO ledgers to D1
- Queries return stale data during rebuild
- No business data is lost (it lives in DOs)
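The rebuild can be sketched as a cursor-driven replay loop. This is an illustrative sketch, not z0's actual API: the `Fact` shape, `listFacts` pager, and `insertBatch` writer are hypothetical names standing in for the DO ledger reader and the D1 batch inserter.

```typescript
// Hypothetical Fact shape; z0's real Facts carry more fields.
interface Fact {
  id: string;
  type: string;
  timestamp: number;
}

// Replays facts from a DO ledger in order. Because the D1 side uses
// INSERT OR IGNORE keyed on fact id, re-running the replay is safe.
async function replayLedger(
  listFacts: (cursor?: string) => Promise<{ facts: Fact[]; cursor?: string }>,
  insertBatch: (facts: Fact[]) => Promise<void>,
): Promise<number> {
  let cursor: string | undefined;
  let total = 0;
  do {
    const page = await listFacts(cursor);
    if (page.facts.length > 0) {
      await insertBatch(page.facts);
    }
    total += page.facts.length;
    cursor = page.cursor; // undefined cursor signals the last page
  } while (cursor !== undefined);
  return total;
}
```

Because every page write is idempotent, a replay that crashes halfway can simply be restarted from the beginning.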
Replication Pattern
Facts flow from Durable Objects to D1 asynchronously:

```
DO Ledger Write
      │
      ▼
Queue (buffered)
      │
      ▼
D1 Batch Insert
      │
      ▼
Available for Query
```

Replication Guarantees
| Property | Guarantee |
|---|---|
| Ordering | Per-entity FIFO (facts from same DO arrive in order) |
| Delivery | At-least-once (idempotent writes required) |
| Latency | Typically < 1s, target < 10s |
| Durability | Fact is durable in DO before queue acknowledgment |
Idempotent Writes
D1 writes use INSERT OR IGNORE with fact_id as the primary key:

```sql
INSERT OR IGNORE INTO facts (id, type, timestamp, ...)
VALUES (?, ?, ?, ...);
```

Duplicate deliveries are safely ignored.
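The effect is easy to see in miniature. This in-memory sketch (not D1 code) models INSERT OR IGNORE as a keyed map: the first delivery inserts, every redelivery is a no-op, so at-least-once delivery degrades gracefully to exactly-once state.

```typescript
// Models INSERT OR IGNORE semantics: insertion keyed on the fact id.
function insertOrIgnore(
  table: Map<string, unknown>,
  id: string,
  row: unknown,
): boolean {
  if (table.has(id)) return false; // duplicate delivery: ignored
  table.set(id, row);
  return true; // first delivery: inserted
}
```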
Handling Replication Lag
D1 replicas are eventually consistent. For recent data:
| Need | Pattern |
|---|---|
| User’s own recent actions | Query DO directly |
| Dashboard showing “last 5 minutes” | Show staleness indicator |
| Financial reports | Wait for replication or query DO |
| Cross-entity search | Accept eventual consistency |
Pattern: Freshness check
```ts
// Check if D1 has caught up to a known Fact
async function ensureFresh(factId: string): Promise<boolean> {
  const inD1 = await d1
    .prepare('SELECT 1 FROM facts WHERE id = ?')
    .bind(factId)
    .first();
  return inD1 !== null;
}
```

Pattern: Staleness indicator
```ts
// Surface data freshness to users
interface QueryResult<T> {
  data: T;
  freshness: {
    source: 'D1' | 'DO';
    maxAge: number;  // ms since last known replication
    stale: boolean;  // true if maxAge > threshold
  };
}
```

When to fall back to DO queries:
- User just performed an action and expects to see it immediately
- Financial calculations where correctness outweighs latency
- Audit scenarios requiring absolute consistency
When to accept D1 staleness:
- Cross-entity search results (user expects eventual consistency)
- Historical reports (data is old anyway)
- Dashboards with explicit “updated X seconds ago” indicators
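The two lists above amount to a routing decision. A minimal sketch of that router, with an illustrative `ReadIntent` enumeration that is not part of z0's API:

```typescript
// Hypothetical read intents, mirroring the fallback lists above.
type ReadIntent =
  | 'own-recent-action'
  | 'financial'
  | 'audit'
  | 'search'
  | 'historical-report'
  | 'dashboard';

// Routes strong-consistency intents to the DO, everything else to D1.
function readSource(intent: ReadIntent): 'DO' | 'D1' {
  switch (intent) {
    case 'own-recent-action': // user expects read-your-writes
    case 'financial':         // correctness outweighs latency
    case 'audit':             // absolute consistency required
      return 'DO';
    default:                  // search, reports, dashboards tolerate staleness
      return 'D1';
  }
}
```

Keeping the routing in one function makes the consistency policy auditable instead of scattered across handlers.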
Read Replicas and Consistency
Global Read Replicas
D1 automatically replicates data to edge locations. Reads are served from the nearest replica.

```
Primary (Write) ──► Replica (US-East)
      │
      ├───────────► Replica (EU-West)
      │
      └───────────► Replica (Asia-Pacific)
```

Consistency Model
| Operation | Consistency |
|---|---|
| Write | Strongly consistent (primary) |
| Read (default) | Eventually consistent (nearest replica) |
| Read (with flag) | Strongly consistent (primary, higher latency) |
When to Use Strong Reads
```ts
// Default: eventually consistent, fast
const results = await db.prepare("SELECT ...").all();

// Strong consistency: use when correctness > latency
const strongResults = await db
  .prepare("SELECT ...")
  .all({ consistency: "strong" });
```

Use strong reads for:
- Financial reconciliation
- Billing calculations
- Audit queries
Use eventual consistency for:
- Dashboards
- Search
- Analytics
Schema Patterns
Facts Table
The core projection of the Fact ledger:

```sql
CREATE TABLE facts (
  id TEXT PRIMARY KEY,
  type TEXT NOT NULL,
  subtype TEXT,
  timestamp INTEGER NOT NULL,  -- Unix epoch ms

  tenant_id TEXT NOT NULL,

  -- Entity links (nullable)
  source_id TEXT,
  entity_id TEXT,
  asset_id TEXT,
  campaign_id TEXT,
  contact_id TEXT,
  deal_id TEXT,
  tool_id TEXT,
  user_id TEXT,

  -- Economics
  from_entity TEXT,
  to_entity TEXT,
  amount INTEGER,  -- Stored as cents/smallest unit
  currency TEXT,

  -- Audit
  config_id TEXT,
  config_version INTEGER,

  -- Payload (JSON)
  data TEXT,

  -- Indexing
  created_at INTEGER DEFAULT (strftime('%s', 'now') * 1000)
);

-- Query patterns
CREATE INDEX idx_facts_tenant_type ON facts(tenant_id, type);
CREATE INDEX idx_facts_tenant_timestamp ON facts(tenant_id, timestamp);
CREATE INDEX idx_facts_asset ON facts(asset_id) WHERE asset_id IS NOT NULL;
CREATE INDEX idx_facts_contact ON facts(contact_id) WHERE contact_id IS NOT NULL;
CREATE INDEX idx_facts_deal ON facts(deal_id) WHERE deal_id IS NOT NULL;
```

Entities Table
Projection of current Entity state:

```sql
CREATE TABLE entities (
  id TEXT PRIMARY KEY,
  type TEXT NOT NULL,
  subtype TEXT,

  name TEXT,
  identifier TEXT,

  tenant_id TEXT,
  parent_id TEXT,
  owner_id TEXT,

  status TEXT NOT NULL DEFAULT 'active',

  created_at INTEGER NOT NULL,
  updated_at INTEGER NOT NULL,

  metadata TEXT  -- JSON
);

CREATE INDEX idx_entities_tenant_type ON entities(tenant_id, type);
CREATE INDEX idx_entities_parent ON entities(parent_id) WHERE parent_id IS NOT NULL;
CREATE INDEX idx_entities_identifier ON entities(identifier) WHERE identifier IS NOT NULL;
```

Configs Table
Projection of Config versions (current and historical):

```sql
CREATE TABLE configs (
  id TEXT NOT NULL,
  version INTEGER NOT NULL,
  type TEXT NOT NULL,

  applies_to TEXT NOT NULL,
  scope TEXT NOT NULL,
  tenant_id TEXT,

  effective_at INTEGER NOT NULL,
  superseded_at INTEGER,  -- NULL = current

  settings TEXT NOT NULL,  -- JSON

  PRIMARY KEY (id, version)
);

CREATE INDEX idx_configs_current ON configs(applies_to, type) WHERE superseded_at IS NULL;
CREATE INDEX idx_configs_tenant ON configs(tenant_id, type);
```

Query Patterns
Cross-Entity Aggregations
```sql
-- Total charges by tenant for a time period
SELECT
  tenant_id,
  SUM(amount) AS total_cents,
  currency,
  COUNT(*) AS charge_count
FROM facts
WHERE type = 'charge'
  AND timestamp BETWEEN ? AND ?
GROUP BY tenant_id, currency;
```

Attribution Queries
```sql
-- All touches for a contact
SELECT
  f.id, f.type, f.subtype, f.timestamp,
  f.asset_id, e.name AS asset_name
FROM facts f
LEFT JOIN entities e ON f.asset_id = e.id
WHERE f.contact_id = ?
ORDER BY f.timestamp;
```

Config History
```sql
-- Full history of a pricing config
SELECT version, effective_at, superseded_at, settings
FROM configs
WHERE id = ?
ORDER BY version;

-- Config at a point in time
SELECT settings
FROM configs
WHERE id = ?
  AND effective_at <= ?
  AND (superseded_at IS NULL OR superseded_at > ?)
LIMIT 1;
```

Search
```sql
-- Find entities by identifier pattern
SELECT id, type, name, identifier
FROM entities
WHERE tenant_id = ?
  AND identifier LIKE ?
  AND status = 'active';
```

Constraints and Limits
Database Limits
| Limit | Value |
|---|---|
| Max database size | 10 GB |
| Max row size | 1 MB |
| Max query result | 10 MB |
| Max concurrent connections | 100 |
| Max rows per query | Unlimited (but paginate large results) |
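Pagination over the facts table is easiest with a keyset cursor rather than OFFSET, which rescans skipped rows. A sketch under two assumptions: the helper name `nextPageQuery` is ours, and D1's SQLite is recent enough for row-value comparisons (SQLite ≥ 3.15).

```typescript
// Builds a keyset-paginated query over facts, ordered by (timestamp, id).
// The (timestamp, id) pair is a stable cursor because id breaks ties.
function nextPageQuery(
  tenantId: string,
  after?: { timestamp: number; id: string },
  limit = 1000,
): { sql: string; params: (string | number)[] } {
  const sql = after
    ? `SELECT id, type, timestamp FROM facts
       WHERE tenant_id = ? AND (timestamp, id) > (?, ?)
       ORDER BY timestamp, id LIMIT ?`
    : `SELECT id, type, timestamp FROM facts
       WHERE tenant_id = ?
       ORDER BY timestamp, id LIMIT ?`;
  const params = after
    ? [tenantId, after.timestamp, after.id, limit]
    : [tenantId, limit];
  return { sql, params };
}
```

The last row of each page becomes the `after` cursor for the next call, so each page stays comfortably under the 10 MB result cap.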
Performance Characteristics
| Operation | Typical Latency |
|---|---|
| Point query (by PK) | 1-5 ms |
| Index scan | 5-50 ms |
| Full table scan | Avoid |
| Write (insert) | 5-20 ms |
| Batch write (100 rows) | 20-100 ms |
SQLite Considerations
D1 is SQLite under the hood. Respect SQLite patterns:
Do:
- Use INTEGER for timestamps (epoch ms)
- Store money as INTEGER (cents)
- Use JSON for flexible data fields
- Create targeted indexes
- Use EXPLAIN QUERY PLAN to verify index usage
Don’t:
- Use DECIMAL/NUMERIC for money (use INTEGER cents)
- Create indexes on JSON fields (extract to columns if needed)
- Rely on concurrent write performance (single writer)
- Use complex JOINs across large tables (denormalize)
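The integer-cents rule keeps money arithmetic exact and pushes formatting to the display boundary. A minimal sketch (helper names are ours, not z0's):

```typescript
// Sums amounts kept as INTEGER cents; rejects non-integer inputs so float
// drift can't sneak in upstream of the ledger.
function addCents(...amounts: number[]): number {
  return amounts.reduce((sum, a) => {
    if (!Number.isInteger(a)) throw new Error('amounts must be integer cents');
    return sum + a;
  }, 0);
}

// Converts cents to a display string only at the presentation boundary.
function formatCents(cents: number, currency = 'USD'): string {
  return new Intl.NumberFormat('en-US', { style: 'currency', currency })
    .format(cents / 100);
}
```

Contrast with floats: `0.1 + 0.2 !== 0.3` in IEEE 754, while `10 + 20 === 30` in cents, always.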
Multi-Tenant Isolation
Tenant Scoping
Every query must include tenant_id:
```sql
-- CORRECT: scoped to tenant
SELECT * FROM facts
WHERE tenant_id = ? AND type = 'charge';

-- WRONG: scans all tenants
SELECT * FROM facts WHERE type = 'charge';
```

Row-Level Security Pattern
For applications requiring strict isolation:

```ts
function scopedQuery(db: D1Database, tenantId: string, sql: string, params: any[]) {
  // Verify tenant_id appears in WHERE clause
  if (!sql.includes('tenant_id')) {
    throw new Error('Query must include tenant_id scope');
  }
  return db.prepare(sql).bind(tenantId, ...params).all();
}
```

Database Per Tenant
For strict isolation requirements, z0 supports database-per-tenant:
```
Shared D1 (default):
  └── All tenants in single database, scoped by tenant_id

Isolated D1 (optional):
  ├── d1_tenant_abc123
  ├── d1_tenant_def456
  └── d1_tenant_ghi789
```

Trade-offs:
- Shared: Simpler operations, cross-tenant queries possible
- Isolated: Stronger isolation, no cross-tenant query risk, higher operational overhead
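Under either layout, tenant-to-database resolution can live in one helper. A sketch with assumed names: the `DB` shared binding, the `d1_tenant_<id>` per-tenant binding convention from the tree above, and the `isolated` set are all illustrative, not z0's actual configuration.

```typescript
// Resolves which D1 binding serves a tenant: a dedicated database when the
// tenant is in the isolated set, otherwise the shared tenant_id-scoped one.
function resolveDatabase<T>(
  env: Record<string, T>,
  tenantId: string,
  isolated: Set<string>,
): T {
  return isolated.has(tenantId)
    ? env[`d1_tenant_${tenantId}`] // isolated: dedicated database binding
    : env['DB'];                   // shared: scope every query by tenant_id
}
```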
Backup and Recovery
Built-in Point-in-Time Recovery
D1 provides automatic backups with point-in-time recovery:
```sh
# Inspect available restore points (bookmarks)
wrangler d1 time-travel info <database>

# Restore to a point in time
wrangler d1 time-travel restore <database> --timestamp <iso-timestamp>
```

Export
Section titled “Export”# Export full databasewrangler d1 export <database> --output backup.sqlRecovery from DO Ledgers
If D1 is corrupted or needs a full rebuild:
- Create fresh D1 database
- Deploy replication worker pointing to new database
- Trigger replay from all DO ledgers
- Switch read traffic when caught up
This follows Principle 7: D1 is disposable, DOs are truth.
Observability
Replication Health
Monitor these metrics (see observability.md):
```
z0_replication_lag_ms          // Time from DO write to D1 availability
z0_replication_queue_depth     // Pending facts in queue
z0_replication_failures_total  // Failed D1 writes
```

Query Performance
Section titled “Query Performance”z0_d1_query_duration_ms{query_type}z0_d1_rows_scanned_totalz0_d1_rows_returned_totalAlert Thresholds
| Metric | Warning | Critical |
|---|---|---|
| Replication lag | > 10s | > 60s |
| Query latency p99 | > 500ms | > 2s |
| Queue depth | > 10,000 | > 100,000 |
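As a worked example for one row of the table, replication lag maps to severities like this (the `Severity` type is illustrative; thresholds mirror the table above):

```typescript
type Severity = 'ok' | 'warning' | 'critical';

// Classifies replication lag against the alert thresholds:
// > 60s critical, > 10s warning, otherwise ok.
function replicationLagSeverity(lagMs: number): Severity {
  if (lagMs > 60_000) return 'critical';
  if (lagMs > 10_000) return 'warning';
  return 'ok';
}
```

Checking critical before warning matters: the ranges overlap, so the stricter threshold must win.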
Summary
| Question | Answer |
|---|---|
| What is D1? | Distributed SQLite at the edge |
| Is D1 source of truth? | No. Durable Objects are. |
| What is D1 used for? | Cross-entity queries, reporting, search |
| Can D1 data be rebuilt? | Yes, from DO ledgers |
| What’s the consistency model? | Eventually consistent (strong reads available) |
| How does data get to D1? | Async replication via Queue |
D1 enables the query patterns that Durable Objects cannot efficiently support, while maintaining the architectural principle that the ledger (in DOs) is the source of truth.