R2 Object Storage
S3-compatible object storage with zero egress fees. Where z0 stores large blobs.
Prerequisites: PRINCIPLES.md, PRIMITIVES.md
Overview
R2 is Cloudflare’s S3-compatible object storage. Key characteristics:
| Property | Value |
|---|---|
| API Compatibility | S3 (most operations) |
| Egress Fees | Zero |
| Storage Class | Single (no tiers to manage) |
| Consistency | Strong read-after-write |
| Max Object Size | 5 TB |
| Multipart Upload | Yes (z0 requires it for objects > 100 MB) |
Key Insight: R2’s zero egress fees change the economics of object storage. Large payloads that would be expensive to serve from traditional cloud storage become cost-effective.
How z0 Uses R2
R2 serves three distinct purposes in z0:
| Use Case | What’s Stored | Access Pattern | Retention |
|---|---|---|---|
| Context Store | Call recordings, transcripts, documents | Read by reference from Facts | Days to months |
| Fact Archives | Cold storage for old Facts | Bulk export, compliance queries | Years |
| Backup/Export | Tenant data exports, system backups | On-demand retrieval | Per policy |
1. Context Store
Large payloads associated with invocations and outcomes.
Stored Objects:
- Call recordings (audio files from Twilio)
- Transcripts (text output from speech-to-text)
- Documents (uploaded files, generated reports)
- AI context (conversation history, embeddings)
Naming Convention:

    {tenant_id}/context/{entity_type}/{entity_id}/{object_type}/{timestamp}_{id}.{ext}

Examples:

    tenant_abc/context/invocation/inv_123/recording/2025-01-15_rec_456.mp3
    tenant_abc/context/invocation/inv_123/transcript/2025-01-15_tr_789.json
    tenant_abc/context/asset/asset_001/document/2025-01-15_doc_012.pdf

Fact Reference Pattern:
Facts don’t store large payloads inline. Instead, they reference R2 objects:

    Fact {
      id: "inv_123",
      type: "invocation",
      subtype: "inbound_call",
      data: {
        duration_seconds: 342,
        recording_url: "r2://tenant_abc/context/invocation/inv_123/recording/2025-01-15_rec_456.mp3",
        transcript_url: "r2://tenant_abc/context/invocation/inv_123/transcript/2025-01-15_tr_789.json"
      }
    }

Why not inline? Facts are immutable and replicated. Embedding large blobs would bloat the ledger, slow replication, and waste storage on data that rarely needs the same durability guarantees as economic Facts.
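The naming convention and the `r2://` reference pattern above can be sketched as a pair of small helpers. This is a minimal illustration, not z0's actual code: the names `buildContextKey`, `toR2Url`, and `parseR2Url` are hypothetical, and the segment validation is an added assumption.

```typescript
// Hypothetical helpers: only the key layout and the r2:// scheme come from this document.
interface ContextKeyParts {
  tenantId: string;
  entityType: string; // e.g. "invocation" or "asset"
  entityId: string;
  objectType: string; // e.g. "recording", "transcript", "document"
  date: string;       // "YYYY-MM-DD", as in the examples
  objectId: string;
  ext: string;
}

// Builds {tenant_id}/context/{entity_type}/{entity_id}/{object_type}/{timestamp}_{id}.{ext}
function buildContextKey(p: ContextKeyParts): string {
  const parts = [p.tenantId, p.entityType, p.entityId, p.objectType, p.date, p.objectId, p.ext];
  for (const part of parts) {
    // Reject empty segments and "/" so a value cannot escape its slot in the key.
    if (part.length === 0 || part.includes("/")) {
      throw new Error(`invalid key segment: "${part}"`);
    }
  }
  return `${p.tenantId}/context/${p.entityType}/${p.entityId}/${p.objectType}/${p.date}_${p.objectId}.${p.ext}`;
}

// Turns an object key into the r2:// reference stored in Fact.data.
function toR2Url(key: string): string {
  return `r2://${key}`;
}

// Recovers the object key and the owning tenant prefix from an r2:// reference.
function parseR2Url(url: string): { tenantId: string; key: string } {
  const scheme = "r2://";
  if (!url.startsWith(scheme)) throw new Error(`not an r2:// reference: ${url}`);
  const key = url.slice(scheme.length);
  const tenantId = key.split("/")[0];
  if (!tenantId) throw new Error(`missing tenant prefix: ${url}`);
  return { tenantId, key };
}
```

Keeping the tenant ID as the first path segment means a reader of a Fact can recover the owning tenant from the reference alone, which is what the access checks for presigned URLs rely on.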
2. Fact Archives
Cold storage for Facts that are past their hot query window.
When Facts Move to Archive:
- Facts older than retention window (default: 90 days)
- Tenant explicitly requests archival
- System migration/consolidation
Archive Format:

    {tenant_id}/archive/facts/{year}/{month}/{batch_id}.parquet

Example:

    tenant_abc/archive/facts/2024/06/batch_001.parquet

Why Parquet? It is a compressed, columnar format optimized for analytical queries, and tools like DuckDB can query it directly without extraction.
Archive Lifecycle:
- Facts accumulate in D1 (hot storage)
- Archival Workflow identifies Facts past retention window
- Facts exported to Parquet, uploaded to R2
- Lifecycle Fact written recording the archival (see below)
- Facts removed from D1 (ledger reference in DO remains)
Important: The lifecycle Fact must be written BEFORE removing Facts from D1. This ensures the archival operation is auditable even if the removal fails partway through.

    Fact {
      type: "lifecycle",
      subtype: "facts_archived",
      entity_id: "tenant_abc",
      data: {
        archive_url: "r2://tenant_abc/archive/facts/2024/06/batch_001.parquet",
        fact_count: 145678,
        date_range: { from: "2024-06-01", to: "2024-06-30" },
        checksum: "sha256:abc123..."
      }
    }

Invariant: Archived Facts are never deleted. The archive is append-only, just like the ledger itself.
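The ordering rule in the archive lifecycle (upload, then lifecycle Fact, then removal) can be sketched as follows. This is an illustrative outline under stated assumptions: `ArchiveDeps`, `LifecycleFact`, and `archiveBatch` are hypothetical names, and the three dependency functions stand in for the Parquet export plus R2 upload, the ledger write, and the D1 delete.

```typescript
// Hypothetical interfaces standing in for the Parquet export + R2, the ledger, and D1.
interface LifecycleFact {
  type: "lifecycle";
  subtype: "facts_archived";
  entity_id: string;
  data: { archive_url: string; fact_count: number };
}

interface ArchiveDeps {
  uploadArchive(key: string, factIds: string[]): Promise<void>;
  writeLifecycleFact(fact: LifecycleFact): Promise<void>;
  deleteFromHotStore(factIds: string[]): Promise<void>;
}

async function archiveBatch(
  tenantId: string,
  year: number,
  month: number,
  batchId: string,
  factIds: string[],
  deps: ArchiveDeps
): Promise<string> {
  const mm = month < 10 ? `0${month}` : String(month);
  const key = `${tenantId}/archive/facts/${year}/${mm}/${batchId}.parquet`;

  // 1. Make the archive durable in R2 first.
  await deps.uploadArchive(key, factIds);

  // 2. Record the lifecycle Fact BEFORE touching D1, so the operation is
  //    auditable even if the removal below fails partway through.
  await deps.writeLifecycleFact({
    type: "lifecycle",
    subtype: "facts_archived",
    entity_id: tenantId,
    data: { archive_url: `r2://${key}`, fact_count: factIds.length },
  });

  // 3. Only now remove the Facts from hot storage. A failure here propagates
  //    to the caller, but the audit record already exists.
  await deps.deleteFromHotStore(factIds);
  return key;
}
```

If step 3 throws, a retry can be driven from the lifecycle Fact, since it names the archive and the batch that still needs to be removed from D1.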
3. Backup/Export Storage
On-demand exports and system backups.
Export Types:
- Tenant data export (GDPR, offboarding)
- System backup snapshots
- Migration artifacts
Naming Convention:

    {tenant_id}/export/{export_type}/{timestamp}_{export_id}.{ext}
    _system/backup/{backup_type}/{timestamp}_{backup_id}.{ext}

Examples:

    tenant_abc/export/full/2025-01-15_exp_001.zip
    _system/backup/d1/2025-01-15_bak_001.sqlite

Presigned URLs
R2 supports presigned URLs for secure, time-limited access without exposing credentials.
Use Cases
| Scenario | URL Type | Expiry |
|---|---|---|
| Playback recording in UI | GET | 15 minutes |
| Download transcript | GET | 1 hour |
| Upload document | PUT | 15 minutes |
| Export download | GET | 24 hours |
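Implementation Pattern

    import { AwsClient } from "aws4fetch";

    // Worker generates a presigned URL. The R2 Workers binding has no URL-signing
    // method, so this signs a request against R2's S3-compatible endpoint instead.
    // Assumption: R2_ACCOUNT_ID, R2_BUCKET_NAME, R2_ACCESS_KEY_ID and
    // R2_SECRET_ACCESS_KEY are configured as Worker vars/secrets.
    async function getRecordingUrl(env: Env, tenant_id: string, recording_path: string): Promise<string> {
      // Verify caller has access to tenant
      // ...

      const client = new AwsClient({
        accessKeyId: env.R2_ACCESS_KEY_ID,
        secretAccessKey: env.R2_SECRET_ACCESS_KEY,
        service: "s3",
        region: "auto"
      });
      const url = new URL(`https://${env.R2_ACCOUNT_ID}.r2.cloudflarestorage.com/${env.R2_BUCKET_NAME}/${recording_path}`);
      url.searchParams.set("X-Amz-Expires", "900"); // 15 minutes

      const signed = await client.sign(new Request(url, { method: "GET" }), { aws: { signQuery: true } });
      return signed.url;
    }

    // Client receives presigned URL, fetches directly from R2
    // No traffic through Workers for large file downloads

Security Notes: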
- Never expose R2 bucket credentials to clients
- Always validate tenant access before generating presigned URLs
- Use shortest reasonable expiry times
- Log presigned URL generation (who, what, when)
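The use-case table and the security notes above can be combined into one authorization step that runs before any URL is signed. The sketch below is hypothetical: `authorizePresign` and its types are illustrative names, the purpose labels other than `recording_playback` are invented for the table rows, and the tenant check assumes every tenant object key starts with `{tenant_id}/`.

```typescript
// Purposes other than "recording_playback" are hypothetical labels for the table rows.
type PresignPurpose =
  | "recording_playback"
  | "transcript_download"
  | "document_upload"
  | "export_download";

interface PresignPolicy {
  method: "GET" | "PUT";
  expiresInSeconds: number;
}

// Expiry per scenario, taken from the Use Cases table.
const PRESIGN_POLICIES: Record<PresignPurpose, PresignPolicy> = {
  recording_playback: { method: "GET", expiresInSeconds: 15 * 60 },
  transcript_download: { method: "GET", expiresInSeconds: 60 * 60 },
  document_upload: { method: "PUT", expiresInSeconds: 15 * 60 },
  export_download: { method: "GET", expiresInSeconds: 24 * 60 * 60 },
};

interface PresignRequest {
  callerTenantId: string;
  userId: string;
  objectKey: string;
  purpose: PresignPurpose;
}

interface PresignGrant extends PresignPolicy {
  objectKey: string;
  audit: { who: string; what: string; when: string };
}

function authorizePresign(req: PresignRequest, now: Date = new Date()): PresignGrant {
  // Keys are tenant-prefixed, so the caller's tenant must own the first path segment.
  if (req.callerTenantId.length === 0 || !req.objectKey.startsWith(`${req.callerTenantId}/`)) {
    throw new Error("access denied: object is outside the caller's tenant prefix");
  }
  const policy = PRESIGN_POLICIES[req.purpose];
  return {
    method: policy.method,
    expiresInSeconds: policy.expiresInSeconds,
    objectKey: req.objectKey,
    // Who, what, when: enough to log the generation event.
    audit: { who: req.userId, what: req.objectKey, when: now.toISOString() },
  };
}
```

The returned grant carries the method and expiry to pass to whatever signs the URL, so callers cannot choose a longer expiry than the table allows.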
Presigned URL Facts
For auditability, presigned URL generation can be tracked:

    Fact {
      type: "invocation",
      subtype: "presigned_url_generated",
      tenant_id: "tenant_abc",
      user_id: "user_123",
      data: {
        object_path: "tenant_abc/context/invocation/inv_123/recording/...",
        action: "get",
        expires_in_seconds: 900,
        purpose: "recording_playback"
      }
    }

Lifecycle Policies
R2 supports object lifecycle rules for automatic management.
z0 Lifecycle Policies
| Bucket/Prefix | Rule | Action |
|---|---|---|
| `*/context/invocation/*` | Age > 90 days | Delete (unless archived) |
| `*/context/asset/*` | Age > 365 days | Delete (unless flagged) |
| `*/archive/*` | Never | No automatic deletion |
| `*/export/*` | Age > 30 days | Delete |
| `_system/backup/*` | Keep last 30 | Delete oldest beyond 30 |
Tenant Override
Tenants can request extended retention via billing Config:

    Config {
      type: "billing",
      applies_to: "tenant_abc",
      settings: {
        recording_retention_days: 365,  // Override default 90
        export_retention_days: 90       // Override default 30
      }
    }

Lifecycle Workflow:
- Lifecycle Worker runs daily
- Queries objects approaching expiry
- Checks for tenant retention overrides
- Checks for hold flags (legal, compliance)
- Deletes eligible objects
- Records lifecycle Fact
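The eligibility decision inside that workflow (policy lookup, tenant override, hold check) can be sketched as a pure function. This is an illustration only: the function and type names are hypothetical, the "unless archived" and "unless flagged" exceptions are folded into a single `onHold` flag, `recording_retention_days` is assumed to apply to the whole `context/invocation` prefix, and the count-based `_system/backup` rule is not modelled.

```typescript
interface RetentionOverrides {
  recording_retention_days?: number;
  export_retention_days?: number;
}

interface StoredObject {
  key: string;      // e.g. "tenant_abc/context/invocation/inv_123/recording/..."
  ageDays: number;
  onHold: boolean;  // legal or compliance hold, or otherwise flagged to keep
}

// Returns the age limit in days for a key, or null when no age-based rule applies.
function retentionDays(key: string, overrides: RetentionOverrides = {}): number | null {
  const [, area, sub] = key.split("/"); // {tenant_id}/{area}/{sub}/...
  if (area === "archive") return null;  // archives are never deleted automatically
  if (area === "export") return overrides.export_retention_days ?? 30;
  if (area === "context" && sub === "invocation") return overrides.recording_retention_days ?? 90;
  if (area === "context" && sub === "asset") return 365;
  return null; // unknown prefix: keep the object
}

function shouldDelete(obj: StoredObject, overrides: RetentionOverrides = {}): boolean {
  if (obj.onHold) return false;
  const limit = retentionDays(obj.key, overrides);
  return limit !== null && obj.ageDays > limit;
}
```

Defaulting unknown prefixes to "keep" is a deliberately conservative choice here, so a typo in a key layout cannot cause silent deletion.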

    Fact {
      type: "lifecycle",
      subtype: "objects_deleted",
      entity_id: "tenant_abc",
      data: {
        prefix: "tenant_abc/context/invocation/",
        objects_deleted: 1234,
        bytes_freed: 5678901234,
        policy: "default_90_day"
      }
    }

When to Use R2 vs D1 vs DO Storage
| Data Type | Storage | Rationale |
|---|---|---|
| Facts (hot) | DO ledger + D1 | Queryable, fast, need cross-entity joins |
| Facts (cold) | R2 archive | Bulk storage, rare access, compliance |
| Configs | DO + D1 | Versioned, queryable, frequently accessed |
| Entities | DO + D1 | Queryable, relationships matter |
| Cached State | DO memory/SQLite | Hot path, single-entity access |
| Recordings | R2 | Large blobs, served directly to clients |
| Transcripts | R2 | Large text, referenced from Facts |
| Documents | R2 | Variable size, user-uploaded content |
| Exports | R2 | Large, temporary, downloadable |
Decision Flowchart

    Is it a blob > 1 KB?
    ├── Yes → R2 (reference from Facts)
    └── No → Is it queryable across entities?
        ├── Yes → D1
        └── No → Is it per-entity state?
            ├── Yes → DO (ledger or cache)
            └── No → Probably doesn't need storage

Size Thresholds
| Threshold | Guidance |
|---|---|
| < 1 KB | Can be in Fact.data if truly needed |
| 1 KB - 100 KB | R2, consider inline for critical data |
| 100 KB - 100 MB | R2, standard upload |
| > 100 MB | R2, multipart upload required |
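The thresholds above can be expressed as a small placement function, with a helper that splits a large object into multipart ranges. This is a sketch under assumptions: the names are hypothetical, KB and MB are read as binary units, the 1 KB to 100 KB "consider inline" band is simply treated as R2, and the 10 MiB default part size is an arbitrary choice above R2's 5 MiB minimum for non-final parts.

```typescript
const KB = 1024;
const MB = 1024 * KB;
const MIN_PART_SIZE = 5 * MB; // R2 multipart parts other than the last must be at least 5 MiB

type Placement = "fact-inline" | "r2-single-upload" | "r2-multipart-upload";

function choosePlacement(sizeBytes: number): Placement {
  if (!Number.isFinite(sizeBytes) || sizeBytes < 0) {
    throw new RangeError(`invalid size: ${sizeBytes}`);
  }
  if (sizeBytes < 1 * KB) return "fact-inline";
  if (sizeBytes <= 100 * MB) return "r2-single-upload";
  return "r2-multipart-upload";
}

interface PartRange {
  partNumber: number; // multipart part numbers start at 1
  offset: number;
  length: number;
}

// Splits an object into equal-sized parts; only the last part may be shorter.
function planParts(sizeBytes: number, partSize: number = 10 * MB): PartRange[] {
  if (partSize < MIN_PART_SIZE) throw new RangeError("partSize is below the 5 MiB minimum");
  const parts: PartRange[] = [];
  for (let offset = 0, n = 1; offset < sizeBytes; offset += partSize, n++) {
    parts.push({ partNumber: n, offset, length: Math.min(partSize, sizeBytes - offset) });
  }
  return parts;
}
```

Each `PartRange` maps onto one part upload; the byte ranges can be read from the source with `slice(offset, offset + length)`.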
Access Patterns
From Workers

    // Bind R2 bucket in wrangler.toml
    // [[r2_buckets]]
    // binding = "CONTEXT_BUCKET"
    // bucket_name = "z0-context"

    export default {
      async fetch(request, env) {
        // Read object
        const object = await env.CONTEXT_BUCKET.get("tenant_abc/context/...");
        if (!object) return new Response("Not found", { status: 404 });

        // Stream response, copying stored HTTP metadata (Content-Type etc.) when present
        const headers = new Headers();
        object.writeHttpMetadata(headers);
        return new Response(object.body, { headers });
      }
    };

From Durable Objects
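
    import { DurableObject } from "cloudflare:workers";

    export class InvocationLedger extends DurableObject {
      // Assumption: tenant_id and id are set when the ledger is initialized (not shown).
      async storeRecording(recording: ArrayBuffer, metadata: RecordingMetadata) {
        // Assumption: the recording ID travels in metadata.recording_id;
        // the date prefix follows the naming convention, e.g. "2025-01-15".
        const timestamp = new Date().toISOString().slice(0, 10);
        const path = `${this.tenant_id}/context/invocation/${this.id}/recording/${timestamp}_${metadata.recording_id}.mp3`;

        await this.env.CONTEXT_BUCKET.put(path, recording, {
          httpMetadata: { contentType: "audio/mpeg" },
          customMetadata: {
            tenant_id: this.tenant_id,
            invocation_id: this.id,
            duration_seconds: metadata.duration.toString()
          }
        });

        return `r2://${path}`;
      }
    }

Observability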
Metrics to Track

    z0_r2_operations_total{operation, bucket, tenant_id}
    z0_r2_bytes_written_total{bucket, tenant_id}
    z0_r2_bytes_read_total{bucket, tenant_id}
    z0_r2_presigned_urls_generated_total{action, purpose, tenant_id}
    z0_r2_lifecycle_deletions_total{policy, tenant_id}

Alerts

| Condition | Severity | Action |
|---|---|---|
| Write failure rate > 1% | Critical | Page on-call |
| Storage growth > 20%/day | Warning | Review lifecycle policies |
| Presigned URL generation spike | Warning | Check for abuse |
| Archive job failure | Critical | Manual intervention required |
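The threshold-based rows of the alert table can be sketched as a single evaluation over one day of stats. The shape of `DailyR2Stats` and the function name are assumptions for illustration. The presigned URL spike rule is left out because the table gives no numeric threshold for it.

```typescript
interface DailyR2Stats {
  writesAttempted: number;
  writesFailed: number;
  storageBytesYesterday: number;
  storageBytesToday: number;
  archiveJobFailed: boolean;
}

interface Alert {
  condition: string;
  severity: "critical" | "warning";
  action: string;
}

function evaluateAlerts(s: DailyR2Stats): Alert[] {
  const alerts: Alert[] = [];

  // Write failure rate > 1%
  const failureRate = s.writesAttempted > 0 ? s.writesFailed / s.writesAttempted : 0;
  if (failureRate > 0.01) {
    alerts.push({ condition: "Write failure rate > 1%", severity: "critical", action: "Page on-call" });
  }

  // Storage growth > 20%/day (skipped when there is no baseline to compare against)
  const growth = s.storageBytesYesterday > 0
    ? (s.storageBytesToday - s.storageBytesYesterday) / s.storageBytesYesterday
    : 0;
  if (growth > 0.2) {
    alerts.push({ condition: "Storage growth > 20%/day", severity: "warning", action: "Review lifecycle policies" });
  }

  if (s.archiveJobFailed) {
    alerts.push({ condition: "Archive job failure", severity: "critical", action: "Manual intervention required" });
  }
  return alerts;
}
```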
Cost Model
R2 pricing (as of 2025):
| Component | Cost | Notes |
|---|---|---|
| Storage | $0.015/GB/month | First 10 GB free |
| Class A ops (write) | $4.50/million | PUT, POST, LIST |
| Class B ops (read) | $0.36/million | GET, HEAD |
| Egress | $0.00 | Zero, always |
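As a worked example of the table, a monthly estimate can be computed as below. The rates are copied from the table; the function and type names are hypothetical, and only the 10 GB storage allowance is modelled (any free allowance on operations is ignored).

```typescript
// Rates copied from the pricing table; all names are illustrative.
const R2_RATES = {
  storagePerGbMonth: 0.015,
  classAPerMillion: 4.5,
  classBPerMillion: 0.36,
  freeStorageGb: 10,
};

interface R2Usage {
  storageGb: number;
  classAOps: number;
  classBOps: number;
}

interface R2CostEstimate {
  storage: number;
  classA: number;
  classB: number;
  egress: number;
  total: number;
}

function estimateMonthlyCost(u: R2Usage, applyFreeStorage: boolean = true): R2CostEstimate {
  const freeGb = applyFreeStorage ? R2_RATES.freeStorageGb : 0;
  const billableGb = Math.max(0, u.storageGb - freeGb);
  const storage = billableGb * R2_RATES.storagePerGbMonth;
  const classA = (u.classAOps / 1e6) * R2_RATES.classAPerMillion;
  const classB = (u.classBOps / 1e6) * R2_RATES.classBPerMillion;
  const egress = 0; // zero regardless of bytes served
  return { storage, classA, classB, egress, total: storage + classA + classB + egress };
}
```

For 1000 GB stored, 50,000 Class A operations, and 500,000 Class B operations, this gives 990 × $0.015 = $14.85 for storage, $0.225 for Class A, and $0.18 for Class B, roughly $15.26 in total. Without the free storage allowance, storage alone is 1000 × $0.015 = $15.00.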
z0 Cost Allocation:
- Context storage costs allocated to tenant via metadata
- Archive storage allocated to tenant
- System backups allocated to platform
Cost Tracking Pattern:

    Fact {
      type: "cost",
      subtype: "storage",
      tool_id: "tool_r2",
      tenant_id: "tenant_abc",
      amount: 15.00,
      currency: "USD",
      data: {
        period: "2025-01",
        storage_gb: 1000,
        class_a_ops: 50000,
        class_b_ops: 500000
      }
    }

Summary

| Question | Answer |
|---|---|
| What is R2? | S3-compatible object storage, zero egress |
| What does z0 store in R2? | Recordings, transcripts, documents, archives, exports |
| How are objects referenced? | r2:// URLs in Fact.data fields |
| How do users access objects? | Presigned URLs (never direct credentials) |
| When are objects deleted? | Lifecycle policies, tenant-configurable retention |
| R2 vs D1 vs DO? | R2 for blobs, D1 for queries, DO for per-entity state |
R2 extends z0’s storage capabilities to handle large payloads without compromising the ledger model. Facts remain lean and queryable; large context lives in R2, referenced by URL.