Analytics Engine
High-cardinality time-series metrics for real-time dashboards and operational visibility.
Prerequisites: PRINCIPLES.md, PRIMITIVES.md, observability.md
Overview
Analytics Engine is Cloudflare’s high-cardinality time-series database. It handles millions of unique metric combinations (tenant_id × asset_id × campaign_id × fact_type) without the cardinality explosions that break traditional metrics systems.
z0 uses Analytics Engine for:
- Real-time dashboards (spend velocity, outcome rates)
- Economic metrics (revenue, cost, margin by tenant/asset/campaign)
- Operational metrics (latency percentiles, error rates, cache hit rates)
- Per-entity drill-down (why is this tenant’s error rate high?)
Key Insight: Facts are authoritative business events (Principle 2: immutable). Analytics Engine captures operational measurements about those events. You query Facts for “what happened,” you query Analytics Engine for “how fast, how often, how much.”
Data Model
Every data point in Analytics Engine has three components:
| Component | What It Holds | Limits |
|---|---|---|
| Indexes | Primary dimension for grouping (e.g., tenant_id) | 1 index, string |
| Blobs | Additional string dimensions for filtering | Up to 20 blobs |
| Doubles | Numeric values for aggregation | Up to 20 doubles |
z0 Schema
```js
// Writing a data point
env.METRICS.writeDataPoint({
  indexes: [tenant_id],
  blobs: [
    asset_id,     // blob1
    campaign_id,  // blob2
    fact_type,    // blob3: invocation, outcome, charge, cost
    subtype,      // blob4: call_qualified, deal_won, etc.
    tool_id,      // blob5
    error_code    // blob6: null for success
  ],
  doubles: [
    amount_cents, // double1: economic value
    duration_ms,  // double2: latency
    1             // double3: count (always 1 per event)
  ]
});
```
Dimension Strategy
| Dimension | Storage | Rationale |
|---|---|---|
| tenant_id | index | Primary grouping, always filtered |
| asset_id | blob1 | Per-asset drill-down |
| campaign_id | blob2 | Per-campaign drill-down |
| fact_type | blob3 | Filter by event category |
| subtype | blob4 | Filter by specific event |
| tool_id | blob5 | Tool-level cost analysis |
| error_code | blob6 | Error pattern analysis |
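The dimension strategy above can be sketched as a small helper that converts a z0 fact into the index/blob/double layout. This is an illustrative sketch: the Fact field names (tenant_id, amount_cents, and so on) are assumptions inferred from this document, not a platform API.

```typescript
// Hypothetical fact shape, inferred from the schema in this document.
interface Fact {
  tenant_id: string;
  asset_id: string;
  campaign_id: string;
  type: string;         // fact_type: invocation, outcome, charge, cost
  subtype: string;      // call_qualified, deal_won, etc.
  tool_id: string;
  error_code?: string;  // absent on success
  amount_cents?: number;
  duration_ms?: number;
}

// The argument shape accepted by env.METRICS.writeDataPoint.
interface DataPoint {
  indexes: string[];
  blobs: (string | null)[];
  doubles: number[];
}

// Map a fact onto the dimension layout from the table above.
function toDataPoint(fact: Fact): DataPoint {
  return {
    indexes: [fact.tenant_id],    // index1: primary grouping
    blobs: [
      fact.asset_id,              // blob1
      fact.campaign_id,           // blob2
      fact.type,                  // blob3
      fact.subtype,               // blob4
      fact.tool_id,               // blob5
      fact.error_code ?? null,    // blob6: null for success
    ],
    doubles: [
      fact.amount_cents ?? 0,     // double1: economic value
      fact.duration_ms ?? 0,      // double2: latency
      1,                          // double3: count (always 1 per event)
    ],
  };
}
```

Usage would then be a single call: env.METRICS.writeDataPoint(toDataPoint(fact)).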
How z0 Uses Analytics Engine
Real-Time Dashboards
Dashboards query Analytics Engine for time-series aggregations:
```sql
-- Spend velocity (last hour, per tenant)
SELECT
  intDiv(toUInt32(timestamp), 300) AS bucket,
  SUM(double1) / 100 AS spend_dollars
FROM METRICS
WHERE index1 = 'tenant_abc'
  AND blob3 = 'charge'
  AND timestamp > now() - INTERVAL 1 HOUR
GROUP BY bucket
ORDER BY bucket
```
Economic Metrics
Track revenue, cost, and margin at any granularity:
```sql
-- Daily margin by campaign
SELECT
  blob2 AS campaign_id,
  SUM(CASE WHEN blob3 = 'charge' THEN double1 ELSE 0 END) AS revenue_cents,
  SUM(CASE WHEN blob3 = 'cost' THEN double1 ELSE 0 END) AS cost_cents,
  revenue_cents - cost_cents AS margin_cents
FROM METRICS
WHERE index1 = 'tenant_abc'
  AND timestamp > now() - INTERVAL 1 DAY
GROUP BY campaign_id
```
Operational Metrics
Monitor system health and performance:
```sql
-- P50, P95, P99 latency by tool
SELECT
  blob5 AS tool_id,
  quantile(0.50)(double2) AS p50_ms,
  quantile(0.95)(double2) AS p95_ms,
  quantile(0.99)(double2) AS p99_ms,
  COUNT(*) AS invocation_count
FROM METRICS
WHERE blob3 = 'invocation'
  AND timestamp > now() - INTERVAL 1 HOUR
GROUP BY tool_id
```
Per-Tenant/Per-Entity Metrics
High-cardinality support means every tenant, asset, and campaign gets its own metrics without pre-aggregation:
```sql
-- Error rate by asset for a specific tenant
SELECT
  blob1 AS asset_id,
  SUM(IF(blob6 IS NOT NULL, 1, 0)) AS errors,
  COUNT(*) AS total,
  errors / total AS error_rate
FROM METRICS
WHERE index1 = 'tenant_abc'
  AND timestamp > now() - INTERVAL 1 HOUR
GROUP BY asset_id
HAVING total > 100
ORDER BY error_rate DESC
LIMIT 10
```
SQL API
Analytics Engine exposes a SQL API for querying. Queries are sent via HTTP to the Cloudflare API.
Query Endpoint
```sh
curl -X POST "https://api.cloudflare.com/client/v4/accounts/{account_id}/analytics_engine/sql" \
  -H "Authorization: Bearer {api_token}" \
  -d "SELECT ... FROM METRICS WHERE ..."
```
The SQL query is sent as the raw request body, not wrapped in JSON.
From Workers
```js
// Query from a Worker
// Note: the API has no bind parameters, so tenantId is interpolated directly
// into the SQL string -- only use trusted, validated identifiers here.
const query = `
  SELECT blob3 AS fact_type, SUM(double3) AS count
  FROM METRICS
  WHERE index1 = '${tenantId}'
    AND timestamp > now() - INTERVAL 1 HOUR
  GROUP BY fact_type
`;
const response = await fetch(
  `https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/analytics_engine/sql`,
  {
    method: 'POST',
    headers: { 'Authorization': `Bearer ${API_TOKEN}` },
    body: query
  }
);
```
Supported SQL Features
| Feature | Supported | Notes |
|---|---|---|
| SELECT, WHERE, GROUP BY | Yes | Standard aggregation queries |
| ORDER BY, LIMIT | Yes | Result ordering and pagination |
| Aggregations | Yes | SUM, COUNT, AVG, MIN, MAX, quantile |
| Time functions | Yes | now(), INTERVAL, toDateTime |
| Joins | No | Single table only |
| Subqueries | Limited | Simple subqueries only |
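Since joins are unsupported, a common workaround is to run two queries and merge the results in application code. A hedged sketch, where the row shapes and the idea of pulling tool names from D1 are illustrative assumptions:

```typescript
// Result row from an Analytics Engine latency query (assumed shape).
interface LatencyRow { tool_id: string; p99_ms: number }
// Result row from a separate lookup, e.g. tool names stored in D1 (assumed shape).
interface ToolRow { tool_id: string; name: string }

// Client-side "join": index one result set by key, then enrich the other.
function joinOnToolId(latencies: LatencyRow[], tools: ToolRow[]) {
  const names = new Map(tools.map(t => [t.tool_id, t.name] as [string, string]));
  // Fall back to the raw tool_id when no name is found.
  return latencies.map(l => ({ ...l, name: names.get(l.tool_id) ?? l.tool_id }));
}
```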
Retention and Sampling
Retention
| Tier | Retention Period |
|---|---|
| Standard | 90 days |
Data older than 90 days is automatically deleted. For longer retention, aggregate and store in D1.
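One way to do that rollup is a scheduled Worker that queries daily aggregates before they expire and batch-inserts them into D1. A sketch under assumptions: the daily_margin table, binding names, and row shape are all hypothetical. The pure transformation is shown as code; the Worker wiring is sketched in comments.

```typescript
// Aggregated row returned by a daily Analytics Engine query (assumed shape).
interface AeRow { campaign_id: string; revenue_cents: number; cost_cents: number }

// Turn one day of aggregates into D1 bind-parameter rows for long-term storage.
function toRollupParams(day: string, rows: AeRow[]): (string | number)[][] {
  return rows.map(r => [
    day,
    r.campaign_id,
    r.revenue_cents,
    r.cost_cents,
    r.revenue_cents - r.cost_cents, // margin, precomputed for cheap reads
  ]);
}

// In a scheduled Worker (hypothetical daily_margin table):
//   const stmt = env.DB.prepare(
//     'INSERT INTO daily_margin (day, campaign_id, revenue_cents, cost_cents, margin_cents) VALUES (?, ?, ?, ?, ?)'
//   );
//   await env.DB.batch(toRollupParams(day, rows).map(p => stmt.bind(...p)));
```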
Sampling
Analytics Engine may sample high-volume data points. The _sample_interval column indicates the sampling ratio:
```sql
-- Account for sampling in aggregations
SELECT SUM(double3 * _sample_interval) AS estimated_count
FROM METRICS
WHERE index1 = 'tenant_abc'
```
z0 guidance: For economic calculations (revenue, cost), always use Facts in D1 as the source of truth. Analytics Engine is for operational visibility, not financial reporting.
Why this matters (Principle 4 - Economics Must Close the Loop): Economic truth requires complete, immutable, auditable records. Analytics Engine may sample data, has 90-day retention, and lacks the append-only guarantees of the Fact ledger. You cannot close the economic loop (“was this worth it?”) from sampled metrics. Always derive financial reports from Facts.
When to Use What
| Question | Data Source | Why |
|---|---|---|
| What charges occurred? | Facts (D1) | Authoritative, immutable (Principle 2) |
| How fast are charges growing? | Analytics Engine | Time-series aggregation |
| What’s the p99 latency? | Analytics Engine | Operational metrics |
| Why did this charge fail? | Traces (Workers Trace Events) | Execution details |
| What’s the exact revenue? | Facts (D1) | Financial accuracy |
| What’s the spend velocity? | Analytics Engine | Real-time trends |
| Is this tenant healthy? | Analytics Engine | Operational health |
Analytics Engine vs Logs vs D1
| Characteristic | Analytics Engine | Workers Logs | D1 |
|---|---|---|---|
| Data type | Numeric metrics | Text events | Structured records |
| Cardinality | High (millions) | Medium | High |
| Retention | 90 days | 3 days | Permanent |
| Query latency | < 100ms | Seconds | < 100ms |
| Aggregation | Native (SUM, AVG, quantile) | None | SQL |
| Use case | Dashboards, alerts | Debugging | Reporting, audit |
Decision tree:
- Need authoritative business data? -> D1 (Facts)
- Need real-time aggregations? -> Analytics Engine
- Need execution debugging? -> Workers Trace Events
- Need text search in logs? -> Workers Logs
Writing Data Points
From Durable Objects
Write metrics when processing Facts:
```js
// In DO fact handler
async appendFact(fact) {
  // Write to ledger (authoritative)
  await this.sql.exec(
    'INSERT INTO facts ...',
    [fact.id, fact.type, ...]
  );

  // Write to Analytics Engine (operational)
  this.env.METRICS.writeDataPoint({
    indexes: [fact.tenant_id],
    blobs: [
      fact.asset_id,
      fact.campaign_id,
      fact.type,
      fact.subtype,
      fact.tool_id,
      null // no error
    ],
    doubles: [
      fact.amount || 0,
      fact.data?.duration_ms || 0,
      1
    ]
  });
}
```
Error Tracking
Write error metrics for observability:
```js
// On error
this.env.METRICS.writeDataPoint({
  indexes: [tenant_id],
  blobs: [asset_id, campaign_id, operation, null, tool_id, error_code],
  doubles: [0, duration_ms, 1]
});
```
Batching
writeDataPoint is non-blocking. Cloudflare batches writes automatically. No manual batching is needed.
Configuration
Wrangler Setup
```toml
[[analytics_engine_datasets]]
binding = "METRICS"
dataset = "z0_metrics"
```
Use separate datasets for different metric categories if needed:
```toml
[[analytics_engine_datasets]]
binding = "ECONOMIC_METRICS"
dataset = "z0_economic"

[[analytics_engine_datasets]]
binding = "OPERATIONAL_METRICS"
dataset = "z0_operational"
```
Limits
| Limit | Value |
|---|---|
| Indexes per data point | 1 |
| Blobs per data point | 20 |
| Doubles per data point | 20 |
| Blob size | 256 bytes each |
| Writes per Worker invocation | 25 |
| Query timeout | 10 seconds |
| Result rows | 10,000 max |
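A defensive sketch (not part of the platform API): validate a data point against these limits before writing, so violations surface in development rather than as silently truncated or dropped data.

```typescript
interface DataPointShape {
  indexes: string[];
  blobs: (string | null)[];
  doubles: number[];
}

// Check a data point against the documented limits; returns human-readable
// violations (an empty array means the point is within limits).
function validateDataPoint(p: DataPointShape): string[] {
  const errors: string[] = [];
  if (p.indexes.length !== 1) errors.push("exactly 1 index required");
  if (p.blobs.length > 20) errors.push("at most 20 blobs");
  if (p.doubles.length > 20) errors.push("at most 20 doubles");
  const enc = new TextEncoder();
  p.blobs.forEach((b, i) => {
    // Blob limits are byte sizes, so measure UTF-8 bytes, not characters.
    if (b !== null && enc.encode(b).length > 256) {
      errors.push(`blob${i + 1} exceeds 256 bytes`);
    }
  });
  return errors;
}
```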
Dashboard Integration
Update Frequency
| Metric Type | Refresh Interval | Source |
|---|---|---|
| Live budget | Real-time (WebSocket) | Durable Object |
| Spend velocity | 5 seconds | Analytics Engine |
| Error rates | 10 seconds | Analytics Engine |
| Outcome counts | 5 seconds | Analytics Engine |
| Historical trends | 1 minute | Analytics Engine |
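Client code rendering these charts often needs to align local timestamps to the same 5-minute buckets that the spend-velocity and sparkline queries compute with intDiv(toUInt32(timestamp), 300) * 300. A minimal equivalent:

```typescript
// Floor an epoch-seconds timestamp to its 5-minute bucket, matching the
// intDiv(toUInt32(timestamp), 300) * 300 expression in the SQL queries.
function bucket5m(epochSeconds: number): number {
  return Math.floor(epochSeconds / 300) * 300;
}
```

For example, bucket5m(Date.now() / 1000) gives the bucket that should appear as the rightmost point on a live sparkline.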
Query Patterns for Dashboards
```sql
-- Current hour summary
SELECT
  SUM(IF(blob3 = 'invocation', double3, 0)) AS invocations,
  SUM(IF(blob3 = 'outcome', double3, 0)) AS outcomes,
  SUM(IF(blob3 = 'charge', double1, 0)) / 100 AS revenue,
  SUM(IF(blob3 = 'cost', double1, 0)) / 100 AS cost
FROM METRICS
WHERE index1 = :tenant_id
  AND timestamp > now() - INTERVAL 1 HOUR

-- Time series for sparkline (5-minute buckets)
SELECT
  intDiv(toUInt32(timestamp), 300) * 300 AS bucket,
  SUM(double3) AS count
FROM METRICS
WHERE index1 = :tenant_id
  AND blob3 = 'invocation'
  AND timestamp > now() - INTERVAL 6 HOUR
GROUP BY bucket
ORDER BY bucket
```
Summary
| Aspect | Details |
|---|---|
| Purpose | High-cardinality operational metrics |
| Data model | 1 index + 20 blobs + 20 doubles |
| Query language | SQL via HTTP API |
| Retention | 90 days |
| Latency | < 100ms query, non-blocking writes |
| Use for | Dashboards, alerts, operational visibility |
| Don’t use for | Financial reporting, audit trails (use Facts) |
Analytics Engine answers “how is the system performing?” Facts answer “what happened?” You need both for complete observability.