Architecture

The full picture — topology, request lifecycle, the data tiers, the consistency model, and the design decisions behind them.

This page is the map of how Lunora works and why it works that way. The first half is the architecture: the Cloudflare primitives, how a request flows through them, and the consistency guarantees you get. The second half is the design decisions — the forks in the road and why we took the branch we did.

If you only read one section, read The mental model. If you're evaluating Lunora against Convex or a hand-rolled Workers stack, jump to Design decisions.

Vocabulary

The rest of the page leans on these terms. Skim them once and the diagrams read themselves:

  • Shard — one Durable Object instance owning a slice of state. The default app has a single shard addressed as __root__; .shardBy(field) creates one shard per value of field.
  • Op-log (__cdc_log) — the append-only change log every shard keeps. Each committed write lands here in order; it's the source for deltas, pokes, and resume-after-reconnect.
  • Delta — a row change broadcast to subscribers of the legacy subscribe path (the DO re-runs the query and diffs).
  • Poke — the leaner local-first signal: a membership diff for a shape, computed from the op-log without re-running the query.
  • Shape — a named, parameterized partial replication of a table (a predicate + optional column projection) the client subscribes to by name.
  • Watermark — the last mutation id the server has applied for a given client; it makes optimistic writes idempotent and gap-detected.
  • Fan-out — a read that touches every shard of a .shardBy() table because it didn't pin one. Coordinated by the Worker; a cold path, not a hot one.
  • Hibernation — a DO with no traffic suspends, keeping its WebSocket subscriptions on the socket attachment. Idle subscribers cost storage, not CPU.

The mental model

A Lunora deployment is a single Worker that fronts a small set of Cloudflare primitives, wired together by @lunora/runtime. Each piece is single-purpose and replaceable:

                       Browser / framework client
                    (@lunora/client + react/vue/…)

                      HTTPS (RPC)  │  WebSocket (live)

            ┌────────────────────────────────────────────────┐
            │                  Worker (Edge)                  │
            │   @lunora/runtime · createWorker({ ... })       │
            │   RPC router · shard resolver · query coord.    │
            └──┬──────────┬──────────┬─────────┬─────────┬────┘
               │          │          │         │         │
        ┌──────┘    ┌─────┘     ┌────┘    ┌────┘    ┌────┘
        ▼           ▼           ▼         ▼         ▼
    ShardDO ×N  SchedulerDO     D1        R2      Queues
   (SQLite, WS) (alarms)    (.global())  (files) (background)
     Tier 1        async      Tier 2     Tier 3    async
        ▲                                             │
        │  ┌──────────────────────────────┐           │
        └──┤  Container DO ×N             │◄──────────┘
   RPC ────┤  (Docker · ffmpeg/Chrome/ML) │  actions call ctx.containers
  bridge   └──────────────────────────────┘

Three ideas carry the whole design:

  1. The Durable Object is the database. Non-global state lives in per-DO SQLite. Because a DO runs single-threaded and serializes its writes, you get a strong-consistency cut for free — no lock manager, no MVCC, no replication stream to tail.
  2. Real-time is a property of the store, not a bolt-on. The DO that wrote a row is also the one holding the WebSocket subscriptions for it, so a delta never crosses a process boundary to fan out.
  3. You scale by addressing, not rewriting. One DO by default; .shardBy() partitions across many; .global() promotes a table to D1. Your call sites don't change — codegen learns the new address.

The components

Component | Package | Responsibility
Worker | @lunora/runtime | Entry point. Parses the RPC envelope, authenticates, resolves the target shard, runs the query coordinator for fan-out, terminates WebSockets. Stateless — scales horizontally on Cloudflare's edge.
ShardDO | @lunora/do | The workhorse. SQLite-backed state, OCC, serialized mutations, hibernatable WebSocket subscriptions, and the op-log (__cdc_log) that powers live sync. One per shard.
SessionDO | @lunora/do | Per-session coordination state (presence, auth session, transient fan-in).
SchedulerDO | @lunora/scheduler | Alarm-driven runAfter / runAt and Cron Triggers.
Container DO | @lunora/container | A Docker workload running as a Durable Object — for jobs that don't fit the Workers runtime (ffmpeg, headless Chrome, an ML model, a long-running process). Reached from actions via ctx.containers.
D1 | @lunora/d1 | Backs .global() tables. Wraps the Sessions API (withSession(bookmark)) for read-your-writes across regional replicas.
R2 | @lunora/storage | Typed buckets and signed URLs; the browser uploads/downloads directly, never proxied through the Worker.
Queues | @lunora/queue | Fire-and-forget background work via a typed ctx.queues.<name> producer and a generated queue() consumer.
Client | @lunora/client + adapters | Browser SDK: multiplexed WebSocket, optimistic writes, offline queue, reconnect-by-bookmark. Framework adapters expose useQuery / useMutation / useSubscription.
Codegen | @lunora/codegen | Reads schema.ts + your functions, emits _generated/{api,server,dataModel}.ts so the address of every shard and the type of every result is known at build time.

Request lifecycle

The four request kinds (query, subscription, mutation, action) take four different paths through the topology.

A query (ctx.db.query(...))

The client calls api.messages.list. The Worker authenticates the RPC, asks the shard resolver for the DO that owns the data (the __root__ DO, or the one keyed by .shardBy()), and forwards the call. The DO runs the handler against its local SQLite — no network hop inside the read — applies the table's RLS read-where, and returns rows. If the query targets a .shardBy() table without pinning a shard, it becomes a fan-out: the Query Coordinator in the Worker dispatches to every shard and merges results.
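The routing choice described above can be sketched in a few lines. This is a minimal in-memory model, not the @lunora/runtime implementation: resolveShard, fanOut, and the Message type are hypothetical names, and shards are plain arrays standing in for Durable Objects.

```typescript
// Hypothetical sketch: resolveShard and fanOut are illustrative names,
// not the @lunora/runtime API. Shards are modeled as in-memory arrays.
type Message = { id: string; channelId: string; sentAt: number };

const ROOT_SHARD = "__root__";

// Returns the shard key for a call, or null when a sharded table is
// queried without a value for its shard field (an unpinned read).
function resolveShard(
  shardBy: string | undefined,
  args: Record<string, string | undefined>,
): string | null {
  if (shardBy === undefined) return ROOT_SHARD; // unsharded: single root shard
  return args[shardBy] ?? null;
}

// Fan-out: run the same predicate on every shard, then merge and sort
// in the coordinator.
function fanOut(
  shards: Map<string, Message[]>,
  where: (m: Message) => boolean,
): Message[] {
  const merged: Message[] = [];
  shards.forEach((rows) => {
    merged.push(...rows.filter(where));
  });
  return merged.sort((a, b) => a.sentAt - b.sentAt);
}

const shards = new Map<string, Message[]>([
  ["general", [{ id: "m1", channelId: "general", sentAt: 2 }]],
  ["random", [{ id: "m2", channelId: "random", sentAt: 1 }]],
]);

const pinned = resolveShard("channelId", { channelId: "general" });
const unpinned = resolveShard("channelId", {}); // no shard value supplied
const all = unpinned === null ? fanOut(shards, () => true) : [];
```

A pinned read touches one shard; the unpinned read visits every shard and pays for the merge, which is why the page treats fan-out as a cold path.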

A subscription (the same query, live)

useQuery opens one multiplexed WebSocket to the Worker, which terminates it against the resolving DO. The DO registers the subscription on the socket via state.serializeAttachment(...), so it survives hibernation. When a mutation later writes a matching row, the DO computes a delta and pokes only the sockets whose registered predicate matches. Idle subscribers pay for storage, not compute — which is why the bill drops to near-zero between bursts.

A mutation (ctx.db.insert/patch/...)

The Worker routes to the owning DO, which runs the handler inside blockConcurrencyWhile + a storage transaction. Writes are serialized — there is no concurrent writer to race, so there is no OCC-retry loop. Each write appends to the op-log (__cdc_log); on transaction commit the DO broadcasts deltas (legacy subscribe) and/or pokes (the local-first shape protocol) to its subscribers.
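To make the "one serialized commit" idea concrete, here is a toy shard that stages writes and commits rows and op-log entries together, or not at all. It is a sketch under stated assumptions: ShardSketch, Tx, and Op are hypothetical names, and an in-memory Map stands in for blockConcurrencyWhile plus a SQLite transaction.

```typescript
// Hypothetical in-memory stand-in for a shard; not the @lunora/do API.
type Op = { seq: number; id: string; kind: "insert" };
type Tx = { insert(id: string, row: Record<string, unknown>): void };

class ShardSketch {
  private rows = new Map<string, Record<string, unknown>>();
  private nextSeq = 1;
  readonly opLog: Op[] = [];

  // Runs the handler against staged copies. Rows and op-log entries
  // become visible together, and only if the handler did not throw.
  mutate(handler: (tx: Tx) => void): Op[] {
    const stagedRows = new Map(this.rows);
    const stagedOps: Op[] = [];
    let seq = this.nextSeq;
    handler({
      insert: (id, row) => {
        if (stagedRows.has(id)) throw new Error(`duplicate id ${id}`);
        stagedRows.set(id, row);
        stagedOps.push({ seq: seq++, id, kind: "insert" });
      },
    });
    // Commit point: swap in the staged state and append to the op-log.
    this.rows = stagedRows;
    this.opLog.push(...stagedOps);
    this.nextSeq = seq;
    return stagedOps;
  }

  rowCount(): number {
    return this.rows.size;
  }
}

const shard = new ShardSketch();
shard.mutate((tx) => {
  tx.insert("m1", { text: "hi" });
  tx.insert("m2", { text: "yo" });
});

// A handler that fails halfway leaves no trace: m3 is not kept.
let rolledBack = false;
try {
  shard.mutate((tx) => {
    tx.insert("m3", { text: "new" });
    tx.insert("m1", { text: "dup" });
  });
} catch {
  rolledBack = true;
}
```

Because the op-log is appended at the same commit point as the rows, anything derived from it (deltas, pokes, resume) sees exactly the committed order.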

An action (ctx.fetch, third-party I/O)

Actions are the non-deterministic escape hatch: they run in the Worker (not the DO), may call external services, and may not touch ctx.db directly. They reach data by calling queries/mutations, which keeps the deterministic core deterministic. Long work belongs here (and in the scheduler), not in a mutation bound by the 30 s CPU envelope.

A request, end to end

The four paths above are easier to hold in your head as one story. Follow a single message from keypress to every other client's screen, in a chat sharded .shardBy("channelId"):

  1. Optimistic write. Alice hits send. useMutation(api.messages.send) applies an optimistic row locally — it renders instantly — and pushes the mutation over the multiplexed WebSocket, tagged with her clientId + next clientSeq.
  2. Route. The Worker authenticates the envelope and asks the shard resolver for the DO owning channelId: "general". One DO, one destination.
  3. Linearize. That ShardDO runs send inside blockConcurrencyWhile + a storage transaction. It checks the per-client watermark (replays are acked without re-running), inserts the row, and appends to the op-log — all in one serialized commit. This commit is the ordering everyone else will see.
  4. Poke. On commit, the DO reads the new op page once and, for each subscribed shape, computes membership with a single … IN (<changedIds>) AND <where> query. Sockets whose predicate matches channelId: "general" get a pokeStart → pokePart → pokeEnd; the rest get nothing.
  5. Apply. Bob's client applies the poke atomically at pokeEnd; TanStack DB re-derives his live view. The new row appears.
  6. Reconcile. Alice's client receives the same poke carrying her lastMutationId; it drops the optimistic overlay and the synced row replaces it with no flicker. Her watermark advances.

Every hard guarantee on this page shows up in those six steps: one serialized cut (3), real-time without a process hop (4), and gap-detected idempotent client state (1, 3, 6).
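Step 5 depends on the client applying a poke only at pokeEnd. A minimal sketch of that buffering follows. Only the pokeStart, pokePart, and pokeEnd names come from this page; the pokeId, put, and del fields and the PokeBuffer class are assumptions made for illustration.

```typescript
// Hypothetical message shapes: only the three message names are from the docs.
type PokeRow = { id: string; text: string };
type PokePart = { type: "pokePart"; pokeId: string; put?: PokeRow[]; del?: string[] };
type PokeMsg =
  | { type: "pokeStart"; pokeId: string }
  | PokePart
  | { type: "pokeEnd"; pokeId: string };

class PokeBuffer {
  readonly view = new Map<string, string>();
  private pending: { pokeId: string; parts: PokePart[] } | null = null;

  receive(msg: PokeMsg): void {
    if (msg.type === "pokeStart") {
      // A new poke discards any unfinished one.
      this.pending = { pokeId: msg.pokeId, parts: [] };
      return;
    }
    const pending = this.pending;
    if (pending === null || pending.pokeId !== msg.pokeId) return; // stray message
    if (msg.type === "pokePart") {
      pending.parts.push(msg); // buffered, not yet visible
      return;
    }
    // pokeEnd: apply every buffered part in one step.
    for (const part of pending.parts) {
      for (const row of part.put ?? []) this.view.set(row.id, row.text);
      for (const id of part.del ?? []) this.view.delete(id);
    }
    this.pending = null;
  }
}

const buffer = new PokeBuffer();
buffer.receive({ type: "pokeStart", pokeId: "p1" });
buffer.receive({ type: "pokePart", pokeId: "p1", put: [{ id: "m1", text: "hello" }] });
const sizeBeforeEnd = buffer.view.size;
buffer.receive({ type: "pokeEnd", pokeId: "p1" });
const sizeAfterEnd = buffer.view.size;

// A poke whose pokeEnd never arrives (socket dropped) changes nothing.
buffer.receive({ type: "pokeStart", pokeId: "p2" });
buffer.receive({ type: "pokePart", pokeId: "p2", put: [{ id: "m2", text: "lost" }] });
const sizeAfterDroppedPoke = buffer.view.size;
```

The view never exposes a half-applied batch, which is the property the failure-mode discussion later relies on.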

The data tiers

State lives in one of three tiers. The schema modifier on a table picks the tier; everything else — addressing, codegen, the client API — follows.

Tier 1 — Shard-local (ShardDO)

The default. Every table without a tier modifier lives in the __root__ DO. .shardBy(field) partitions the namespace so each value of field gets its own DO with its own SQLite, CPU budget, and hibernation timer.

  • Real-time stays in-process — the writer is the broadcaster.
  • Strongly consistent within a region — a DO is pinned near its creator.
  • Account-unlimited — millions of ShardDOs per account, zero admin.

The ceiling is 10 GB SQLite and ~1 000 req/s per DO. The runtime warns at 1 GB (10% of the ceiling) so a .shardBy() migration has runway. See Sharding and Limits.
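As a rough illustration of how those numbers relate, the check below classifies a shard against the ceilings quoted on this page (10 GB, ~1 000 req/s) and the warning thresholds (1 GB, and the sustained 700 req/s mentioned under Failure modes). shardHealth is a hypothetical helper, and treating a gigabyte as 10^9 bytes is an assumption.

```typescript
// Hypothetical helper; thresholds are the figures quoted on this page.
const GB = 1e9; // assumption: decimal gigabytes
const CEILING = { storageBytes: 10 * GB, reqPerSec: 1000 };
const WARN_AT = { storageBytes: 1 * GB, reqPerSec: 700 };

type Health = "ok" | "warn" | "over";

function shardHealth(storageBytes: number, reqPerSec: number): Health {
  if (storageBytes >= CEILING.storageBytes || reqPerSec >= CEILING.reqPerSec) {
    return "over";
  }
  if (storageBytes >= WARN_AT.storageBytes || reqPerSec > WARN_AT.reqPerSec) {
    return "warn";
  }
  return "ok";
}

// The storage warning fires at 10% of the ceiling.
const warnFraction = WARN_AT.storageBytes / CEILING.storageBytes;
```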

Tier 2 — Global (D1)

.global() moves a table to Cloudflare D1 — for identities, billing, cross-tenant audit logs, anything that must be queryable across shards. Reads can be served by regional replicas; writes route to the primary. @lunora/d1 threads the withSession(bookmark) API so the client gets read-your-writes without a sticky-session router.
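The bookmark idea is easier to see in a toy model. The sketch below is not the D1 Sessions API: ReplicatedStore is a hypothetical class with one primary and one lagging replica, and the bookmark is modeled as a plain version number that a read must not fall behind.

```typescript
// Toy model of read-your-writes via a bookmark. Not the D1 API.
class ReplicatedStore {
  private primary = new Map<string, string>();
  private replica = new Map<string, string>();
  private primaryVersion = 0;
  private replicaVersion = 0;

  // Writes go to the primary and return a bookmark for the caller to keep.
  write(key: string, value: string): number {
    this.primary.set(key, value);
    this.primaryVersion += 1;
    return this.primaryVersion;
  }

  // Simulates the replica catching up with the primary.
  replicate(): void {
    this.replica = new Map(this.primary);
    this.replicaVersion = this.primaryVersion;
  }

  // A read is served by the replica only if it is at least as new as the
  // caller's bookmark; otherwise it falls back to the primary.
  read(key: string, bookmark = 0): string | undefined {
    const source = this.replicaVersion >= bookmark ? this.replica : this.primary;
    return source.get(key);
  }
}

const store = new ReplicatedStore();
const bookmark = store.write("plan", "pro");
const staleRead = store.read("plan"); // no bookmark: may hit the stale replica
const ownRead = store.read("plan", bookmark); // bookmark: sees its own write
store.replicate();
const convergedRead = store.read("plan"); // replica has caught up
```

The same model explains the replica-lag failure mode: a read that drops the bookmark can legitimately return old data until the replica converges.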

The price of .global() is the live-sync downgrade: a .global()-backed shape reads from D1 and is coordinator/poll-refreshed (latency-tiered), not poke-live — D1 has no per-DO op-log to diff. Keep hot, reactive data in Tier 1.

Tier 3 — Blob & async

  • R2 (@lunora/storage) — signed URLs; the browser transfers directly. Egress is free, so design around direct downloads.
  • Queues (@lunora/queue) — ctx.queues.<name>.send(...) from a mutation or action for fire-and-forget work. A generated queue() consumer drains it. Idempotency is on you.
  • KV — edge-cached config and feature flags via @lunora/bindings/kv (ctx.kv).

Compute tiers

Data has three tiers; so does compute. Each function runs in the cheapest place that can do the job, and you escalate only when the work demands it.

Tier | Where | Good for | Bounded by
Worker | Stateless edge | RPC routing, auth, fan-out coordination, actions (external I/O) | 30 s CPU, 10 MB bundle, no state
ShardDO | Stateful, serialized | Reads, writes, live subscriptions — the deterministic core | 30 s CPU, 10 GB SQLite, ~1 000 req/s
Container | Full Linux VM (Docker) | Native binaries, full filesystem, long-running or heavyweight work | Paid plan, linux/amd64, ephemeral disk

Containers — escaping the Workers runtime

Some jobs simply don't fit a Worker: an existing binary (ffmpeg, headless Chrome), a Python ML model, anything that wants a real filesystem or runs longer than the request envelope. @lunora/container runs those as Cloudflare Containers, and the key architectural point is how they connect back to the rest of the system.

  • A container is a Durable Object. defineContainer(...) makes codegen emit a container-enabled DO class; lunora dev / deploy reconcile the wrangler.jsonc binding, image build, and SQLite migration. It sits alongside ShardDO in the same topology, not on a separate platform.
  • Reached only from actions. Like ctx.fetch and ctx.ai, ctx.containers is external I/O, so it lives on actions, never queries or mutations — keeping the deterministic core deterministic. Routing mirrors the data tiers: .get(name) pins one instance per entity (stateful), .any() load-balances a fixed pool (stateless), and .pool() adds retry/backoff over a cold instance.
  • It reads data through RPC, not the database. Container code calls back into your app with the bridge client (@lunora/container/bridge), over the Worker's HTTP RPC endpoint (/_lunora/rpc), authenticated by a bearer your resolveIdentity validates. So a container reads and writes app state through the same queries and mutations the browser uses — it never reaches into SQLite or D1 directly. The serialized DO stays the single linearization point, even for a heavyweight sidecar.
  • Idle is free, disk is not durable. Instances scale to zero on sleepAfter (active-CPU billing); the local disk is ephemeral, so persistence goes to R2 via @lunora/storage. Egress is billed, unlike R2.

   action ──ctx.containers.transcoder.get(id).fetch()──►  Container DO
      ▲                                                        │
      └──────────  bridge: POST /_lunora/rpc  ◄────────────────┘
                   (runs your query/mutation as a verified identity)

See @lunora/container for the full surface, and Limits for the ceilings.
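The retry-over-a-pool behavior attributed to .pool() can be sketched generically. Everything here is hypothetical: callWithRetry, the Instance type, and the doubling backoff schedule are illustrations, not the @lunora/container surface. The sketch is synchronous with an injected sleep so the backoff is observable; a real implementation would be asynchronous.

```typescript
// Hypothetical sketch of retrying on a freshly picked instance with backoff.
type Instance = { name: string; call: (input: string) => string };

function callWithRetry(
  pick: () => Instance,
  input: string,
  maxAttempts: number,
  baseDelayMs: number,
  sleep: (ms: number) => void,
): string {
  let lastError: unknown = new Error("maxAttempts must be at least 1");
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return pick().call(input); // each attempt picks an instance again
    } catch (err) {
      lastError = err;
      // Exponential backoff between attempts (assumed schedule).
      if (attempt < maxAttempts - 1) sleep(baseDelayMs * 2 ** attempt);
    }
  }
  throw lastError;
}

// Two-instance pool: the first is cold and fails, the second answers.
const pool: Instance[] = [
  { name: "cold", call: () => { throw new Error("instance not ready"); } },
  { name: "warm", call: (input) => `transcoded:${input}` },
];
let next = 0;
const delays: number[] = [];
const result = callWithRetry(
  () => pool[next++ % pool.length],
  "clip.mov",
  3,
  100,
  (ms) => { delays.push(ms); },
);
```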

The consistency model

This is the part worth internalizing before you design a schema. Each scope below shows its guarantee (the badge) and the mechanism that provides it:

The hard edge: there is no cross-shard transaction and no cross-shard live join. A shape whose where() joins two .shardBy() tables on different DOs is rejected at registration — there's no single serialized cut to diff across. The two supported answers are denormalize into the shard, or promote the joined table to .global(). See Local-first sync.

The real-time engine

Two protocols ride the same hibernatable WebSocket.

  • Legacy subscribe — re-runs the query and diffs the result. Simple, always correct, heavier on the DO.
  • The poke diff protocol (local-first) — the DO reads the op-log page once per write flush, computes per-shape membership with a single … IN (<changedIds>) AND <effectiveWhere> query, and pokes each socket the membership diff. The client maintains the live views through TanStack DB's incremental dataflow; the DO never runs a dataflow pipeline. Optimistic custom mutators rebase over the authoritative result on each poke. See Real-time and Local-first sync.
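The membership computation in the poke protocol can be sketched without a database. The function below is an in-memory analogue of the "… IN (<changedIds>) AND <effectiveWhere>" query described above. membershipDiff, the Row type, and the put/del field names are assumptions for illustration.

```typescript
// Hypothetical in-memory analogue of the per-shape membership query.
type Row = { id: string; channelId: string };
type MembershipDiff = { put: string[]; del: string[] };

function membershipDiff(
  changedIds: string[],
  table: Map<string, Row>,
  where: (row: Row) => boolean,
  members: Set<string>, // ids this subscriber currently holds; updated in place
): MembershipDiff {
  const diff: MembershipDiff = { put: [], del: [] };
  for (const id of Array.from(new Set(changedIds))) {
    const row = table.get(id);
    const matches = row !== undefined && where(row);
    if (matches) {
      members.add(id);
      diff.put.push(id); // new member, or an update to an existing one
    } else if (members.delete(id)) {
      diff.del.push(id); // left the shape: deleted or no longer matches
    }
    // Changed rows that never matched produce nothing for this subscriber.
  }
  return diff;
}

const table = new Map<string, Row>([
  ["m1", { id: "m1", channelId: "general" }], // newly inserted
  ["m2", { id: "m2", channelId: "random" }], // moved out of "general"
  ["m4", { id: "m4", channelId: "random" }], // never in "general"
]);
const members = new Set<string>(["m2", "m3"]); // m3 was deleted from the table
const diff = membershipDiff(
  ["m1", "m2", "m3", "m4"],
  table,
  (row) => row.channelId === "general",
  members,
);
```

Only the changed ids are examined, so the cost follows the size of the write, not the size of the shape.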

Reconnect is resume-by-bookmark: the client sends the last sequence it acknowledged and the server replays the gap. A subscriber that has fallen behind the op-log retention window is forced to re-seed rather than silently miss rows.
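A minimal sketch of that resume decision, assuming the op-log is an array of sequenced entries and that retention is expressed as the oldest sequence still stored. resume, minSeq, and the exact boundary condition are assumptions, not the shipped protocol.

```typescript
// Hypothetical resume check: replay the gap, or force a re-seed.
type LogOp = { seq: number; id: string };
type Resume = { kind: "replay"; ops: LogOp[] } | { kind: "reseed" };

function resume(log: LogOp[], minSeq: number, since: number): Resume {
  // The client needs every op after `since`. If the first one it needs
  // (since + 1) has already been pruned, the gap cannot be replayed.
  if (since + 1 < minSeq) return { kind: "reseed" };
  return { kind: "replay", ops: log.filter((op) => op.seq > since) };
}

// Retained log holds seq 5..8; everything older has been pruned.
const log: LogOp[] = [5, 6, 7, 8].map((seq) => ({ seq, id: `m${seq}` }));
const caughtUp = resume(log, 5, 8); // nothing to replay
const shortGap = resume(log, 5, 6); // replays seq 7 and 8
const tooOld = resume(log, 5, 2); // seq 3 and 4 are gone: re-seed
```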

Failure modes

The happy path is above; here is what the architecture does when things break. The theme is fail closed and re-derive, never silently diverge.

  • DO evicted mid-request. A mutation is one storage transaction, so an eviction either commits the whole thing or none of it — there is no half-written row. The Worker surfaces the error to the caller; the client's optimistic overlay stays pending and re-pushes (idempotently, by watermark) on reconnect.
  • WebSocket drops mid-poke. Pokes apply atomically at pokeEnd, so a socket that dies between pokeStart and pokeEnd simply never applied that batch. On reconnect the checkpoint/epoch won't match and the client re-seeds the shape rather than stitching a partial diff.
  • Subscriber falls behind op-log retention. If a client's sinceCheckpoint is older than the shard's minCdcSeq, the diff it needs is gone. The DO refuses to guess and forces a full re-seed — a fallen-behind client never silently misses rows.
  • Duplicate / out-of-order client push. The watermark is the guard: seq <= watermark is acked without re-running (safe replay after a flaky ack); seq > watermark + 1 is rejected 409 OUT_OF_ORDER and the client resends from watermark + 1. Exactly-once effect, at-least-once delivery.
  • D1 replica lag. A .global() read without the caller's bookmark may hit a stale replica. Threading the x-d1-bookmark from the last write restores read-your-writes; cross-replica convergence is otherwise bounded by the ~24 h bookmark window.
  • Container crash / cold start. .get(name) surfaces the error to the action; .pool() rides over it by retrying on a freshly-picked instance with backoff. Because containers reach data only through RPC, a crashed container can never leave storage half-written — the DO transaction is still the only writer.
  • Hot shard (approaching the ceiling). Not a crash but a capacity wall: the runtime warns at 1 GB / sustained >700 req/s well before the hard limit, so a .shardBy() migration has runway. See Limits.
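The watermark rule in the duplicate / out-of-order bullet is compact enough to sketch directly. The three branches follow the text above; the Watermarks class, the PushResult shape, and the assumption that sequences start at 1 with an initial watermark of 0 are illustrative.

```typescript
// Hypothetical per-client watermark guard.
type PushResult =
  | { status: "applied" | "replayed" }
  | { status: "out_of_order"; code: 409; resendFrom: number };

class Watermarks {
  private byClient = new Map<string, number>();

  push(clientId: string, seq: number, apply: () => void): PushResult {
    const watermark = this.byClient.get(clientId) ?? 0; // assumption: starts at 0
    if (seq <= watermark) return { status: "replayed" }; // ack without re-running
    if (seq > watermark + 1) {
      return { status: "out_of_order", code: 409, resendFrom: watermark + 1 };
    }
    apply(); // seq === watermark + 1: run the mutation exactly once
    this.byClient.set(clientId, seq);
    return { status: "applied" };
  }
}

const watermarks = new Watermarks();
let applied = 0;
const first = watermarks.push("alice", 1, () => { applied++; });
const duplicate = watermarks.push("alice", 1, () => { applied++; }); // flaky-ack resend
const gap = watermarks.push("alice", 3, () => { applied++; }); // seq 2 is missing
```

The duplicate is acknowledged but not re-run, so delivery can be at-least-once while the effect stays exactly-once.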

Design decisions

Each decision below is the same shape: what we chose, why, and what it costs. These are the trade-offs that distinguish Lunora from the alternatives.

Durable Objects as the database, not D1-as-truth

Decision. Primary state is per-DO SQLite, not a central D1 (or Postgres) database with the DO as a cache.

Why. A DO is single-threaded and serializes its writes, so the order of commits is the consistency cut — for free. That single property buys three things at once: strong consistency without a lock manager, an authoritative op-log (__cdc_log) without a logical-replication stream to tail, and real-time fan-out without a process hop (the writer holds the sockets). Hibernation makes idle cost ≈ $0. A D1-as-truth design needs a version-counter for OCC and a version-snapshot engine to rebuild reactive state — machinery the DO topology makes unnecessary.

Cost. No cross-shard transactions; cross-shard reads are fan-out; the 10 GB / ~1 000 req/s per-DO ceiling is real. We answer it with opt-in sharding rather than a bigger central database.

One DO by default, opt-in sharding

Decision. New apps get a single __root__ DO holding every table. Scaling out is a per-table .shardBy(field) edit, not a re-architecture.

Why. The first 80% of an app's life fits in one DO, and a single serialized store is the easiest thing in the world to reason about. Sharding should be a decision you make when the data tells you to (the 1 GB warning), against the same call sites — codegen re-addresses ctx.db.messages.*, your handler doesn't change.

Cost. Codegen has to know how to address shards, and fan-out reads across an unpinned .shardBy() table are expensive — a Paid-plan, cold-path operation, not a hot one. The single-DO start also means Lunora's default throughput ceiling looks lower than a sharded-from-day-one system until you opt in.

Serialized mutations, no OCC-retry loop

Decision. Mutations run inside blockConcurrencyWhile + a storage transaction. We deliberately do not ship an optimistic-retry loop.

Why. Because the DO serializes, there is no concurrent writer to race. A ConflictError is therefore a deterministic self-conflict (a handler's own trigger or cascade contradicting its own write), which a retry would hit again identically. Retrying it is useless at best and an infinite loop at worst.

Cost. Throughput per DO is bounded by serial execution — another reason the escape valve is sharding, not concurrency within a shard.

Hibernated WebSockets, no external broker

Decision. Real-time fan-out is entirely DO-based: hibernatable WebSocket subscriptions stored on the socket attachment. No MQTT, no Pub/Sub, no Redis, no external message broker.

Why. The subscriptions live exactly where the writes happen, so a delta never leaves the process. Hibernation drops idle compute to zero. And it's type-safe end-to-end with your queries — there's no second schema for an external broker to drift from. Cloudflare Pub/Sub adds only native-MQTT device ingest over this path, a narrow niche, and is itself beta with no Worker binding.

Cost. Fan-out is bounded by the per-DO WebSocket ceiling (~32 000 hibernated). Truly account-wide broadcast (every user, one event) is not the sweet spot — that's a .global() + poll-refresh shape, not a poke.

D1 for global tables, with Sessions-API read-your-writes

Decision. .global() tables live in D1, and the client threads a D1 session bookmark for consistency rather than a sticky-session router.

Why. Some data is inherently cross-tenant (users, billing) and can't live in one tenant's shard. D1 gives regional read replicas for low-latency reads; the Sessions API bookmark gives read-your-writes without pinning a user to a region.

Cost. Global tables are eventually consistent across replicas (bounded by the ~24 h bookmark window) and are not poke-live. It's the right tier for the slow-changing spine of an app, the wrong one for a hot feed.

Vite-first DX and build-time codegen

Decision. A Vite plugin owns codegen, the dev server, and server↔client type sync. The schema and functions are the single source of truth; everything else is generated.

Why. End-to-end type safety only holds if there's one source of truth and the toolchain enforces it. Generating _generated/* at build time means a renamed field breaks the build in your editor, not in production — and the shard address is resolved at codegen, so the client never guesses.

Cost. Codegen is a required step in the loop (dist/ is built on demand, and stale _generated can bite). The dev server boots workerd — the real runtime — rather than a lightweight mock, trading a little startup time for fidelity.

Functional procedure builders (query / mutation / action)

Decision. Three explicit function kinds with a chainable builder, not a single generic "handler." Queries are deterministic reads, mutations are deterministic serialized writes, actions are the I/O escape hatch.

Why. The kind is the contract. Queries can be cached, subscribed, and fanned out because they're deterministic; mutations can broadcast because they're the linearization point; actions are quarantined from ctx.db so the deterministic core stays deterministic. Splitting them lets the runtime apply the right machinery to each.

Cost. You have to know which kind you're writing, and "call a mutation from an action" is a deliberate hop rather than a free function call. (Note: query / mutation determinism is a convention the advisor lints, not yet a hard runtime guard.)

Batteries as separate installs, not a bundled framework

Decision. Auth, mail, storage, scheduler, payments, AI, containers, queues, workflows — each is its own @lunora/* package and an opt-in ctx.* facade. The umbrella lunorash package bundles only the base.

Why. The Worker bundle ceiling is 10 MB; an app shouldn't pay bundle size for a provider it never calls. Separate installs also mean no hard dependency on, say, a specific auth library — @lunora/auth is swappable, the core doesn't know it exists. Batteries, not lock-in.

Cost. More packages to discover and wire than a monolith. The lunorash umbrella and vis generate scaffolding exist to smooth that over.

Containers reach data through RPC, not the database

Decision. A container never connects to SQLite or D1. It calls back into the app over the HTTP RPC bridge, running the same queries and mutations the browser does, under an identity the Worker's resolveIdentity verifies.

Why. If a heavyweight sidecar could write storage directly, it would bypass the serialized DO — the one property the whole consistency model rests on — and every RLS policy and validator with it. Forcing containers through the same front door keeps a single linearization point and one authorization surface, no matter how exotic the workload behind it is.

Cost. A container pays an HTTP hop to read or write app state, rather than a local query. For the data-heavy inner loop, hand the container the data in the action's fetch body instead of round-tripping per row.

Non-goals

Saying no is part of the architecture. Lunora deliberately does not:

  • run an external message broker (MQTT / Pub/Sub / Redis) — see above;
  • offer cross-shard transactions or cross-shard live joins — denormalize or .global() instead;
  • provide an OCC-retry loop on mutations — serialized writes make it pointless;
  • enforce query/mutation determinism at runtime yet — it's an advisor lint today;
  • ship a managed control plane as a hard dependency — self-host is the default; Lunora Cloud is optional and runs the same code.

Where the seams are

Every tier is replaceable, which is the point of keeping them single-purpose:

  • Storage backend — ShardDO is a base class; the SQLite layer is behind @lunora/do.
  • Global adapter — @lunora/d1 implements a store interface; other global backends can slot in.
  • Auth store — @lunora/auth takes any AuthStore; D1 is the default, not a requirement.
  • Framework client — the @lunora/client core is framework-agnostic; react/vue/solid/svelte/astro/nuxt are thin adapters.
  • Add-on bindings — ctx.kv, ctx.images, ctx.ai, ctx.sql, … are facades you opt into per package.

See also