Architecture
The full picture — topology, request lifecycle, the data tiers, the consistency model, and the design decisions behind them.
This page is the map of how Lunora works and why it works that way. The first half is the architecture: the Cloudflare primitives, how a request flows through them, and the consistency guarantees you get. The second half is the design decisions — the forks in the road and why we took the branch we did.
If you only read one section, read The mental model. If you're evaluating Lunora against Convex or a hand-rolled Workers stack, jump to Design decisions.
Vocabulary
The rest of the page leans on these terms. Skim them once and the diagrams read themselves:
- Shard: one Durable Object instance owning a slice of state. The default app has a single shard addressed as __root__; .shardBy(field) creates one shard per value of field.
- Op-log (__cdc_log): the append-only change log every shard keeps. Each committed write lands here in order; it's the source for deltas, pokes, and resume-after-reconnect.
- Delta: a row change broadcast to subscribers of the legacy subscribe path (the DO re-runs the query and diffs).
- Poke: the leaner local-first signal, a membership diff for a shape, computed from the op-log without re-running the query.
- Shape: a named, parameterized partial replication of a table (a predicate + optional column projection) the client subscribes to by name.
- Watermark: the last mutation id the server has applied for a given client; it makes optimistic writes idempotent and gap-detected.
- Fan-out: a read that touches every shard of a .shardBy() table because it didn't pin one. Coordinated by the Worker; a cold path, not a hot one.
- Hibernation: a DO with no traffic suspends, keeping its WebSocket subscriptions on the socket attachment. Idle subscribers cost storage, not CPU.
The mental model
A Lunora deployment is a single Worker that fronts a small set of
Cloudflare primitives, wired together by @lunora/runtime. Each piece is
single-purpose and replaceable:
Browser / framework client
(@lunora/client + react/vue/…)
│
HTTPS (RPC) │ WebSocket (live)
▼
┌────────────────────────────────────────────────┐
│ Worker (Edge) │
│ @lunora/runtime · createWorker({ ... }) │
│ RPC router · shard resolver · query coord. │
└──┬──────────┬──────────┬─────────┬─────────┬────┘
│ │ │ │ │
┌──────┘ ┌─────┘ ┌────┘ ┌────┘ ┌────┘
▼ ▼ ▼ ▼ ▼
ShardDO ×N SchedulerDO D1 R2 Queues
(SQLite, WS) (alarms) (.global()) (files) (background)
Tier 1 async Tier 2 Tier 3 async
▲ │
│ ┌──────────────────────────────┐ │
└──┤ Container DO ×N │◄──────────┘
RPC ────┤ (Docker · ffmpeg/Chrome/ML) │ actions call ctx.containers
bridge └──────────────────────────────┘
Three ideas carry the whole design:
- The Durable Object is the database. Non-global state lives in per-DO SQLite. Because a DO runs single-threaded and serializes its writes, you get a strong-consistency cut for free — no lock manager, no MVCC, no replication stream to tail.
- Real-time is a property of the store, not a bolt-on. The DO that wrote a row is also the one holding the WebSocket subscriptions for it, so a delta never crosses a process boundary to fan out.
- You scale by addressing, not rewriting. One DO by default; .shardBy() partitions across many; .global() promotes a table to D1. Your call sites don't change: codegen learns the new address.
The components
| Component | Package | Responsibility |
|---|---|---|
| Worker | @lunora/runtime | Entry point. Parses the RPC envelope, authenticates, resolves the target shard, runs the query coordinator for fan-out, terminates WebSockets. Stateless — scales horizontally on Cloudflare's edge. |
| ShardDO | @lunora/do | The workhorse. SQLite-backed state, OCC, serialized mutations, hibernatable WebSocket subscriptions, and the op-log (__cdc_log) that powers live sync. One per shard. |
| SessionDO | @lunora/do | Per-session coordination state (presence, auth session, transient fan-in). |
| SchedulerDO | @lunora/scheduler | Alarm-driven runAfter / runAt and Cron Triggers. |
| Container DO | @lunora/container | A Docker workload running as a Durable Object — for jobs that don't fit the Workers runtime (ffmpeg, headless Chrome, an ML model, a long-running process). Reached from actions via ctx.containers. |
| D1 | @lunora/d1 | Backs .global() tables. Wraps the Sessions API (withSession(bookmark)) for read-your-writes across regional replicas. |
| R2 | @lunora/storage | Typed buckets and signed URLs; the browser uploads/downloads directly, never proxied through the Worker. |
| Queues | @lunora/queue | Fire-and-forget background work via a typed ctx.queues.<name> producer and a generated queue() consumer. |
| Client | @lunora/client + adapters | Browser SDK: multiplexed WebSocket, optimistic writes, offline queue, reconnect-by-bookmark. Framework adapters expose useQuery / useMutation / useSubscription. |
| Codegen | @lunora/codegen | Reads schema.ts + your functions, emits _generated/{api,server,dataModel}.ts so the address of every shard and the type of every result is known at build time. |
Request lifecycle
Queries, subscriptions, mutations, and actions each take a different path through the topology.
A query (ctx.db.query(...))
The client calls api.messages.list. The Worker authenticates the RPC, asks the
shard resolver for the DO that owns the data (the __root__ DO, or the one
keyed by .shardBy()), and forwards the call. The DO runs the handler against
its local SQLite — no network hop inside the read — applies the table's RLS
read-where, and returns rows. If the query targets a .shardBy() table
without pinning a shard, it becomes a fan-out: the Query Coordinator in the
Worker dispatches to every shard and merges results.
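The routing rule above can be sketched in a few lines. This is a minimal model, not the Lunora resolver: `TableConfig`, `resolveShard`, `coordinate`, and the idea that the Worker is handed a list of known shards are all assumptions made for illustration.

```typescript
// Hypothetical model of shard resolution; none of these names are Lunora APIs.
type TableConfig = { shardBy?: string }; // undefined => table lives in __root__
type Resolution =
  | { kind: "single"; shard: string }
  | { kind: "fanout"; shards: string[] };

const ROOT_SHARD = "__root__";

function resolveShard(
  config: TableConfig,
  args: Record<string, unknown>,
  knownShards: string[], // assumption: the coordinator can enumerate shards
): Resolution {
  if (config.shardBy === undefined) {
    return { kind: "single", shard: ROOT_SHARD };
  }
  const key = args[config.shardBy];
  if (key !== undefined && key !== null) {
    // The call pinned a shard key: one DO, one destination.
    return { kind: "single", shard: String(key) };
  }
  // Unpinned read on a sharded table: every shard must be consulted.
  return { kind: "fanout", shards: [...knownShards] };
}

// Run the per-shard read on each target and merge the results in shard order.
function coordinate<Row>(
  resolution: Resolution,
  readShard: (shard: string) => Row[],
): Row[] {
  const shards =
    resolution.kind === "single" ? [resolution.shard] : resolution.shards;
  return shards.flatMap((shard) => readShard(shard));
}
```

A pinned call such as `resolveShard({ shardBy: "channelId" }, { channelId: "general" }, shards)` resolves to a single DO; dropping `channelId` from the arguments turns the same read into a fan-out.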
A subscription (the same query, live)
useQuery opens one multiplexed WebSocket to the Worker, which terminates it
against the resolving DO. The DO registers the subscription on the socket via
ws.serializeAttachment(...), so it survives hibernation. When a
mutation later writes a matching row, the DO computes a delta and pokes only the
sockets whose registered predicate matches. Idle subscribers pay for storage,
not compute — which is why the bill drops to near-zero between bursts.
A mutation (ctx.db.insert/patch/...)
The Worker routes to the owning DO, which runs the handler inside
blockConcurrencyWhile + a storage transaction. Writes are serialized —
there is no concurrent writer to race, so there is no OCC-retry loop. Each write
appends to the op-log (__cdc_log); on transaction commit the DO broadcasts
deltas (legacy subscribe) and/or pokes (the local-first shape protocol) to its
subscribers.
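To make the commit sequence concrete, here is a toy in-memory model of one shard. It is a sketch under assumptions: `ShardModel`, `mutate`, and `onCommit` are invented names, the handler is synchronous (a real DO handler is async, which is why blockConcurrencyWhile is needed there), and only inserts are modeled.

```typescript
// Toy in-memory model of one shard; names are illustrative, not Lunora's API.
type Row = { id: string; [field: string]: unknown };
type Op = { seq: number; kind: "insert"; table: string; id: string };
type Tx = { insert(table: string, row: Row): void };

class ShardModel {
  private rows = new Map<string, Row>();
  readonly opLog: Op[] = [];
  private listeners: Array<(ops: Op[]) => void> = [];

  // Subscribers are told about ops only after the commit succeeds.
  onCommit(listener: (ops: Op[]) => void): void {
    this.listeners.push(listener);
  }

  // Run a handler as one all-or-nothing transaction.
  mutate(handler: (tx: Tx) => void): void {
    const stagedRows = new Map(this.rows);
    const stagedOps: Op[] = [];
    let nextSeq = this.opLog.length + 1;
    handler({
      insert: (table, row) => {
        stagedRows.set(`${table}/${row.id}`, row);
        stagedOps.push({ seq: nextSeq++, kind: "insert", table, id: row.id });
      },
    });
    // Reached only if the handler did not throw: rows and log commit together.
    this.rows = stagedRows;
    this.opLog.push(...stagedOps);
    for (const listener of this.listeners) listener(stagedOps);
  }

  count(): number {
    return this.rows.size;
  }
}
```

A handler that throws leaves both the rows and the op-log untouched, and no listener fires, which is the property the broadcast-on-commit step relies on.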
An action (ctx.fetch, third-party I/O)
Actions are the non-deterministic escape hatch: they run in the Worker (not
the DO), may call external services, and may not touch ctx.db directly.
They reach data by calling queries/mutations, which keeps the deterministic
core deterministic. Long work belongs here (and in the scheduler), not in a
mutation bound by the 30 s CPU envelope.
A request, end to end
The four paths above are easier to hold in your head as one story. Follow a
single message from keypress to every other client's screen, in a chat sharded
.shardBy("channelId"):
1. Optimistic write. Alice hits send. useMutation(api.messages.send) applies an optimistic row locally, so it renders instantly, and pushes the mutation over the multiplexed WebSocket, tagged with her clientId + next clientSeq.
2. Route. The Worker authenticates the envelope and asks the shard resolver for the DO owning channelId: "general". One DO, one destination.
3. Linearize. That ShardDO runs send inside blockConcurrencyWhile + a storage transaction. It checks the per-client watermark (replays are acked without re-running), inserts the row, and appends to the op-log, all in one serialized commit. This commit is the ordering everyone else will see.
4. Poke. On commit, the DO reads the new op page once and, for each subscribed shape, computes membership with a single … IN (<changedIds>) AND <where> query. Sockets whose predicate matches channelId: "general" get a pokeStart → pokePart → pokeEnd; the rest get nothing.
5. Apply. Bob's client applies the poke atomically at pokeEnd; TanStack DB re-derives his live view. The new row appears.
6. Reconcile. Alice's client receives the same poke carrying her lastMutationId; it drops the optimistic overlay and the synced row replaces it with no flicker. Her watermark advances.
Every hard guarantee on this page shows up in those six steps: one serialized cut (3), real-time without a process hop (4), and gap-detected idempotent client state (1, 3, 6).
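The client half of that story (apply at pokeEnd, then drop the confirmed overlay) can be modeled as below. This is a hypothetical sketch: the frame shapes, `ClientView`, and the assumption that an optimistic row and its synced counterpart share an id are illustrative, not the wire protocol.

```typescript
// Hypothetical client-side model. Only the frame names pokeStart/pokePart/
// pokeEnd and lastMutationId come from the text; field layouts are assumed.
type Message = { id: string; body: string };
type PokeFrame =
  | { type: "pokeStart"; pokeId: string }
  | { type: "pokePart"; pokeId: string; rows: Message[] }
  | { type: "pokeEnd"; pokeId: string; lastMutationId: number };

class ClientView {
  synced = new Map<string, Message>();
  pending: Array<{ mutationId: number; row: Message }> = [];
  private buffer: { pokeId: string; rows: Message[] } | null = null;

  optimisticInsert(mutationId: number, row: Message): void {
    this.pending.push({ mutationId, row });
  }

  receive(frame: PokeFrame): void {
    if (frame.type === "pokeStart") {
      this.buffer = { pokeId: frame.pokeId, rows: [] };
      return;
    }
    const buffer = this.buffer;
    if (buffer === null || buffer.pokeId !== frame.pokeId) return; // stray frame
    if (frame.type === "pokePart") {
      buffer.rows.push(...frame.rows); // buffered, not yet visible as synced
      return;
    }
    // pokeEnd: apply every buffered row at once, then drop confirmed overlays.
    const confirmed = frame.lastMutationId;
    for (const row of buffer.rows) this.synced.set(row.id, row);
    this.pending = this.pending.filter((p) => p.mutationId > confirmed);
    this.buffer = null;
  }

  // What the UI renders: synced rows with pending optimistic rows on top.
  visible(): Message[] {
    const merged = new Map(this.synced);
    for (const p of this.pending) merged.set(p.row.id, p.row);
    return [...merged.values()];
  }
}
```

If the socket dies before pokeEnd arrives, the buffer is simply discarded and `synced` never reflects a partial batch.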
The data tiers
State lives in one of three tiers. The schema modifier on a table picks the tier; everything else — addressing, codegen, the client API — follows.
Tier 1 — Shard-local (ShardDO)
The default. Every table without a tier modifier lives in the __root__ DO.
.shardBy(field) partitions the namespace so each value of field gets its own
DO with its own SQLite, CPU budget, and hibernation timer.
- Real-time stays in-process: the writer is the broadcaster.
- Strongly consistent within a region: a DO is pinned near its creator.
- Account-unlimited: millions of ShardDOs per account, zero admin.
The ceiling is 10 GB SQLite and ~1 000 req/s per DO. The runtime warns at
1 GB (10% of the ceiling) so a .shardBy() migration has runway. See
Sharding and Limits.
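As a worked example of those thresholds, the check below flags a shard that has crossed 10% of the storage ceiling or the 700 req/s figure quoted under Failure modes. It is only a sketch: `capacityWarnings` and `ShardStats` are hypothetical, a binary gigabyte is assumed, and a real check would look at sustained rather than instantaneous request rate.

```typescript
// Thresholds quoted on this page; the binary GB and function shape are assumptions.
const GB = 1024 ** 3;
const SQLITE_CEILING_BYTES = 10 * GB;
const SIZE_WARN_BYTES = SQLITE_CEILING_BYTES / 10; // 1 GB, 10% of the ceiling
const REQ_CEILING_PER_SEC = 1000;
const REQ_WARN_PER_SEC = 700;

type ShardStats = { sqliteBytes: number; requestsPerSecond: number };

function capacityWarnings(stats: ShardStats): string[] {
  const warnings: string[] = [];
  if (stats.sqliteBytes >= SIZE_WARN_BYTES) {
    const pct = Math.round((stats.sqliteBytes / SQLITE_CEILING_BYTES) * 100);
    warnings.push(`storage at ${pct}% of the 10 GB ceiling`);
  }
  if (stats.requestsPerSecond > REQ_WARN_PER_SEC) {
    warnings.push(
      `${stats.requestsPerSecond} req/s against a ~${REQ_CEILING_PER_SEC} req/s ceiling`,
    );
  }
  return warnings;
}
```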
Tier 2 — Global (D1)
.global() moves a table to Cloudflare D1 — for identities, billing,
cross-tenant audit logs, anything that must be queryable across shards. Reads
can be served by regional replicas; writes route to the primary. @lunora/d1
threads the withSession(bookmark) API so the client gets read-your-writes
without a sticky-session router.
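The idea behind a bookmark can be shown with a toy model of one primary and one lagging replica. This is not how D1 is implemented and does not use the D1 API; `GlobalTableModel`, the numeric bookmark, and the fall-back-to-primary rule are assumptions chosen to make the guarantee visible.

```typescript
// Toy model of bookmark-based read-your-writes. NOT D1's implementation.
type Bookmark = number;

class GlobalTableModel {
  private primary = new Map<string, string>();
  private primaryVersion = 0;
  private replica = new Map<string, string>();
  private replicaVersion = 0;

  // Writes always go to the primary and return a bookmark for that write.
  write(key: string, value: string): Bookmark {
    this.primary.set(key, value);
    this.primaryVersion += 1;
    return this.primaryVersion;
  }

  // Simulates the replica catching up with the primary.
  replicate(): void {
    this.replica = new Map(this.primary);
    this.replicaVersion = this.primaryVersion;
  }

  // A read carrying a bookmark may only use a replica at least that fresh.
  read(key: string, bookmark?: Bookmark): string | undefined {
    const replicaFreshEnough =
      bookmark === undefined || this.replicaVersion >= bookmark;
    return replicaFreshEnough ? this.replica.get(key) : this.primary.get(key);
  }
}
```

A read without the bookmark can return stale data until `replicate()` runs; the same read with the bookmark from the preceding write always sees that write.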
The price of .global() is the live-sync downgrade: a .global()-backed shape reads from D1 and is coordinator/poll-refreshed (latency-tiered), not
poke-live — D1 has no per-DO op-log to diff. Keep hot, reactive data in Tier 1.
Tier 3 — Blob & async
- R2 (@lunora/storage): signed URLs; the browser transfers directly. Egress is free, so design around direct downloads.
- Queues (@lunora/queue): ctx.queues.<name>.send(...) from a mutation or action for fire-and-forget work. A generated queue() consumer drains it. Idempotency is on you.
- KV: edge-cached config and feature flags via @lunora/bindings/kv (ctx.kv).
Compute tiers
Data has three tiers; so does compute. Each function runs in the cheapest place that can do the job, and you escalate only when the work demands it.
| Tier | Where | Good for | Bounded by |
|---|---|---|---|
| Worker | Stateless edge | RPC routing, auth, fan-out coordination, actions (external I/O) | 30 s CPU, 10 MB bundle, no state |
| ShardDO | Stateful, serialized | Reads, writes, live subscriptions — the deterministic core | 30 s CPU, 10 GB SQLite, ~1 000 req/s |
| Container | Full Linux VM (Docker) | Native binaries, full filesystem, long-running or heavyweight work | Paid plan, linux/amd64, ephemeral disk |
Containers — escaping the Workers runtime
Some jobs simply don't fit a Worker: an existing binary (ffmpeg, headless
Chrome), a Python ML model, anything that wants a real filesystem or runs longer
than the request envelope. @lunora/container runs those as
Cloudflare Containers, and the
key architectural point is how they connect back to the rest of the system.
- A container is a Durable Object. defineContainer(...) makes codegen emit a container-enabled DO class; lunora dev/deploy reconcile the wrangler.jsonc binding, image build, and SQLite migration. It sits alongside ShardDO in the same topology, not on a separate platform.
- Reached only from actions. Like ctx.fetch and ctx.ai, ctx.containers is external I/O, so it lives on actions, never queries or mutations, keeping the deterministic core deterministic. Routing mirrors the data tiers: .get(name) pins one instance per entity (stateful), .any() load-balances a fixed pool (stateless), and .pool() adds retry/backoff over a cold instance.
- It reads data through RPC, not the database. Container code calls back into your app with the bridge client (@lunora/container/bridge), over the Worker's HTTP RPC endpoint (/_lunora/rpc), authenticated by a bearer your resolveIdentity validates. So a container reads and writes app state through the same queries and mutations the browser uses; it never reaches into SQLite or D1 directly. The serialized DO stays the single linearization point, even for a heavyweight sidecar.
- Idle is free, disk is not durable. Instances scale to zero on sleepAfter (active-CPU billing); the local disk is ephemeral, so persistence goes to R2 via @lunora/storage. Egress is billed, unlike R2.
action ──ctx.containers.transcoder.get(id).fetch()──► Container DO
▲ │
└────────── bridge: POST /_lunora/rpc ◄────────────────┘
(runs your query/mutation as a verified identity)
See @lunora/container for the full surface, and Limits for the ceilings.
The consistency model
This is the part worth internalizing before you design a schema. Each scope below shows its guarantee (the badge) and the mechanism that provides it:
The hard edge: there is no cross-shard transaction and no cross-shard live
join. A shape whose where() joins two .shardBy() tables on different DOs
is rejected at registration — there's no single serialized cut to diff across.
The two supported answers are denormalize into the shard, or promote the
joined table to .global(). See Local-first sync.
The real-time engine
Two protocols ride the same hibernatable WebSocket.
- Legacy subscribe: re-runs the query and diffs the result. Simple, always correct, heavier on the DO.
- The poke diff protocol (local-first): the DO reads the op-log page once per write flush, computes per-shape membership with a single … IN (<changedIds>) AND <effectiveWhere> query, and pokes each socket the membership diff. The client maintains the live views through TanStack DB's incremental dataflow; the DO never runs a dataflow pipeline. Optimistic custom mutators rebase over the authoritative result on each poke. See Real-time and Local-first sync.
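A membership diff of this kind might look like the sketch below, which re-evaluates the shape predicate only for the ids a commit touched. Treat it as an illustration under assumptions: `membershipDiff`, the in-memory `table`, and the per-subscriber `members` set (how the previous membership is tracked is not specified here) are all hypothetical.

```typescript
// Hypothetical membership diff over changed ids; not Lunora's implementation.
type Row = { id: string; [column: string]: unknown };
type MembershipDiff = { entered: Row[]; updated: Row[]; left: string[] };

function membershipDiff(
  changedIds: string[],
  table: Map<string, Row>, // committed state after the write
  where: (row: Row) => boolean, // the shape's effective predicate
  members: Set<string>, // ids the subscriber currently holds; updated in place
): MembershipDiff {
  const diff: MembershipDiff = { entered: [], updated: [], left: [] };
  for (const id of new Set(changedIds)) {
    const row = table.get(id);
    const wasMember = members.has(id);
    if (row !== undefined && where(row)) {
      if (wasMember) {
        diff.updated.push(row);
      } else {
        members.add(id);
        diff.entered.push(row);
      }
    } else if (wasMember) {
      // Deleted, or changed so it no longer matches the shape.
      members.delete(id);
      diff.left.push(id);
    }
  }
  return diff;
}
```

Rows that were never members and still do not match produce nothing, which is why sockets subscribed to other shapes receive no poke.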
Reconnect is resume-by-bookmark: the client sends the last sequence it acknowledged and the server replays the gap. A subscriber that has fallen behind the op-log retention window is forced to re-seed rather than silently miss rows.
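The resume decision reduces to one comparison. The sketch below borrows the names sinceCheckpoint and minCdcSeq from the Failure modes section; `planResume`, the `Op` shape, and the exact boundary semantics (taken here as the conservative literal reading of "older than") are assumptions.

```typescript
// Hypothetical resume planner; only sinceCheckpoint and minCdcSeq are named
// on this page, everything else is illustrative.
type Op = { seq: number; rowId: string };
type ResumePlan = { kind: "replay"; ops: Op[] } | { kind: "reseed" };

function planResume(
  opLog: Op[], // retained ops, oldest first
  minCdcSeq: number, // oldest sequence still retained
  sinceCheckpoint: number, // last sequence the client acknowledged
): ResumePlan {
  // The client is behind the retention window: the gap cannot be rebuilt.
  if (sinceCheckpoint < minCdcSeq) return { kind: "reseed" };
  // Otherwise replay exactly the ops the client has not yet seen.
  return { kind: "replay", ops: opLog.filter((op) => op.seq > sinceCheckpoint) };
}
```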
Failure modes
The happy path is above; here is what the architecture does when things break. The theme is fail closed and re-derive, never silently diverge.
- DO evicted mid-request. A mutation is one storage transaction, so an eviction either commits the whole thing or none of it — there is no half-written row. The Worker surfaces the error to the caller; the client's optimistic overlay stays pending and re-pushes (idempotently, by watermark) on reconnect.
- WebSocket drops mid-poke. Pokes apply atomically at pokeEnd, so a socket that dies between pokeStart and pokeEnd simply never applied that batch. On reconnect the checkpoint/epoch won't match and the client re-seeds the shape rather than stitching a partial diff.
- Subscriber falls behind op-log retention. If a client's sinceCheckpoint is older than the shard's minCdcSeq, the diff it needs is gone. The DO refuses to guess and forces a full re-seed, so a fallen-behind client never silently misses rows.
- Duplicate / out-of-order client push. The watermark is the guard: seq <= watermark is acked without re-running (safe replay after a flaky ack); seq > watermark + 1 is rejected 409 OUT_OF_ORDER and the client resends from watermark + 1. Exactly-once effect, at-least-once delivery.
- D1 replica lag. A .global() read without the caller's bookmark may hit a stale replica. Threading the x-d1-bookmark from the last write restores read-your-writes; cross-replica convergence is otherwise bounded by the ~24 h window.
- Container crash / cold start. .get(name) surfaces the error to the action; .pool() rides over it by retrying on a freshly-picked instance with backoff. Because containers reach data only through RPC, a crashed container can never leave storage half-written: the DO transaction is still the only writer.
- Hot shard (approaching the ceiling). Not a crash but a capacity wall: the runtime warns at 1 GB / sustained >700 req/s well before the hard limit, so a .shardBy() migration has runway. See Limits.
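The watermark guard from the duplicate / out-of-order case is small enough to write out. This is a sketch, not the shipped code: `ClientWatermarks` and `PushResult` are hypothetical, and it assumes client sequences start at 1 so an unseen client has watermark 0.

```typescript
// Hypothetical watermark guard; the comparison rules come from the text,
// the types and the "sequences start at 1" convention are assumptions.
type PushResult =
  | { outcome: "applied"; watermark: number }
  | { outcome: "replayed"; watermark: number }
  | { outcome: "OUT_OF_ORDER"; status: 409; resendFrom: number };

class ClientWatermarks {
  private watermarks = new Map<string, number>();

  push(clientId: string, seq: number, apply: () => void): PushResult {
    const watermark = this.watermarks.get(clientId) ?? 0;
    if (seq <= watermark) {
      // Already applied: ack again without re-running the mutation.
      return { outcome: "replayed", watermark };
    }
    if (seq > watermark + 1) {
      // A gap: refuse, and tell the client where to resume.
      return { outcome: "OUT_OF_ORDER", status: 409, resendFrom: watermark + 1 };
    }
    apply(); // seq === watermark + 1: the only case that runs the handler
    this.watermarks.set(clientId, seq);
    return { outcome: "applied", watermark: seq };
  }
}
```

Because only `seq === watermark + 1` ever runs the handler, a client can retry as often as it likes (at-least-once delivery) while each mutation takes effect once.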
Design decisions
Each decision below is the same shape: what we chose, why, and what it costs. These are the trade-offs that distinguish Lunora from the alternatives.
Durable Objects as the database, not D1-as-truth
Decision. Primary state is per-DO SQLite, not a central D1 (or Postgres) database with the DO as a cache.
Why. A DO is single-threaded and serializes its writes, so the order of
commits is the consistency cut — for free. That single property buys three
things at once: strong consistency without a lock manager, an authoritative
op-log (__cdc_log) without a logical-replication stream to tail, and real-time
fan-out without a process hop (the writer holds the sockets). Hibernation makes
idle cost ≈ $0. A D1-as-truth design needs a version-counter for OCC and a
version-snapshot engine to rebuild reactive state — machinery the DO topology
makes unnecessary.
Cost. No cross-shard transactions; cross-shard reads are fan-out; the 10 GB / ~1 000 req/s per-DO ceiling is real. We answer it with opt-in sharding rather than a bigger central database.
One DO by default, opt-in sharding
Decision. New apps get a single __root__ DO holding every table. Scaling
out is a per-table .shardBy(field) edit, not a re-architecture.
Why. The first 80% of an app's life fits in one DO, and a single serialized
store is the easiest thing in the world to reason about. Sharding should be a
decision you make when the data tells you to (the 1 GB warning), against the same
call sites — codegen re-addresses ctx.db.messages.*, your handler doesn't
change.
Cost. Codegen has to know how to address shards, and fan-out reads across an
unpinned .shardBy() table are expensive — a Paid-plan, cold-path operation,
not a hot one. The single-DO start also means Lunora's default throughput
ceiling looks lower than a sharded-from-day-one system until you opt in.
Serialized mutations, no OCC-retry loop
Decision. Mutations run inside blockConcurrencyWhile + a storage
transaction. We deliberately do not ship an optimistic-retry loop.
Why. Because the DO serializes, there is no concurrent writer to race. A
ConflictError is therefore a deterministic self-conflict (a handler's own
trigger or cascade contradicting its own write), which a retry would hit again
identically. Retrying it is useless at best and an infinite loop at worst.
Cost. Throughput per DO is bounded by serial execution — another reason the escape valve is sharding, not concurrency within a shard.
Hibernated WebSockets, no external broker
Decision. Real-time fan-out is entirely DO-based: hibernatable WebSocket subscriptions stored on the socket attachment. No MQTT, no Pub/Sub, no Redis, no external message broker.
Why. The subscriptions live exactly where the writes happen, so a delta never leaves the process. Hibernation drops idle compute to zero. And it's type-safe end-to-end with your queries — there's no second schema for an external broker to drift from. Cloudflare Pub/Sub adds only native-MQTT device ingest over this path, a narrow niche, and is itself beta with no Worker binding.
Cost. Fan-out is bounded by the per-DO WebSocket ceiling (~32 000
hibernated). Truly account-wide broadcast (every user, one event) is not the
sweet spot — that's a .global() + poll-refresh shape, not a poke.
D1 for global tables, with Sessions-API read-your-writes
Decision. .global() tables live in D1, and the client threads a D1 session
bookmark for consistency rather than a sticky-session router.
Why. Some data is inherently cross-tenant (users, billing) and can't live in one tenant's shard. D1 gives regional read replicas for low-latency reads; the Sessions API bookmark gives read-your-writes without pinning a user to a region.
Cost. Global tables are eventually consistent across replicas (bounded by the ~24 h bookmark window) and are not poke-live. It's the right tier for the slow-changing spine of an app, the wrong one for a hot feed.
Vite-first DX and build-time codegen
Decision. A Vite plugin owns codegen, the dev server, and server↔client type sync. The schema and functions are the single source of truth; everything else is generated.
Why. End-to-end type safety only holds if there's one source of truth and
the toolchain enforces it. Generating _generated/* at build time means a
renamed field breaks the build in your editor, not in production — and the shard
address is resolved at codegen, so the client never guesses.
Cost. Codegen is a required step in the loop (dist/ is built on demand, and
stale _generated can bite). The dev server boots workerd — the real runtime —
rather than a lightweight mock, trading a little startup time for fidelity.
Functional procedure builders (query / mutation / action)
Decision. Three explicit function kinds with a chainable builder, not a single generic "handler." Queries are deterministic reads, mutations are deterministic serialized writes, actions are the I/O escape hatch.
Why. The kind is the contract. Queries can be cached, subscribed, and
fanned out because they're deterministic; mutations can broadcast because they're
the linearization point; actions are quarantined from ctx.db so the
deterministic core stays deterministic. Splitting them lets the runtime apply the
right machinery to each.
Cost. You have to know which kind you're writing, and "call a mutation from an action" is a deliberate hop rather than a free function call. (Note: query / mutation determinism is a convention the advisor lints, not yet a hard runtime guard.)
Batteries as separate installs, not a bundled framework
Decision. Auth, mail, storage, scheduler, payments, AI, containers, queues,
workflows — each is its own @lunora/* package and an opt-in ctx.* facade.
The umbrella lunorash package bundles only the base.
Why. The Worker bundle ceiling is 10 MB; an app shouldn't pay bundle size for
a provider it never calls. Separate installs also mean no hard dependency on, say,
a specific auth library — @lunora/auth is swappable, the core doesn't know it
exists. Batteries, not lock-in.
Cost. More packages to discover and wire than a monolith. The lunorash
umbrella and vis generate scaffolding exist to smooth that over.
Containers reach data through RPC, not the database
Decision. A container never connects to SQLite or D1. It calls back into the
app over the HTTP RPC bridge, running the same queries and mutations the browser
does, under an identity the Worker's resolveIdentity verifies.
Why. If a heavyweight sidecar could write storage directly, it would bypass the serialized DO — the one property the whole consistency model rests on — and every RLS policy and validator with it. Forcing containers through the same front door keeps a single linearization point and one authorization surface, no matter how exotic the workload behind it is.
Cost. A container pays an HTTP hop to read or write app state, rather than a
local query. For the data-heavy inner loop, hand the container the data in the
action's fetch body instead of round-tripping per row.
Non-goals
Saying no is part of the architecture. Lunora deliberately does not:
- run an external message broker (MQTT / Pub/Sub / Redis): see above;
- offer cross-shard transactions or cross-shard live joins: denormalize or .global() instead;
- provide an OCC-retry loop on mutations: serialized writes make it pointless;
- enforce query/mutation determinism at runtime yet: it's an advisor lint today;
- ship a managed control plane as a hard dependency: self-host is the default; Lunora Cloud is optional and runs the same code.
Where the seams are
Every tier is replaceable, which is the point of keeping them single-purpose:
- Storage backend: ShardDO is a base class; the SQLite layer is behind @lunora/do.
- Global adapter: @lunora/d1 implements a store interface; other global backends can slot in.
- Auth store: @lunora/auth takes any AuthStore; D1 is the default, not a requirement.
- Framework client: the @lunora/client core is framework-agnostic; react/vue/solid/svelte/astro/nuxt are thin adapters.
- Add-on bindings: ctx.kv, ctx.images, ctx.ai, ctx.sql, … are facades you opt into per package.
See also
- Sharding: the __root__ → .shardBy() → .global() progression
- Real-time and Local-first sync: the two live protocols
- Function context: what ctx exposes per function kind
- @lunora/container: Docker workloads, ctx.containers, and the RPC bridge
- Limits: the Cloudflare ceilings these decisions are designed around
- Deployment: platform configuration and operational non-goals