@lunora/workflow
Durable, multi-step execution that replays deterministically — built on Cloudflare Workflows.
@lunora/workflow gives Lunora durable workflows: chain steps into a single
unit that survives Worker restarts and redeploys and replays
deterministically on failure. Each step's result is persisted and memoized —
when a later step throws, only that step retries; the completed ones don't run
again.
It wraps Cloudflare Workflows
(a first-party, GA durable-execution engine) with type-safe authoring: you
write a defineWorkflow and codegen emits the WorkflowEntrypoint class, wires
a typed ctx.workflows handle, and reconciles the [[workflows]] binding in
wrangler.jsonc for you.
Use it for sagas, multi-stage AI pipelines, human-in-the-loop approvals, and any long-running orchestration where "re-run the whole job from scratch" isn't good enough.
Define a workflow
Workflows live in lunora/workflows.ts as named exports. Scaffold one with
vis generate lunora-workflow --name=orderPipeline, or write it by hand:
// lunora/workflows.ts
import { defineWorkflow } from "@lunora/workflow";
import { api } from "./_generated/api";
import type { Id } from "./_generated/server";
export const channelWelcome = defineWorkflow<
{ channelId: Id<"channels"> }, // params (input)
{ channelId: Id<"channels">; posted: number } // output
>({
handler: async (context) => {
const { channelId } = context.params;
// A durable, memoized, retried step.
await context.step.do("greet", () => context.run(api.messages.send, { channelId, text: "👋 Welcome!" }, { shardKey: channelId }));
// The workflow hibernates here and resumes after a minute —
// the delay survives Worker evictions and redeploys.
await context.step.sleep("settle", "1 minute");
await context.step.do("tips", () => context.run(api.messages.send, { channelId, text: "Tip: messages stream live." }, { shardKey: channelId }));
context.log.info("welcome sequence complete", { channelId });
return { channelId, posted: 2 };
},
});defineWorkflow<Params, Output> takes a single config object:
handler— the workflow body; receives the run context (below).name(optional) — override the deployed workflow name. Defaults to the kebab-cased export name (orderPipeline→order-pipeline).
The run context
The handler receives one argument bundling the durable-step API and a typed function caller:
| Member | What it is |
|---|---|
context.params | The input passed to .create({ params }), typed as Params. |
context.step | The durable-step API — do / sleep / sleepUntil / waitForEvent. |
context.run | Call a Lunora function (api.*) by typed reference; pass { shardKey } to target a shard. |
context.runStep | Run a reusable, schema-validated defineStep as a durable step (see below). |
context.parallel | Fan out branches as isolated child workflow instances and await their outputs (see below). |
context.spawn | Fire-and-forget start of a declared child workflow (replay-safe; returns a live handle). |
context.log | Structured logger (debug/info/warn/error), captured by wrangler tail and the Studio logs. |
context.event | The raw triggering event (id, payload, timestamp). |
context.env | The Worker environment bindings. |
Durable steps
context.step is the Cloudflare Workflows step API:
// Memoized + retried. On replay, a completed step returns its stored
// result without re-running the callback.
const order = await context.step.do("load", () => context.run(api.orders.get, { id: orderId }, { shardKey: orderId }));
// Per-step retry policy.
await context.step.do("charge", { retries: { limit: 3, delay: "2 seconds", backoff: "exponential" }, timeout: "30 seconds" }, () =>
context.run(api.payments.charge, { orderId }),
);
// Durable delays — minutes to weeks, not held in memory.
await context.step.sleep("cool-off", "1 minute");
await context.step.sleepUntil("run-at", new Date("2026-07-01T00:00:00Z"));
// Pause until an external event (approvals, webhooks, human-in-the-loop).
const approval = await context.step.waitForEvent<{ approved: boolean }>("await-approval", {
type: "approval",
timeout: "7 days",
});Steps must be JSON-serialisable and idempotent — the same determinism contract Cloudflare (and Convex) impose. A step body may run more than once on retry, so don't rely on side effects outside the step's return value.
Reusable steps (defineStep)
An inline context.step.do("name", () => …) is fine for one-offs, but
defineStep lets you define a step once — schema-validated and reusable
across workflows. Args are validated (with @lunora/values)
before the body runs, and the return value is validated after (when you
declare returns), so a bad payload fails fast instead of corrupting later
steps. Scaffold one with vis generate lunora-step --name=chargeOrder (appends
to lunora/steps.ts), or write it by hand:
// lunora/steps.ts
import { defineStep } from "@lunora/workflow";
import { v } from "@lunora/values";
import { api } from "./_generated/api";
export const charge = defineStep("charge", {
args: { orderId: v.string(), amount: v.number() },
returns: v.object({ receiptId: v.string() }),
config: { retries: { limit: 3, backoff: "exponential", delay: "10 seconds" } },
handler: async (context, { orderId, amount }) => {
if (context.attempt > 1) context.log.warn(`retrying charge for ${orderId}`);
return context.run(api.payments.charge, { orderId, amount });
},
// Compensation — runs if a *later* step fails after this one committed.
rollback: async (context) => {
await context.run(api.payments.refund, { orderId: context.args.orderId });
},
});Run it from a workflow body — context.runStep wraps it in a durable
step.do(...) for you:
export const orderPipeline = defineWorkflow<{ orderId: string }>({
handler: async (context) => {
const { receiptId } = await context.runStep(charge, { orderId: context.params.orderId, amount: 4200 });
return { receiptId };
},
});The step handler context gives you context.attempt (1-based retry counter),
context.config, context.env, context.run, context.log, and
context.step ({ name, count }). Pass
context.runStep(step, args, { config }) to override the step's durability
config for a single call.
Fan-out with child-DO isolation (ctx.parallel / ctx.spawn)
A workflow instance is one Durable Object: ~128 MB memory and a 5-minute CPU
budget, shared across everything it runs. Fanning out heavy work with
Promise.all(branches.map((b) => ctx.runStep(b, …))) therefore crowds every
branch onto that one budget — one OOM or timeout takes the whole batch down.
ctx.parallel runs each branch as its own child workflow instance — its own
DO, its own memory / CPU / retry budget — and the parent hibernates (zero cost)
until the branches report back. Branches reference declared child workflows by
their lunora/workflows.ts export name; branch<Output>(name, params) carries
the result type into the returned tuple, in declaration order:
import { branch, defineWorkflow } from "@lunora/workflow";
export const mediaPipeline = defineWorkflow<{ key: string }>({
handler: async (ctx) => {
// Each branch is a separate declared workflow running in its own DO.
const [labels, thumb, transcript] = await ctx.parallel([
branch<{ tags: string[] }>("imageTag", { key: ctx.params.key }),
branch<{ url: string }>("thumbnail", { key: ctx.params.key }),
branch<{ text: string }>("transcribe", { key: ctx.params.key }),
]);
return { labels, thumb, transcript };
},
});- Isolated resources — each branch gets a full DO budget; a heavy branch can't starve its siblings.
- Hibernating join — the parent consumes nothing while branches run; each child signals its result back when it finishes.
- Fail-fast — if any branch fails,
ctx.parallelrejects with that branch's error (non-retryable — retrying the join can't re-run an already-failed child). Still-running siblings are left to finish (Cloudflare can't cleanly cancel a running instance). - Replay-safe — child instance ids are derived deterministically from the
parent, so a parent replay re-attaches to the existing children instead of
double-spawning. There is a cap of
MAX_BRANCHES(100) per call.
For fire-and-forget (start a child pipeline without awaiting it), use
ctx.spawn(name, params), which returns a live instance handle:
await ctx.spawn("sendReceipt", { orderId: ctx.params.orderId });Both reach child workflows through the standard WORKFLOW_* bindings codegen
already wires — no extra configuration. A branch/spawn name with no matching
declared workflow throws a helpful error.
Rollback (saga compensation)
A step's optional rollback handler is forwarded to Cloudflare's native
step rollback — it runs when a later step in the same instance fails, letting
you undo a committed side effect (refund a charge, delete an uploaded object).
The rollback context carries the original args, the error that triggered it,
the step's output (if it completed), and env / run / log. Cloudflare
owns rollback ordering and execution; defineStep just wires the handler and an
optional rollbackConfig.
Failing without retries (NonRetryableError)
Throw NonRetryableError from a step or the handler to fail the instance
immediately, skipping retries — the portable, Node-importable mirror of
cloudflare:workflows' native error (so your workflow code stays
unit-testable). The runtime converts it to the native error at the workflow
boundary.
import { NonRetryableError } from "@lunora/workflow";
export const charge = defineStep("charge", {
args: { orderId: v.string() },
handler: async (context, { orderId }) => {
const order = await context.run(api.orders.get, { id: orderId });
if (order.status === "cancelled") {
throw new NonRetryableError("order cancelled — retrying will never succeed");
}
return context.run(api.payments.charge, { orderId });
},
});Start and manage instances
Codegen wires a typed ctx.workflows handle onto mutations and actions.
Start an instance from a function:
// lunora/channels.ts
import { mutation, v } from "./_generated/server";
export const create = mutation.input({ channelId: v.id("channels") }).mutation(async ({ ctx, args: { channelId } }) => {
const instance = await ctx.workflows.get("channelWelcome").create({ params: { channelId } });
return { instanceId: instance.id };
});ctx.workflows.get("<exportName>") returns a handle with:
create({ id?, params?, retention? })— start an instance.create()resolves once the instance is queued (not when it finishes); the workflow runs on Cloudflare's infrastructure.createBatch([...])— start many in one batch.get(id)— a handle to an existing instance.
An instance handle exposes its lifecycle: status() (returns
status + output/error), pause(), resume(), terminate(), and
sendEvent({ type, payload }) (to satisfy a waitForEvent).
Listing instances and step timelines
The Workflow binding can only create/get and read a single instance's
status — it has no instance list and no per-step detail. To list instances or
read a step timeline (what the Studio renders), use createWorkflowsRestClient,
which talks to Cloudflare's account-scoped Workflows REST API. The API token is
a secret, so keep this server-side:
import { createWorkflowsRestClient } from "@lunora/workflow";
const client = createWorkflowsRestClient({
accountId: env.CLOUDFLARE_ACCOUNT_ID,
apiToken: env.CLOUDFLARE_API_TOKEN, // scope: Workflows Read (Edit for setInstanceStatus)
});
const page = await client.listInstances({ workflowName: "order-pipeline", status: "running" });
const detail = await client.getInstance({ workflowName: "order-pipeline", instanceId: page.instances[0].id });getInstance returns the instance plus a normalized steps[] timeline (each
step's type, attempts, timing, output, and error).
setInstanceStatus({ action }) pauses, resumes, or terminates an instance and
requires an Edit-scoped token.
Wiring (handled by codegen)
After adding lunora/workflows.ts, re-run codegen (lunora dev does this on
save). It:
- Emits a
WorkflowEntrypointsubclass per export intolunora/_generated/workflows.ts(channelWelcome→ChannelWelcomeWorkflow). - Wires the typed
ctx.workflowshandle. - Reconciles
wrangler.jsonc'sworkflows[]array —binding,class_name, andname— from the same definition.
Your worker entry must re-export the generated classes so wrangler can find them:
// src/server/index.ts
export { ChannelWelcomeWorkflow } from "../../lunora/_generated/workflows";The resulting binding:
// wrangler.jsonc (codegen-reconciled)
{
"workflows": [{ "binding": "WORKFLOW_CHANNEL_WELCOME", "class_name": "ChannelWelcomeWorkflow", "name": "channel-welcome" }],
}Advisor lints
Two static advisors keep workflow wiring honest:
workflow_unknown_target(error) —ctx.workflows.get("name")references a workflow that doesn't exist (typo or removed export).workflow_unused(info) — a declared workflow is never started by any function (dead code that's still billed as aWorkflowEntrypoint), unless it's triggered externally.
Studio
The Studio's Workflows panel (under the Functions domain) lists every declared workflow with its export name, generated class, binding, and deployed name; lets you start an instance from a JSON-params form; and shows a table of instances with their live status (queued / running / complete / errored) and output. See the Studio guide.
Cloudflare bills for workflow instance-state storage. Keep step payloads small and set a retention policy when you don't need long-lived instance history.