PackagesWorkflow

@lunora/workflow

Durable, multi-step execution that replays deterministically — built on Cloudflare Workflows.

@lunora/workflow gives Lunora durable workflows: chain steps into a single unit that survives Worker restarts and redeploys and replays deterministically on failure. Each step's result is persisted and memoized — when a later step throws, only that step retries; the completed ones don't run again.

It wraps Cloudflare Workflows (a first-party, GA durable-execution engine) with type-safe authoring: you write a defineWorkflow and codegen emits the WorkflowEntrypoint class, wires a typed ctx.workflows handle, and reconciles the [[workflows]] binding in wrangler.jsonc for you.

Use it for sagas, multi-stage AI pipelines, human-in-the-loop approvals, and any long-running orchestration where "re-run the whole job from scratch" isn't good enough.

Define a workflow

Workflows live in lunora/workflows.ts as named exports. Scaffold one with vis generate lunora-workflow --name=orderPipeline, or write it by hand:

// lunora/workflows.ts
import { defineWorkflow } from "@lunora/workflow";

import { api } from "./_generated/api";
import type { Id } from "./_generated/server";

export const channelWelcome = defineWorkflow<
    { channelId: Id<"channels"> }, // params (input)
    { channelId: Id<"channels">; posted: number } // output
>({
    handler: async (context) => {
        const { channelId } = context.params;

        // A durable, memoized, retried step.
        await context.step.do("greet", () => context.run(api.messages.send, { channelId, text: "👋 Welcome!" }, { shardKey: channelId }));

        // The workflow hibernates here and resumes after a minute —
        // the delay survives Worker evictions and redeploys.
        await context.step.sleep("settle", "1 minute");

        await context.step.do("tips", () => context.run(api.messages.send, { channelId, text: "Tip: messages stream live." }, { shardKey: channelId }));

        context.log.info("welcome sequence complete", { channelId });

        return { channelId, posted: 2 };
    },
});

defineWorkflow<Params, Output> takes a single config object:

  • handler — the workflow body; receives the run context (below).
  • name (optional) — override the deployed workflow name. Defaults to the kebab-cased export name (orderPipelineorder-pipeline).

The run context

The handler receives one argument bundling the durable-step API and a typed function caller:

MemberWhat it is
context.paramsThe input passed to .create({ params }), typed as Params.
context.stepThe durable-step API — do / sleep / sleepUntil / waitForEvent.
context.runCall a Lunora function (api.*) by typed reference; pass { shardKey } to target a shard.
context.runStepRun a reusable, schema-validated defineStep as a durable step (see below).
context.parallelFan out branches as isolated child workflow instances and await their outputs (see below).
context.spawnFire-and-forget start of a declared child workflow (replay-safe; returns a live handle).
context.logStructured logger (debug/info/warn/error), captured by wrangler tail and the Studio logs.
context.eventThe raw triggering event (id, payload, timestamp).
context.envThe Worker environment bindings.

Durable steps

context.step is the Cloudflare Workflows step API:

// Memoized + retried. On replay, a completed step returns its stored
// result without re-running the callback.
const order = await context.step.do("load", () => context.run(api.orders.get, { id: orderId }, { shardKey: orderId }));

// Per-step retry policy.
await context.step.do("charge", { retries: { limit: 3, delay: "2 seconds", backoff: "exponential" }, timeout: "30 seconds" }, () =>
    context.run(api.payments.charge, { orderId }),
);

// Durable delays — minutes to weeks, not held in memory.
await context.step.sleep("cool-off", "1 minute");
await context.step.sleepUntil("run-at", new Date("2026-07-01T00:00:00Z"));

// Pause until an external event (approvals, webhooks, human-in-the-loop).
const approval = await context.step.waitForEvent<{ approved: boolean }>("await-approval", {
    type: "approval",
    timeout: "7 days",
});

Steps must be JSON-serialisable and idempotent — the same determinism contract Cloudflare (and Convex) impose. A step body may run more than once on retry, so don't rely on side effects outside the step's return value.

Reusable steps (defineStep)

An inline context.step.do("name", () => …) is fine for one-offs, but defineStep lets you define a step once — schema-validated and reusable across workflows. Args are validated (with @lunora/values) before the body runs, and the return value is validated after (when you declare returns), so a bad payload fails fast instead of corrupting later steps. Scaffold one with vis generate lunora-step --name=chargeOrder (appends to lunora/steps.ts), or write it by hand:

// lunora/steps.ts
import { defineStep } from "@lunora/workflow";
import { v } from "@lunora/values";

import { api } from "./_generated/api";

export const charge = defineStep("charge", {
    args: { orderId: v.string(), amount: v.number() },
    returns: v.object({ receiptId: v.string() }),
    config: { retries: { limit: 3, backoff: "exponential", delay: "10 seconds" } },
    handler: async (context, { orderId, amount }) => {
        if (context.attempt > 1) context.log.warn(`retrying charge for ${orderId}`);

        return context.run(api.payments.charge, { orderId, amount });
    },
    // Compensation — runs if a *later* step fails after this one committed.
    rollback: async (context) => {
        await context.run(api.payments.refund, { orderId: context.args.orderId });
    },
});

Run it from a workflow body — context.runStep wraps it in a durable step.do(...) for you:

export const orderPipeline = defineWorkflow<{ orderId: string }>({
    handler: async (context) => {
        const { receiptId } = await context.runStep(charge, { orderId: context.params.orderId, amount: 4200 });

        return { receiptId };
    },
});

The step handler context gives you context.attempt (1-based retry counter), context.config, context.env, context.run, context.log, and context.step ({ name, count }). Pass context.runStep(step, args, { config }) to override the step's durability config for a single call.

Fan-out with child-DO isolation (ctx.parallel / ctx.spawn)

A workflow instance is one Durable Object: ~128 MB memory and a 5-minute CPU budget, shared across everything it runs. Fanning out heavy work with Promise.all(branches.map((b) => ctx.runStep(b, …))) therefore crowds every branch onto that one budget — one OOM or timeout takes the whole batch down.

ctx.parallel runs each branch as its own child workflow instance — its own DO, its own memory / CPU / retry budget — and the parent hibernates (zero cost) until the branches report back. Branches reference declared child workflows by their lunora/workflows.ts export name; branch<Output>(name, params) carries the result type into the returned tuple, in declaration order:

import { branch, defineWorkflow } from "@lunora/workflow";

export const mediaPipeline = defineWorkflow<{ key: string }>({
    handler: async (ctx) => {
        // Each branch is a separate declared workflow running in its own DO.
        const [labels, thumb, transcript] = await ctx.parallel([
            branch<{ tags: string[] }>("imageTag", { key: ctx.params.key }),
            branch<{ url: string }>("thumbnail", { key: ctx.params.key }),
            branch<{ text: string }>("transcribe", { key: ctx.params.key }),
        ]);

        return { labels, thumb, transcript };
    },
});
  • Isolated resources — each branch gets a full DO budget; a heavy branch can't starve its siblings.
  • Hibernating join — the parent consumes nothing while branches run; each child signals its result back when it finishes.
  • Fail-fast — if any branch fails, ctx.parallel rejects with that branch's error (non-retryable — retrying the join can't re-run an already-failed child). Still-running siblings are left to finish (Cloudflare can't cleanly cancel a running instance).
  • Replay-safe — child instance ids are derived deterministically from the parent, so a parent replay re-attaches to the existing children instead of double-spawning. There is a cap of MAX_BRANCHES (100) per call.

For fire-and-forget (start a child pipeline without awaiting it), use ctx.spawn(name, params), which returns a live instance handle:

await ctx.spawn("sendReceipt", { orderId: ctx.params.orderId });

Both reach child workflows through the standard WORKFLOW_* bindings codegen already wires — no extra configuration. A branch/spawn name with no matching declared workflow throws a helpful error.

Rollback (saga compensation)

A step's optional rollback handler is forwarded to Cloudflare's native step rollback — it runs when a later step in the same instance fails, letting you undo a committed side effect (refund a charge, delete an uploaded object). The rollback context carries the original args, the error that triggered it, the step's output (if it completed), and env / run / log. Cloudflare owns rollback ordering and execution; defineStep just wires the handler and an optional rollbackConfig.

Failing without retries (NonRetryableError)

Throw NonRetryableError from a step or the handler to fail the instance immediately, skipping retries — the portable, Node-importable mirror of cloudflare:workflows' native error (so your workflow code stays unit-testable). The runtime converts it to the native error at the workflow boundary.

import { NonRetryableError } from "@lunora/workflow";

export const charge = defineStep("charge", {
    args: { orderId: v.string() },
    handler: async (context, { orderId }) => {
        const order = await context.run(api.orders.get, { id: orderId });

        if (order.status === "cancelled") {
            throw new NonRetryableError("order cancelled — retrying will never succeed");
        }

        return context.run(api.payments.charge, { orderId });
    },
});

Start and manage instances

Codegen wires a typed ctx.workflows handle onto mutations and actions. Start an instance from a function:

// lunora/channels.ts
import { mutation, v } from "./_generated/server";

export const create = mutation.input({ channelId: v.id("channels") }).mutation(async ({ ctx, args: { channelId } }) => {
    const instance = await ctx.workflows.get("channelWelcome").create({ params: { channelId } });

    return { instanceId: instance.id };
});

ctx.workflows.get("<exportName>") returns a handle with:

  • create({ id?, params?, retention? }) — start an instance. create() resolves once the instance is queued (not when it finishes); the workflow runs on Cloudflare's infrastructure.
  • createBatch([...]) — start many in one batch.
  • get(id) — a handle to an existing instance.

An instance handle exposes its lifecycle: status() (returns status + output/error), pause(), resume(), terminate(), and sendEvent({ type, payload }) (to satisfy a waitForEvent).

Listing instances and step timelines

The Workflow binding can only create/get and read a single instance's status — it has no instance list and no per-step detail. To list instances or read a step timeline (what the Studio renders), use createWorkflowsRestClient, which talks to Cloudflare's account-scoped Workflows REST API. The API token is a secret, so keep this server-side:

import { createWorkflowsRestClient } from "@lunora/workflow";

const client = createWorkflowsRestClient({
    accountId: env.CLOUDFLARE_ACCOUNT_ID,
    apiToken: env.CLOUDFLARE_API_TOKEN, // scope: Workflows Read (Edit for setInstanceStatus)
});

const page = await client.listInstances({ workflowName: "order-pipeline", status: "running" });
const detail = await client.getInstance({ workflowName: "order-pipeline", instanceId: page.instances[0].id });

getInstance returns the instance plus a normalized steps[] timeline (each step's type, attempts, timing, output, and error). setInstanceStatus({ action }) pauses, resumes, or terminates an instance and requires an Edit-scoped token.

Wiring (handled by codegen)

After adding lunora/workflows.ts, re-run codegen (lunora dev does this on save). It:

  1. Emits a WorkflowEntrypoint subclass per export into lunora/_generated/workflows.ts (channelWelcomeChannelWelcomeWorkflow).
  2. Wires the typed ctx.workflows handle.
  3. Reconciles wrangler.jsonc's workflows[] array — binding, class_name, and name — from the same definition.

Your worker entry must re-export the generated classes so wrangler can find them:

// src/server/index.ts
export { ChannelWelcomeWorkflow } from "../../lunora/_generated/workflows";

The resulting binding:

// wrangler.jsonc (codegen-reconciled)
{
    "workflows": [{ "binding": "WORKFLOW_CHANNEL_WELCOME", "class_name": "ChannelWelcomeWorkflow", "name": "channel-welcome" }],
}

Advisor lints

Two static advisors keep workflow wiring honest:

  • workflow_unknown_target (error) — ctx.workflows.get("name") references a workflow that doesn't exist (typo or removed export).
  • workflow_unused (info) — a declared workflow is never started by any function (dead code that's still billed as a WorkflowEntrypoint), unless it's triggered externally.

Studio

The Studio's Workflows panel (under the Functions domain) lists every declared workflow with its export name, generated class, binding, and deployed name; lets you start an instance from a JSON-params form; and shows a table of instances with their live status (queued / running / complete / errored) and output. See the Studio guide.

Cloudflare bills for workflow instance-state storage. Keep step payloads small and set a retention policy when you don't need long-lived instance history.

See also

  • SchedulerrunAfter / runAt / cron, for single-shot deferred work that doesn't need replay.
  • Studio — the Workflows inspector.
  • Advisors — the workflow lints.