@lunora/ai
AI inference from your functions: provider-agnostic, with Workers AI as the default.
@lunora/ai is a small helper over the Vercel AI SDK and
Cloudflare's workers-ai-provider. Call
generateText / streamText / generateObject / embed / tool from any
function. Workers AI is the zero-config default, but every call is
provider-agnostic: pass a Workers AI model id (a string) or any AI SDK model
object (@ai-sdk/openai, @ai-sdk/anthropic, OpenRouter, …), so your app is
never locked to Workers AI.
pnpm add @lunora/ai

When a function uses AI, the dev server / lunora prepare reconciles the ai
binding into wrangler.jsonc for you ({ "ai": { "binding": "AI" } }), and
codegen wires a typed ctx.ai onto your action contexts.
ctx.ai in an action
Inference is an external, non-deterministic call, so, like ctx.fetch, ctx.ai
lives on actions, not queries or mutations.
import { action, v } from "@/lunora/_generated/server";
import { generateText } from "@lunora/ai";
export const summarize = action.input({ text: v.string() }).action(async ({ ctx, args: { text } }) => {
  const { text: summary } = await generateText({
    model: ctx.ai.model("@cf/meta/llama-3.3-70b-instruct-fp8-fast"),
    prompt: `Summarize:\n\n${text}`,
  });
  return summary;
});

ctx.ai.model(id) resolves a Workers AI model from the binding. Pass the
resolved model to the AI SDK functions re-exported from @lunora/ai
(generateText, streamText, generateObject, streamObject, embed,
embedMany, tool).
Any provider, same call
A string id resolves Workers AI; an AI SDK model object passes straight through.
import { streamText } from "@lunora/ai";
import { openai } from "@ai-sdk/openai"; // optional, bring-your-own
const result = streamText({ model: openai("gpt-5"), messages });

Install the provider you want (@ai-sdk/openai, @ai-sdk/anthropic, …)
alongside @lunora/ai; route through a Cloudflare AI
Gateway by passing gateway to
createAi.
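
The "string id or model object" rule can be pictured with a small sketch. This is not the package's implementation: ModelLike, ModelInput, resolveModel, and fakeWorkersAi are hypothetical names introduced here, and the stub factory stands in for ctx.ai.model(id) so the snippet runs without a binding.

```typescript
// Hypothetical stand-in for an AI SDK model object; the real type comes from the AI SDK.
interface ModelLike {
  readonly provider: string;
  readonly modelId: string;
}

// Either a Workers AI model id (a string) or an already-built model object.
type ModelInput = string | ModelLike;

// Hypothetical resolver: strings are resolved as Workers AI models,
// model objects are returned unchanged.
function resolveModel(
  input: ModelInput,
  workersAi: (id: string) => ModelLike,
): ModelLike {
  return typeof input === "string" ? workersAi(input) : input;
}

// Stub factory used in place of ctx.ai.model(id) (assumption for illustration only).
const fakeWorkersAi = (id: string): ModelLike => ({
  provider: "workers-ai",
  modelId: id,
});

// A string id goes through the Workers AI factory.
const fromId = resolveModel("@cf/meta/llama-3.3-70b-instruct-fp8-fast", fakeWorkersAi);

// A model object from another provider is passed through untouched.
const custom: ModelLike = { provider: "openai", modelId: "gpt-5" };
const fromObject = resolveModel(custom, fakeWorkersAi);
```

Because the call site only ever sees a model object, swapping providers is a change to the model argument rather than to the surrounding generateText / streamText call.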
Structured output
import { generateObject } from "@lunora/ai";
import { z } from "zod";
const { object } = await generateObject({
  model: ctx.ai.model("@cf/meta/llama-3.3-70b-instruct-fp8-fast"),
  schema: z.object({ sentiment: z.enum(["positive", "neutral", "negative"]) }),
  prompt: review,
});

RAG with @lunora/bindings/vectors
Embed with the AI SDK, store and search with @lunora/bindings/vectors:
import { embed } from "@lunora/ai";
const { embedding } = await embed({
  model: ctx.ai.embeddingModel("@cf/baai/bge-base-en-v1.5"),
  value: text,
});
await ctx.vectors.upsert("docs-body", { id, input: text, embed: async () => embedding });

Outside an action
ctx.ai is only wired onto action contexts. In the worker entry, a Durable
Object, or a queue / scheduled handler, build the helper directly from the
binding:
import { createAi } from "@lunora/ai";
const ai = createAi({ binding: env.AI });

The raw binding escape hatch, ctx.ai.run(model, inputs) (or ai.run(...)),
covers Workers-AI-only model families (image, ASR, translation) that aren't
surfaced through the AI SDK provider.
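
As a rough sketch of the escape hatch, the helper below calls run(model, inputs) with a Workers AI translation model. RawAi and translate are hypothetical names introduced here; the input fields (text, source_lang, target_lang) and the { translated_text } response shape are assumptions based on the Workers AI translation models, so check them against the model's own documentation.

```typescript
// Minimal shape of the raw escape hatch described above: run(model, inputs).
interface RawAi {
  run(model: string, inputs: Record<string, unknown>): Promise<unknown>;
}

// Hypothetical helper: translate English text with a Workers AI translation model.
// Assumes the model answers with { translated_text: string }.
async function translate(
  ai: RawAi,
  text: string,
  targetLang: string,
): Promise<string> {
  const output = await ai.run("@cf/meta/m2m100-1.2b", {
    text,
    source_lang: "english",
    target_lang: targetLang,
  });

  // The raw binding is untyped here, so narrow the response before trusting it.
  if (
    typeof output === "object" &&
    output !== null &&
    "translated_text" in output &&
    typeof (output as { translated_text: unknown }).translated_text === "string"
  ) {
    return (output as { translated_text: string }).translated_text;
  }
  throw new Error("Unexpected response shape from the translation model");
}
```

Inside an action this would be called as translate(ctx.ai, text, "french"); elsewhere, pass the helper built with createAi({ binding: env.AI }).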
See also
- @lunora/server: the function primitives whose ctx.ai this package backs
- @lunora/bindings: pair embed with ctx.vectors (Vectorize) for RAG