@lunora/ai

AI inference from your functions: provider-agnostic, with Workers AI as the default.

@lunora/ai is a small helper over the Vercel AI SDK and Cloudflare's workers-ai-provider. Call generateText / streamText / generateObject / embed / tool from any function. Workers AI is the zero-config default, but every call is provider-agnostic: pass a Workers AI model id (a string) or any AI SDK model object (@ai-sdk/openai, @ai-sdk/anthropic, OpenRouter, …), so your app is never locked to Workers AI.

pnpm add @lunora/ai

When a function uses AI, the dev server (or lunora prepare) adds the ai binding to wrangler.jsonc for you ({ "ai": { "binding": "AI" } }), and codegen wires a typed ctx.ai onto your action contexts.

ctx.ai in an action

Inference is an external, non-deterministic call — like ctx.fetch, ctx.ai lives on actions, not queries or mutations.

import { action, v } from "@/lunora/_generated/server";
import { generateText } from "@lunora/ai";

export const summarize = action.input({ text: v.string() }).action(async ({ ctx, args: { text } }) => {
    const { text: summary } = await generateText({
        model: ctx.ai.model("@cf/meta/llama-3.3-70b-instruct-fp8-fast"),
        prompt: `Summarize:\n\n${text}`,
    });

    return summary;
});

ctx.ai.model(id) resolves a Workers AI model from the binding. Pass the resolved model to the AI SDK functions re-exported from @lunora/ai (generateText, streamText, generateObject, streamObject, embed, embedMany, tool).

Any provider, same call

A string id resolves to a Workers AI model; an AI SDK model object passes straight through.

import { streamText } from "@lunora/ai";
import { openai } from "@ai-sdk/openai"; // optional, bring-your-own

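// Illustrative conversation; in practice this comes from your caller.
const messages = [{ role: "user" as const, content: "Hello" }];

const result = streamText({ model: openai("gpt-5"), messages });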

Install the provider you want (@ai-sdk/openai, @ai-sdk/anthropic, …) alongside @lunora/ai. To route requests through a Cloudflare AI Gateway, pass gateway to createAi.
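The string-or-object rule described above can be pictured as a small dispatch step. The following is only a sketch of that idea, not the package's actual implementation: ModelObject, workersAiModel, and resolveModel are hypothetical names introduced for illustration, and the real AI SDK model type carries far more than two fields.

```typescript
// Hypothetical shapes for illustration; not the actual @lunora/ai or AI SDK types.
interface ModelObject {
    readonly provider: string;
    readonly modelId: string;
}

type ModelInput = string | ModelObject;

// Stand-in for whatever turns a Workers AI model id into a model object.
function workersAiModel(id: string): ModelObject {
    return { provider: "workers-ai", modelId: id };
}

// Strings are treated as Workers AI ids; objects are returned unchanged.
function resolveModel(input: ModelInput): ModelObject {
    return typeof input === "string" ? workersAiModel(input) : input;
}

// A bare id falls back to the default provider.
const fromId = resolveModel("@cf/meta/llama-3.3-70b-instruct-fp8-fast");

// A model object from another provider is passed through as-is.
const external: ModelObject = { provider: "openai", modelId: "gpt-5" };
const passedThrough = resolveModel(external);
```

Under this reading, switching providers changes only the value passed as model; the surrounding call stays the same.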

Structured output

import { generateObject } from "@lunora/ai";
import { z } from "zod";

const { object } = await generateObject({
    model: ctx.ai.model("@cf/meta/llama-3.3-70b-instruct-fp8-fast"),
    schema: z.object({ sentiment: z.enum(["positive", "neutral", "negative"]) }),
    prompt: review,
});

RAG with @lunora/bindings/vectors

Embed with the AI SDK, store and search with @lunora/bindings/vectors:

import { embed } from "@lunora/ai";

const { embedding } = await embed({
    model: ctx.ai.embeddingModel("@cf/baai/bge-base-en-v1.5"),
    value: text,
});
await ctx.vectors.upsert("docs-body", { id, input: text, embed: async () => embedding });
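For intuition about what the search side of this does with the stored embeddings, here is a minimal in-memory sketch of nearest-neighbour ranking by cosine similarity. It is an assumption-laden illustration: StoredDoc, cosineSimilarity, and nearest are hypothetical helpers, the toy vectors are made up, and the real vector store may use a different metric and an approximate index rather than a linear scan.

```typescript
// Hypothetical record shape; the real store keeps its own representation.
interface StoredDoc {
    id: string;
    embedding: number[];
}

// Cosine similarity: dot product divided by the product of the vector lengths.
function cosineSimilarity(a: readonly number[], b: readonly number[]): number {
    if (a.length !== b.length) {
        throw new Error("Embeddings must have the same dimension");
    }
    let dot = 0;
    let normA = 0;
    let normB = 0;
    a.forEach((x, i) => {
        const y = b[i] ?? 0;
        dot += x * y;
        normA += x * x;
        normB += y * y;
    });
    // A zero vector has no direction; treat it as dissimilar to everything.
    if (normA === 0 || normB === 0) return 0;
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Linear scan: score every document, sort by descending score, keep the top k.
function nearest(query: readonly number[], docs: readonly StoredDoc[], k: number): StoredDoc[] {
    return docs
        .map((doc) => ({ doc, score: cosineSimilarity(query, doc.embedding) }))
        .sort((left, right) => right.score - left.score)
        .slice(0, k)
        .map((entry) => entry.doc);
}

// Toy two-dimensional vectors standing in for real embeddings.
const docs: StoredDoc[] = [
    { id: "a", embedding: [1, 0] },
    { id: "b", embedding: [0, 1] },
    { id: "c", embedding: [1, 1] },
];
const top = nearest([1, 0.1], docs, 2);
```

In a real retrieval flow the query vector would come from the same embedding model used at upsert time, since similarity scores are only meaningful within one embedding space.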

Outside an action

ctx.ai is only wired onto action contexts. In the worker entry, a Durable Object, or a queue / scheduled handler, build the helper directly from the binding:

import { createAi } from "@lunora/ai";

const ai = createAi({ binding: env.AI });

The raw binding escape hatch — ctx.ai.run(model, inputs) (or ai.run(...)) — covers Workers-AI-only model families (image, ASR, translation) that aren't surfaced through the AI SDK provider.
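As a sketch of the escape hatch, the helper below calls a translation model through run and narrows the untyped response. Several things here are assumptions rather than documented behaviour of this package: AiRunner is a hypothetical structural type covering only the run(model, inputs) call named above, translate is an illustrative helper, and the model id and the text / source_lang / target_lang / translated_text fields follow the Workers AI translation model's published shape as best understood, so check the model's own page before relying on them.

```typescript
// Hypothetical minimal type: only the run(model, inputs) call this sketch needs.
interface AiRunner {
    run(model: string, inputs: Record<string, unknown>): Promise<unknown>;
}

// Translate text through the raw run call and validate the response shape.
async function translate(
    ai: AiRunner,
    text: string,
    targetLang: string,
    sourceLang: string = "en",
): Promise<string> {
    const result = await ai.run("@cf/meta/m2m100-1.2b", {
        text,
        source_lang: sourceLang,
        target_lang: targetLang,
    });

    // The raw call is untyped, so check the response before trusting it.
    if (typeof result === "object" && result !== null && "translated_text" in result) {
        const value = (result as { translated_text: unknown }).translated_text;
        if (typeof value === "string") return value;
    }
    throw new Error("Unexpected response shape from translation model");
}
```

Inside an action this might be called as translate(ctx.ai, text, "fr"), assuming ctx.ai satisfies the AiRunner shape; outside an action, the helper returned by createAi would take its place.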

See also