@lunora/ratelimit

Token-bucket, fixed-window, and sliding-window rate limiting as procedure middleware.

@lunora/ratelimit enforces named rate limits over a pluggable store. You build one RateLimiter per app from a config map, then either attach the rateLimit middleware to a procedure's .use(...) chain or call limiter.limit(...) directly. On rejection the middleware throws a structural LunoraError that the runtime maps to 429 Too Many Requests (or 403 Forbidden for a deny-list hit), carrying retryAfter in milliseconds.

import { RateLimiter, rateLimit } from "@lunora/ratelimit";

import { mutation } from "./_generated/server";

const limiter = new RateLimiter({
    config: {
        login: { kind: "fixed window", period: 60_000, rate: 5 },
        send: { kind: "token bucket", period: 1_000, rate: 10, capacity: 20 },
    },
});

export const send = mutation.use(rateLimit(limiter, "send", { key: (ctx) => ctx.auth.userId })).mutation(async ({ ctx }) => {
    // …
});

Like row-level security and data masking, it rides the .use(...) chain and is opt-in per procedure — a bare query/mutation is never rate-limited.

Algorithms

Each named limit declares a kind. Pick by the burst behavior you want:

| kind | Behavior | Pick it when |
| --- | --- | --- |
| "token bucket" | Tokens refill continuously at rate / period per ms up to capacity; a fresh key starts full, so a burst is allowed. | You want smooth throughput that tolerates short bursts (API calls). |
| "fixed window" | rate tokens granted at the start of each window aligned to start + n * period. With capacity > rate, unused tokens roll over. | You want a hard cap per discrete window (e.g. 5 logins per minute). |
| "sliding window" | A weighted estimate blending the current and previous window's counts; always caps at rate per period. | You want fixed-window's simplicity without its boundary-burst spike. |

const limiter = new RateLimiter({
    config: {
        api: { kind: "token bucket", period: 1_000, rate: 10 },
        login: { kind: "fixed window", period: 60_000, rate: 5 },
        search: { kind: "sliding window", period: 10_000, rate: 30 },
    },
});
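
To make the three behaviors concrete, here is a minimal sketch of the arithmetic each algorithm implies. It is an illustration under assumptions, not the package's implementation: refillTokens, windowStart, and slidingEstimate are hypothetical helper names, and the sliding-window formula is the commonly used "previous count weighted by remaining overlap" estimate, which the package may compute differently.

```typescript
// Illustrative helpers only; none of these names are exported by @lunora/ratelimit.

interface BucketState {
    value: number; // tokens available as of `ts`
    ts: number; // epoch ms of the last update
}

/** Token bucket: refill at rate / period tokens per ms, capped at capacity. */
function refillTokens(state: BucketState | undefined, now: number, rate: number, period: number, capacity: number = rate): number {
    if (state === undefined) {
        return capacity; // a fresh key starts full, so an initial burst is allowed
    }
    const elapsed = Math.max(0, now - state.ts);
    return Math.min(capacity, state.value + (elapsed * rate) / period);
}

/** Fixed window: start of the window containing `now`, aligned to start + n * period. */
function windowStart(now: number, period: number, start: number = 0): number {
    return start + Math.floor((now - start) / period) * period;
}

/** Sliding window (assumed formula): weight the previous window by the share of it still in view. */
function slidingEstimate(current: number, previous: number, now: number, period: number, start: number = 0): number {
    const elapsedFraction = (now - windowStart(now, period, start)) / period;
    return current + previous * (1 - elapsedFraction);
}
```

For example, a quarter of the way into a 10-second window with 10 requests so far and 20 in the previous window, the estimate is 10 + 20 × 0.75 = 25, which would still fit under the search limit of 30 above.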

Limit config

Each entry in the config map is a RateLimitConfig:

| Field | Type | Default | Notes |
| --- | --- | --- | --- |
| kind | RateLimitKind | | "token bucket", "fixed window", or "sliding window". Required. |
| rate | number | | Tokens granted per period. Required; must be a positive number. |
| period | number | | Window/refill period in milliseconds. Required; must be a positive number. |
| capacity | number | rate | Rollover ceiling. Caps a token-bucket burst; for fixed windows enables cross-window rollover. Ignored by sliding windows. |
| shards | number | 1 | Split a hot limit across N independent sub-buckets. Positive integer; 1 is equivalent to unset. See Sharding. |
| start | number | 0 | Phase offset in epoch ms for windowed algorithms; windows align to start + n * period. Ignored by token buckets. |

The constructor validates every config up front: a non-positive period or rate, a negative capacity, or a shards value that is not an integer or is less than 1 throws at construction rather than corrupting accounting later.
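
The checks listed above can be sketched as a small standalone validator. This is only an approximation of the documented rules: LimitConfigSketch, findConfigProblems, and validateLimit are hypothetical names, and the error type (RangeError here) and message wording are assumptions, since the page does not say what the constructor actually throws.

```typescript
// Hypothetical mirror of the documented fields; not the package's own type.
interface LimitConfigSketch {
    kind: "token bucket" | "fixed window" | "sliding window";
    rate: number;
    period: number;
    capacity?: number;
    shards?: number;
}

/** Returns a list of problems; an empty list means the config passes. */
function findConfigProblems(name: string, config: LimitConfigSketch): string[] {
    const problems: string[] = [];
    // `!(x > 0)` also rejects NaN, which a plain `x <= 0` check would let through.
    if (!(config.rate > 0)) {
        problems.push(`${name}: rate must be a positive number`);
    }
    if (!(config.period > 0)) {
        problems.push(`${name}: period must be a positive number`);
    }
    if (config.capacity !== undefined && !(config.capacity >= 0)) {
        problems.push(`${name}: capacity must not be negative`);
    }
    if (config.shards !== undefined && (!Number.isInteger(config.shards) || config.shards < 1)) {
        problems.push(`${name}: shards must be an integer >= 1`);
    }
    return problems;
}

/** Throws up front, in the spirit of the constructor's fail-fast validation. */
function validateLimit(name: string, config: LimitConfigSketch): void {
    const problems = findConfigProblems(name, config);
    if (problems.length > 0) {
        throw new RangeError(problems.join("; ")); // error type is an assumption
    }
}
```

Failing at construction means a typo such as shards: 1.5 surfaces at startup instead of as skewed per-shard accounting under load.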

Consuming a limit

limiter.limit(name, args) consumes capacity and returns a RateLimitStatus ({ ok, reason?, retryAfter }). limiter.check(name, args) peeks without consuming. limiter.reset(name, { key }) clears accounting for a (name, key) pair (e.g. on a successful login).

const status = await limiter.limit("send", { key: userId });

if (!status.ok) {
    // status.retryAfter is milliseconds until the request would succeed.
    // status.reason is "rate" or "deny".
}

await limiter.reset("login", { key: userId }); // clear on success

Per-call RateLimitArgs:

| Option | Type | Default | Notes |
| --- | --- | --- | --- |
| key | string | | Sub-key isolating the limit (per user/team/IP). Omit for a global limit. |
| count | number | 1 | Units to consume. Must be a positive integer. |
| reserve | boolean | false | Permit now and reserve future capacity (the stored value goes negative); retryAfter reports when the debt clears. Rejected only when count exceeds capacity. |
| throws | boolean | false | Throw RateLimitError instead of returning a failing status. |
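
The reserve option is easiest to see with numbers. The sketch below models a token bucket's consume step under stated assumptions: consumeTokens and ConsumeResult are hypothetical names, retryAfter is assumed to be the refill time for the shortfall at rate / period tokens per ms, and a count above capacity simply throws here because the page does not describe how the library reports that rejection.

```typescript
// Illustrative model only; not the package's accounting code.
interface ConsumeResult {
    ok: boolean;
    value: number; // stored token count after the call
    retryAfter: number; // ms until the shortfall (or debt) is refilled
}

function consumeTokens(available: number, count: number, rate: number, period: number, capacity: number, reserve: boolean): ConsumeResult {
    if (count > capacity) {
        // Sketch-only handling: a request larger than the bucket can never be satisfied.
        throw new RangeError("count exceeds capacity");
    }
    const remaining = available - count;
    if (remaining >= 0) {
        return { ok: true, value: remaining, retryAfter: 0 };
    }
    // Time to refill the missing tokens at rate / period tokens per ms.
    const msToRefill = Math.ceil((-remaining * period) / rate);
    if (reserve) {
        // Permit now; the stored value goes negative until the debt clears.
        return { ok: true, value: remaining, retryAfter: msToRefill };
    }
    // Reject without consuming anything.
    return { ok: false, value: available, retryAfter: msToRefill };
}
```

With 3 tokens left and a request for 5 at 10 tokens per second, a plain call is rejected with a 200 ms wait, while a reserving call is admitted, leaves the bucket at -2, and reports the same 200 ms as the time until the debt clears.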

Procedure middleware

rateLimit(limiter, name, options) returns a Middleware you attach with .use(...). The first argument is a LimiterResolver — either a fixed RateLimiter or a (ctx) => RateLimiter that binds a context-derived limiter at call time (handy for an ORM-backed store, see Stores).

import { RateLimiter, rateLimit, createDbStore } from "@lunora/ratelimit";

import { mutation } from "./_generated/server";

const config = { send: { kind: "token bucket", period: 1_000, rate: 10 } } as const;

export const send = mutation
    .use(
        rateLimit((ctx) => new RateLimiter({ config, store: createDbStore({ db: ctx.db }) }), "send", {
            key: (ctx) => ctx.auth.userId,
        }),
    )
    .mutation(async ({ ctx, args }) => {
        // …
    });

For the common DB-backed case, dbRateLimit(config, name, options) is shorthand for the resolver above — it builds the per-call RateLimiter + createDbStore for you:

import { dbRateLimit, type RateLimitConfigMap } from "@lunora/ratelimit";

import { mutation } from "./_generated/server";

const config = { send: { kind: "token bucket", period: 1_000, rate: 10 } } satisfies RateLimitConfigMap;

export const send = mutation.use(dbRateLimit(config, "send", { key: (ctx) => ctx.auth.userId ?? "anonymous" })).mutation(async ({ ctx, args }) => {
    // …
});

Pass options.store to point at a non-default backing table/index/key column; the rest of options matches rateLimit's.

RateLimitMiddlewareOptions:

| Option | Type | Default | Notes |
| --- | --- | --- | --- |
| key | (ctx) => string \| undefined | | Sub-key derived from context. Omit for a global limit. |
| count | number | 1 | Units to consume per call. |
| message | string | | Override the error message thrown on rejection. |
| failOpen | boolean | false | Behavior when the limiter itself throws (store unavailable, etc.). See below. |

Failure policy: the middleware fails closed by default — if resolving or invoking the limiter throws, it logs via console.error and rejects with 503. This is the safer default for security-sensitive limits (auth, account creation). Set failOpen: true only when degraded availability is preferable to refusal — a failing limiter then admits every request.
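
The decision table behind this policy can be sketched in a few lines. This is a simplified model, not the middleware itself: enforce, LimitStatus, and Outcome are hypothetical names, the limiter call is shown as a synchronous thunk to keep the example short (the real call is awaited), and logging is left as a comment.

```typescript
// Simplified model of the documented outcomes; names are illustrative.
interface LimitStatus {
    ok: boolean;
    reason?: "rate" | "deny";
    retryAfter: number;
}

type Outcome =
    | { admitted: true }
    | { admitted: false; httpStatus: 403 | 429 | 503; retryAfter?: number };

function enforce(runLimiter: () => LimitStatus, failOpen: boolean = false): Outcome {
    let status: LimitStatus;
    try {
        status = runLimiter();
    } catch (error) {
        // The limiter itself failed (store unavailable, etc.); the middleware logs via console.error here.
        // Fail closed by default; fail open only when explicitly requested.
        return failOpen ? { admitted: true } : { admitted: false, httpStatus: 503 };
    }
    if (status.ok) {
        return { admitted: true };
    }
    // A deny-list hit maps to 403, an exhausted limit to 429.
    return {
        admitted: false,
        httpStatus: status.reason === "deny" ? 403 : 429,
        retryAfter: status.retryAfter,
    };
}
```

Note that failOpen only changes the outcome of the catch branch: a healthy limiter that reports an exhausted limit still rejects with 429.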

Deny list

A denyList passed to the constructor short-circuits before any token accounting: a matching key is denied with reason: "deny" and retryAfter: Infinity (the middleware maps this to 403 Forbidden).

const limiter = new RateLimiter({
    config: { api: { kind: "token bucket", period: 1_000, rate: 10 } },
    denyList: ["198.51.100.7", "abuser@example.com"],
});

The list is consulted as-is. If you also pass a normalize function (applied to every incoming key for case-folding, trimming, or canonicalizing IPs/emails), both the normalized and raw form are checked — but normalize your deny-list entries up front to be safe.
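
A short sketch shows why normalizing entries up front matters. It assumes the lookup works as described above (raw key first, then the normalized key, both compared against the list exactly as given); isDenied and normalizeKey are hypothetical names, and the trim-and-lowercase normalizer is just one example.

```typescript
// Illustrative lookup; not the package's implementation.
function isDenied(key: string, denyList: ReadonlySet<string>, normalize?: (key: string) => string): boolean {
    if (denyList.has(key)) {
        return true; // raw form matches
    }
    // The normalized form is compared against the list as given.
    return normalize !== undefined && denyList.has(normalize(key));
}

// Example normalizer: trim whitespace and case-fold.
const normalizeKey = (key: string): string => key.trim().toLowerCase();

// The same entry, stored as typed versus normalized ahead of time.
const rawList = new Set(["Abuser@Example.com"]);
const normalizedList = new Set(Array.from(rawList, normalizeKey));
```

With rawList, the incoming key "ABUSER@example.com" slips through: neither its raw form nor its normalized form "abuser@example.com" equals the mixed-case entry. With normalizedList the normalized form matches and the key is denied.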

Sharding

For a high-volume limit where a single hot key contends on one bucket/Durable Object, set shards: N. The limiter splits the limit into N independent sub-buckets, each enforcing rate / shards (and capacity / shards). A request is routed to a shard by a deterministic hash of (name, key), so the same key always lands on the same shard — its effective throughput is exactly rate / shards, while aggregate throughput across many distinct keys approaches rate as keys spread uniformly.

const limiter = new RateLimiter({
    config: {
        ingest: { kind: "token bucket", period: 1_000, rate: 10_000, shards: 16 },
    },
});

Reserve sharding for limits where contention actually bites; leave it unset (one bucket) otherwise.
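
The routing described above can be sketched as follows. The page only says the hash is deterministic over (name, key); the FNV-1a-style hash over UTF-16 code units used here is a stand-in chosen for brevity, and hashString, shardFor, and perShardRate are hypothetical names.

```typescript
// Stand-in hash: FNV-1a-style over UTF-16 code units. The package's actual hash is not documented here.
function hashString(input: string): number {
    let hash = 0x811c9dc5; // FNV offset basis (32-bit)
    for (let i = 0; i < input.length; i++) {
        hash ^= input.charCodeAt(i);
        hash = Math.imul(hash, 0x01000193); // FNV prime, 32-bit multiply
    }
    return hash >>> 0; // force an unsigned 32-bit result
}

/** Same (name, key) always maps to the same shard index in [0, shards). */
function shardFor(name: string, key: string, shards: number): number {
    // A NUL separator keeps ("ab", "c") and ("a", "bc") from hashing identically.
    return hashString(`${name}\u0000${key}`) % shards;
}

/** Each sub-bucket enforces an equal slice of the configured rate. */
function perShardRate(rate: number, shards: number): number {
    return rate / shards;
}
```

For the ingest limit above, perShardRate(10_000, 16) is 625, so a single hot key is capped at 625 per second while many distinct keys together can approach the full 10,000.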

Pluggable stores

Persistence is a RateLimitStore (get/set/delete, sync or async). Three are shipped:

| Factory | Backing | Use when |
| --- | --- | --- |
| createMemoryStore() | An in-process Map. The default when no store is given. | Single-DO/process limits where state needn't survive eviction. |
| createSqlStore({ sql }) | state.storage.sql (workerd SqlStorage, also node:sqlite). | Durable per-DO state that survives hibernation, when you have raw SQL. |
| createDbStore({ db }) | A Lunora table via ctx.db (the ORM writer on a mutation/action). | Durable per-DO limits inside a procedure, where there is no raw SQL. |

createSqlStore creates its table (default _lunora_rate_limits) if missing. createDbStore expects a table you declare in your schema with a key column and its index (defaults: table rateLimits, index by_key, key column key):

import { defineTable, v } from "lunorash/server";

export const rateLimits = defineTable({
    key: v.string(),
    ts: v.number(),
    value: v.number(),
    prev: v.optional(v.number()),
}).index("by_key", ["key"]);

Inside a Durable Object the input gate serializes the limiter's read-modify-write against storage, so both SQL- and ORM-backed stores are atomic against concurrent calls without an explicit transaction. Driving createSqlStore outside a DO, you own serialization yourself.
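
For a sense of what a store has to do, here is a minimal Map-backed sketch of the get/set/delete contract. The exact RateLimitStore signature is not shown on this page, so SketchStore and StoredBucket are hypothetical shapes: the fields are borrowed from the table columns above, and the methods are synchronous for brevity even though real stores may be async.

```typescript
// Hypothetical shapes; the real RateLimitStore interface may differ.
interface StoredBucket {
    value: number; // current token or count value
    ts: number; // epoch ms of the last update
    prev?: number; // previous-window count, assumed to be for sliding windows
}

interface SketchStore {
    get(key: string): StoredBucket | undefined;
    set(key: string, bucket: StoredBucket): void;
    delete(key: string): void;
}

function createMapStore(): SketchStore {
    const rows = new Map<string, StoredBucket>();
    return {
        // Copy on the way in and out so callers cannot mutate stored state by accident.
        get: (key) => {
            const row = rows.get(key);
            return row === undefined ? undefined : { ...row };
        },
        set: (key, bucket) => {
            rows.set(key, { ...bucket });
        },
        delete: (key) => {
            rows.delete(key);
        },
    };
}
```

Like the in-process default, a store of this kind loses its state whenever the process or Durable Object is evicted; it offers none of the durability of the SQL- or ORM-backed stores.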

Plugin form

ratelimitPlugin(limiter) packages the limiter as a first-party Plugin that exposes the resolved RateLimiter under ctx.api.ratelimit, so a handler can call limit()/check()/reset() programmatically instead of (or alongside) the enforcing middleware. Install its middleware with one .use(...):

import { RateLimiter, ratelimitPlugin } from "@lunora/ratelimit";

import { mutation } from "./_generated/server";

const limiter = new RateLimiter({
    config: { send: { kind: "token bucket", period: 60_000, rate: 5, capacity: 5 } },
});

export const send = mutation.use(ratelimitPlugin(limiter).middleware!).mutation(async ({ ctx }) => {
    const status = await ctx.api.ratelimit.limit("send", { key: ctx.auth.userId });

    if (!status.ok) {
        throw new Error("slow down");
    }
    // …
});

The plugin ships no schema extension — persistence is whatever store the limiter was built with — so it is middleware-only and skipped by the plugin schema fold.
