Rate limits.
Aria enforces a sliding-window rate limit per user on the live endpoint and per IP address on the demo endpoint. Both return a Retry-After header when a request is rejected.
Policy
| Surface | Key | Window | Limit |
|---|---|---|---|
| Live chat POST /api/chat | aria:<userId> | 60 seconds sliding | 60 requests |
| Demo chat POST /api/chat/demo | aria:demo:<ip> | 1 hour sliding | 15 requests |
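
To make the policy concrete, here is a minimal sketch of a sliding-window limiter that keeps a log of accepted request timestamps per key. It is an illustration under stated assumptions, not Aria's actual implementation: the class name `SlidingWindowLimiter`, the `Decision` type, and the one-second minimum for the retry hint are all hypothetical.

```typescript
// Hypothetical sliding-window (timestamp log) limiter; names are illustrative only.
type Decision = { allowed: boolean; retryAfterSeconds: number };

class SlidingWindowLimiter {
  // key -> timestamps (ms) of accepted requests, oldest first
  private hits = new Map<string, number[]>();
  private limit: number;
  private windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  check(key: string, nowMs: number = Date.now()): Decision {
    const cutoff = nowMs - this.windowMs;
    // Keep only the requests that are still inside the window.
    const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff);
    if (recent.length >= this.limit) {
      this.hits.set(key, recent);
      // Capacity frees up when the oldest logged request ages out.
      const waitMs = recent[0] + this.windowMs - nowMs;
      return { allowed: false, retryAfterSeconds: Math.max(1, Math.ceil(waitMs / 1000)) };
    }
    recent.push(nowMs);
    this.hits.set(key, recent);
    return { allowed: true, retryAfterSeconds: 0 };
  }
}

// The two policies from the table, expressed with this sketch.
const liveLimiter = new SlidingWindowLimiter(60, 60_000); // key: aria:<userId>
const demoLimiter = new SlidingWindowLimiter(15, 3_600_000); // key: aria:demo:<ip>
```

Rejected requests are not logged in this sketch, so a client that keeps retrying too early does not extend its own wait.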
Backing store
When UPSTASH_REDIS_REST_URL and UPSTASH_REDIS_REST_TOKEN are set, the limiter uses Upstash Redis and is safe across multi-instance deployments. Without Upstash, the limiter falls back to an in-process map — fine for development, not for production.
Single-instance fallback
The in-memory limiter resets on every process restart and does not synchronize across server replicas. Configure Upstash before any multi-replica production cutover; otherwise a user can exceed the stated limit by having requests routed to different instances.
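
One way to avoid shipping the fallback by accident is a startup guard. The sketch below mirrors the selection rule described above; the function names and the use of `NODE_ENV` to detect production are assumptions for illustration, not part of Aria's codebase.

```typescript
// Hypothetical startup guard; only the two UPSTASH_* variable names come from the docs.
type Backend = "upstash" | "memory";
type Env = Record<string, string | undefined>;

function chooseBackend(env: Env): Backend {
  // Both variables must be set and non-empty for the Redis-backed limiter.
  const hasUpstash =
    Boolean(env.UPSTASH_REDIS_REST_URL) && Boolean(env.UPSTASH_REDIS_REST_TOKEN);
  return hasUpstash ? "upstash" : "memory";
}

function assertProductionSafe(env: Env): void {
  // Assumption: NODE_ENV === "production" marks a production deployment.
  if (env.NODE_ENV === "production" && chooseBackend(env) === "memory") {
    throw new Error(
      "Rate limiter would fall back to the in-process map; set UPSTASH_REDIS_REST_URL and UPSTASH_REDIS_REST_TOKEN.",
    );
  }
}
```

Calling `assertProductionSafe(process.env)` during boot would turn a silent fallback into an explicit failure.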
On rejection
The server responds with 429 RATE_LIMITED and either a plain-text body (chat surfaces) or the structured error envelope (retrieval surfaces, planned). Both include Retry-After in seconds.
HTTP/1.1 429 Too Many Requests
Content-Type: text/plain; charset=utf-8
Retry-After: 18
X-Request-Id: req_01HW…

You're sending messages a little too fast. Give Aria 18s and try again.

Planned rate-limit headers
A consistent header set is planned on every rate-limit-sensitive response — not only on 429. Until the rollout lands, Retry-After is only present on 429 responses.
- X-RateLimit-Limit: total allowed in the current window.
- X-RateLimit-Remaining: remaining capacity.
- X-RateLimit-Reset: seconds until the window clears or the oldest request ages out.
- Retry-After: seconds to wait before the next request. Set only on 429 today; will become authoritative on all 429 and 503 responses.
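
Because the X-RateLimit-* headers are planned rather than shipped, a client that reads them should tolerate their absence. The following is a minimal sketch of such a reader; `RateLimitInfo`, `parseNonNegative`, and `readRateLimit` are hypothetical names, and the sketch handles only the seconds form of Retry-After described on this page.

```typescript
// Hypothetical client-side reader; every field may be null until the rollout lands.
type RateLimitInfo = {
  limit: number | null;
  remaining: number | null;
  resetSeconds: number | null;
  retryAfterSeconds: number | null;
};

function parseNonNegative(value: string | null): number | null {
  // Missing or blank headers are reported as null rather than 0.
  if (value === null || value.trim() === "") return null;
  const n = Number(value);
  return Number.isFinite(n) && n >= 0 ? n : null;
}

function readRateLimit(headers: Headers): RateLimitInfo {
  // Headers.get is case-insensitive, so the exact casing sent by the server does not matter.
  return {
    limit: parseNonNegative(headers.get("X-RateLimit-Limit")),
    remaining: parseNonNegative(headers.get("X-RateLimit-Remaining")),
    resetSeconds: parseNonNegative(headers.get("X-RateLimit-Reset")),
    retryAfterSeconds: parseNonNegative(headers.get("Retry-After")),
  };
}
```

A UI could use `remaining === 0` as a hint to pause sending, while treating `null` as "no information" rather than as exhausted capacity.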
Client patterns
- Back off on 429. Honor Retry-After. Do not retry sooner.
- Back off on 503. Use exponential backoff (1s, 2s, 4s, 8s, cap at 30s) on UPSTREAM_OVERLOADED. Stop after five attempts.
- Do not retry 4xx. Aside from 429, 4xx errors reflect structural problems that retries cannot fix.
- Pace UI interactions. A chat UI should disable send while a turn is in flight — parallel turns from the same user will trip the limiter.
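
The retry rules above can be sketched as a small client helper. This is an illustration, not an official client: the names `retryDelayMs` and `sendWithRetry` are hypothetical, the sketch keys on the HTTP status alone rather than inspecting the UPSTREAM_OVERLOADED code, and it makes two assumptions the page does not state, namely a one-second wait when Retry-After is missing or unparseable and the same five-attempt cap for 429 as for 503.

```typescript
// Hypothetical retry helper implementing the client patterns above.
const MAX_ATTEMPTS = 5;

// Returns the delay before the next attempt, or null when the caller should stop.
// `attempt` is the 1-based number of attempts already made.
function retryDelayMs(status: number, attempt: number, retryAfter: string | null): number | null {
  if (attempt >= MAX_ATTEMPTS) return null;
  if (status === 429) {
    const blank = retryAfter === null || retryAfter.trim() === "";
    const seconds = blank ? NaN : Number(retryAfter);
    // Assumption: wait 1s if the header is missing or not a non-negative number.
    return (Number.isFinite(seconds) && seconds >= 0 ? seconds : 1) * 1000;
  }
  if (status === 503) {
    // 1s, 2s, 4s, 8s, capped at 30s.
    return Math.min(2 ** (attempt - 1), 30) * 1000;
  }
  return null; // other statuses, including the remaining 4xx codes, are not retried
}

async function sendWithRetry(
  send: () => Promise<Response>,
  sleep: (ms: number) => Promise<void> = (ms) => new Promise<void>((r) => setTimeout(r, ms)),
): Promise<Response> {
  for (let attempt = 1; ; attempt++) {
    const res = await send();
    if (res.ok) return res;
    const delay = retryDelayMs(res.status, attempt, res.headers.get("Retry-After"));
    if (delay === null) return res; // hand the final response back to the caller
    await sleep(delay);
  }
}
```

A caller might wrap a chat request as `sendWithRetry(() => fetch("/api/chat", { method: "POST", body }))`, keeping the send button disabled until the returned promise settles.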
See also: Errors · Authentication.