Live chat.
Authenticated streaming chat endpoint. Persists every turn to Postgres, composes the 8-layer system prompt, and returns the conversation ID on a response header.
Contract
POST /api/chat (Stable)

Stream a single assistant turn conditioned on the caller's organization, user profile, and engagement state.
Headers
- Content-Type (required): Request body must be JSON.
- Cookie (required): NextAuth session cookie for the signed-in user. Browsers include this automatically after sign-in.
- X-Request-Id (optional): Any client-provided request ID is echoed into logs; otherwise the server issues a ULID and returns it on the response.
Request body
- messages (required): Non-empty array of Anthropic message parameters. The final element must have role "user". Prior turns may be included for client-side replay; the server does not reconstruct history from the database today.
- conversationId (optional): When provided and valid (owned by the caller), the turn is appended to that conversation. When omitted or invalid, the server creates a new conversation and returns its ID on the X-Conversation-Id response header.
Example request
curl -N -X POST https://app.corvanahq.com/api/chat \
-H "Content-Type: application/json" \
-H "Cookie: __Secure-authjs.session-token=…" \
-d '{
"messages": [
{ "role": "user", "content": "What is our largest unresolved lever?" }
]
}'

Example response

200 OK — streamed

Hello Margaret. The largest unresolved lever you surfaced last week was the sales-to-ops handoff …
The second ranked lever is the Manufacturing Ops briefing cadence …
Error codes
| Code | Status | Description |
|---|---|---|
| INVALID_JSON | 400 | Request body could not be parsed. |
| MISSING_MESSAGES | 400 | `messages` missing, empty, or malformed. |
| LAST_MESSAGE_NOT_USER | 400 | Final message must have role "user". |
| UNAUTHENTICATED | 401 | No valid session cookie. |
| NOT_ONBOARDED | 403 | Signed in but onboarding incomplete. |
| RATE_LIMITED | 429 | 60 requests per 60-second sliding window per user. |
| UPSTREAM_OVERLOADED | 503 | Anthropic overloaded — retry with backoff. |
| UPSTREAM_AUTH | 500 | Model-provider auth misconfigured; page on-call. |
| UPSTREAM_ERROR | 502 | Other model-side error. |
| INTERNAL | 500 | Uncaught; logged with the request ID. |
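Only 429 and 503 are worth retrying automatically. A client-side retry policy for these codes might be sketched as follows; `shouldRetry` and `backoffMs` are hypothetical helpers, not part of the API.

```typescript
// Illustrative retry policy for the retryable statuses above.
const RETRYABLE = new Set([429, 503]); // RATE_LIMITED, UPSTREAM_OVERLOADED

function shouldRetry(status: number, attempt: number, maxAttempts = 4): boolean {
  return RETRYABLE.has(status) && attempt < maxAttempts;
}

// Exponential backoff with full jitter; honor Retry-After when the server sends it.
function backoffMs(attempt: number, retryAfterSeconds?: number): number {
  if (retryAfterSeconds !== undefined) return retryAfterSeconds * 1000;
  const cap = 30_000;
  const base = Math.min(cap, 1000 * 2 ** attempt);
  return Math.floor(Math.random() * base);
}
```

400-level validation errors and UPSTREAM_AUTH are not retryable: resending the same request will fail the same way.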
Response headers
Every successful response sets the following headers:
- Content-Type: text/plain; charset=utf-8
- Cache-Control: no-cache, no-transform — proxies must not buffer or transform.
- X-Accel-Buffering: no — disables intermediate proxy buffering.
- X-Conversation-Id: <cuid> — the ID of the conversation this turn was appended to.
- X-Prompt-Version: 1.0.0 — the composed system-prompt schema version.
- X-Request-Id: <ulid> — echoed or issued.
Current streaming shape
The current implementation streams the assistant reply as raw text chunks, one per model text_delta. There is no SSE framing. Errors that occur mid-stream are appended as plain text and the connection closes normally. Clients should render chunks as they arrive and treat a non-2xx status as a terminal error.
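A consumer of the plain-text stream only needs to decode and append chunks as they arrive. The sketch below assumes a WHATWG `ReadableStream<Uint8Array>` such as `response.body` from `fetch` (Node 18+ or any modern browser); `readTextStream` is an illustrative helper. The decoder's streaming mode guards against UTF-8 code points split across chunk boundaries.

```typescript
// Minimal consumer for the current plain-text streaming shape.
async function readTextStream(
  stream: ReadableStream<Uint8Array>,
  onChunk: (text: string) => void,
): Promise<string> {
  const decoder = new TextDecoder("utf-8");
  const reader = stream.getReader();
  let full = "";
  for (;;) {
    const { done, value } = await reader.read();
    if (done || value === undefined) break;
    // stream: true buffers incomplete multi-byte sequences until the next chunk
    const text = decoder.decode(value, { stream: true });
    if (text) onChunk(text); // render incrementally
    full += text;
  }
  full += decoder.decode(); // flush any trailing bytes
  return full;
}
```

Because mid-stream errors arrive as plain text on a 200 response, callers cannot distinguish them programmatically today; that is one motivation for the SSE upgrade below.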
data frame (chunked, text/plain):
chunk 1 "Hello Margaret. The largest "
chunk 2 "unresolved lever you surfaced last "
chunk 3 "week was the sales-to-ops handoff …"
…

SSE upgrade (v1.1)
A typed SSE upgrade is on the near-term roadmap. The body will be served as text/event-stream with text, meta, and done events. The upgrade is additive — clients that continue to read plain-text chunks will keep working during the migration window.
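Since the upgrade has not shipped, any client code is necessarily speculative. Assuming the standard SSE wire format with the planned meta, text, and done event names, a parser for a buffered body (illustrated by the frames below) might look like:

```typescript
// Speculative parser for the planned v1.1 SSE framing (not yet shipped).
type SseEvent = { event: string; data: any };

function parseSseFrames(body: string): SseEvent[] {
  const events: SseEvent[] = [];
  // Frames are separated by a blank line per the SSE spec.
  for (const frame of body.split(/\n\n+/)) {
    let event = "message"; // SSE default when no event: field is present
    const dataLines: string[] = [];
    for (const line of frame.split("\n")) {
      if (line.startsWith("event:")) event = line.slice(6).trim();
      else if (line.startsWith("data:")) dataLines.push(line.slice(5).trim());
    }
    if (dataLines.length > 0) {
      events.push({ event, data: JSON.parse(dataLines.join("\n")) });
    }
  }
  return events;
}
```

A production client would parse incrementally from the stream rather than buffering the whole body; this buffered form only shows the frame grammar.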
event: meta
data: {"conversationId":"cmvx…","promptVersion":"1.0.0","requestId":"req_01HW…"}
event: text
data: {"delta":"Hello Margaret."}
event: text
data: {"delta":" The largest unresolved lever …"}
event: done
data: {"stopReason":"end_turn","usage":{"inputTokens":1820,"outputTokens":342,"cacheReadTokens":1600,"cacheCreateTokens":0}}

Persistence side effects
The server writes two rows to Postgres per turn:
- On request receipt — the user message is written with `role: "USER"`, `orgId`, and `userId`. The conversation is created if `conversationId` is absent or invalid.
- On stream close — the assistant message is written with `role: "ASSISTANT"`, the full token accounting (`inputTokens`, `outputTokens`, `cacheReadTokens`, `cacheCreateTokens`), `stopReason`, `model`, and `promptVersion`. `Conversation.lastMessageAt` is bumped.
Every row is scoped by orgId — multi-tenant isolation is enforced at the query layer, not by application-level checks.
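The shape of the two writes and the org-scoped reads can be sketched against an in-memory store; the real implementation uses Postgres, and the helper names here (`writeUserTurn`, `messagesFor`) are hypothetical.

```typescript
// Illustrative sketch of the per-turn writes and org-scoped reads.
type Row = {
  conversationId: string;
  orgId: string;
  userId: string;
  role: "USER" | "ASSISTANT";
  content: string;
};

const rows: Row[] = [];

// Written on request receipt.
function writeUserTurn(orgId: string, userId: string, conversationId: string, content: string) {
  rows.push({ conversationId, orgId, userId, role: "USER", content });
}

// Written on stream close (token accounting omitted for brevity).
function writeAssistantTurn(orgId: string, userId: string, conversationId: string, content: string) {
  rows.push({ conversationId, orgId, userId, role: "ASSISTANT", content });
}

// Every read is scoped by orgId, mirroring query-layer tenant isolation.
function messagesFor(orgId: string, conversationId: string): Row[] {
  return rows.filter((r) => r.orgId === orgId && r.conversationId === conversationId);
}
```

The point of the last function is that a conversation ID alone never retrieves rows; a mismatched `orgId` simply returns nothing.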
Engagement state
Before the model call, the server loads the caller's EngagementState row (if one exists) and composes it into layer L6 of the 8-layer system prompt. After the stream closes, a planned post-turn reflection job will promote or archive hypotheses, record new findings, and advance the week counter. The reflection job is not yet shipped; today, engagement state is read-only from the client's perspective and is mutated only by direct writes.
Rate limits
Live chat is rate-limited to 60 requests per 60-second sliding window per user. The limit is enforced against the userId derived from the session, so it survives client IP rotation. On rejection the server responds 429 with a plain-text body and a Retry-After header. See Rate limits for the full policy.
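A sliding-window limiter matching the documented policy (60 requests per 60 s, keyed by `userId`) can be sketched as below. The clock is injected for testability; the actual enforcement mechanism and storage are not specified here, so treat this as a model of the behavior, not the implementation.

```typescript
// Sketch of the documented per-user sliding-window rate limit.
class SlidingWindowLimiter {
  private hits = new Map<string, number[]>();

  constructor(
    private limit = 60,
    private windowMs = 60_000,
    private now: () => number = Date.now,
  ) {}

  // Returns true if the request is allowed, false if it should get a 429.
  allow(userId: string): boolean {
    const t = this.now();
    // Keep only timestamps inside the trailing window.
    const recent = (this.hits.get(userId) ?? []).filter((ts) => t - ts < this.windowMs);
    if (recent.length >= this.limit) {
      this.hits.set(userId, recent);
      return false;
    }
    recent.push(t);
    this.hits.set(userId, recent);
    return true;
  }
}
```

Keying by `userId` rather than IP means the limit follows the session across networks, which is why client IP rotation does not reset it.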
Next: Demo chat (unauthenticated public endpoint) or Errors.