Streaming

Amy streams every step of a turn (routing, agent thoughts, validator verdicts, and the synthesised answer) over Server-Sent Events: one HTTP connection, one durable replay buffer, no WebSockets required.

The CLI's "watch Amy think" UX exists because every client subscribes to the same SSE channel. This page is the protocol-level reference: the wire format, every event type, reconnects, and how to consume from curl, the browser, React Native, and Swift.

When to use streaming
Connection setup
The wire format
Event catalog
Reconnects and replay
Backpressure
Sample raw transcript
Consuming the stream
Common mistakes

When to use streaming

Subscribe to /events whenever a human is waiting on the answer. Amy turns run for 30s-7min depending on routing; a black-box wait erodes trust. Streaming lets you render:

Routing, so the user knows which specialists are working.
Progress phases, so the answer feels alive as the agents work.
Validator verdicts, so the user sees that numbers were checked (this is load-bearing for trust; see the README's Validator section).
The final synthesis, token by token.

When not to stream: batch jobs, retries, cron-driven workflows. For these, poll GET /v1/turns/:id until status is completed (the result is then populated). Every POST /v1/turns runs asynchronously today — a synchronous blocking mode ("stream": false) is planned but not yet wired. See Turns.
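For the non-streaming path, a polling loop might look like the sketch below. It is a minimal illustration, not SDK code: the `TurnSnapshot` shape, the `failed` terminal status, the 2-second interval, and the `pollTurn` / `fetchTurn` names are all assumptions. Only `status`, `completed`, and `result` come from the text above. The fetcher is injected so the loop does not depend on a particular HTTP client.

```typescript
// Hypothetical minimal Turn shape; only `status` and `result` are named in the docs.
interface TurnSnapshot {
  status: string;
  result?: unknown;
  error?: unknown;
}

type FetchTurn = (turnId: string) => Promise<TurnSnapshot>;

// Poll until the turn reaches a terminal status or the attempt budget runs out.
// Treating "failed" as terminal is an assumption.
async function pollTurn(
  turnId: string,
  fetchTurn: FetchTurn,
  intervalMs = 2_000,
  maxAttempts = 300,
): Promise<TurnSnapshot> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const turn = await fetchTurn(turnId);
    if (turn.status === "completed" || turn.status === "failed") return turn;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`turn ${turnId} did not finish after ${maxAttempts} polls`);
}
```

In practice `fetchTurn` would wrap a GET /v1/turns/:id request carrying the same bearer token as the rest of the API.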

Connection setup

Streaming is one endpoint:

GET /v1/turns/:id/events

Required headers

GET /v1/turns/turn_01HX2K3M4N5P6Q7R8S9T0V1W2X/events HTTP/1.1
Host: amy.heyamy.xyz
Accept: text/event-stream
Authorization: Bearer <clerk-jwt-or-amy-cli-jwt>

Header	Required?	Notes
`Accept`	yes	Must include `text/event-stream`. Sending `application/json` returns the snapshot Turn object instead.
`Authorization`	yes	Same bearer token as the rest of the API.
`Last-Event-Id`	optional	On reconnect, set this to the last `id:` you observed. See Reconnects and replay.

Response headers

HTTP/1.1 200 OK
Content-Type: text/event-stream
Cache-Control: no-cache
Connection: keep-alive
X-Request-Id: req_01HX2K3M4N5P6Q7R8S9T0V1W2X

X-Request-Id is per-connection (not per-event). Include it when reporting stuck streams.

You can open the stream before, during, or after the turn. Even the stream for an already-completed turn replays from the buffer (within the 1-hour window; see Reconnects and replay).

Opening twice in parallel is supported but pointless: each connection gets the same events.

The wire format

Standard SSE per the WHATWG spec. Each event is three lines:

event: <event-type>
id: <monotonic-integer>
data: <single-line JSON>

(blank line ends the event)

Constraints Amy enforces:

id is a monotonically-increasing integer per turn, starting at 1. Gaps are normal — id tracks an internal event sequence, and not every sequence number maps to a wire frame. The terminal (turn.completed / turn.failed) frame carries a large sentinel id (9007199254740991), so don't assume the last id is small or contiguous. Use whatever id you last saw for Last-Event-Id on reconnect.
event is one of the types in the Event catalog.
data is always one line of JSON. We never split a JSON object across multiple data: lines, so consumers can JSON.parse(line) directly.
Comments (: lines) appear in three places: : connected is sent immediately to force Cloudflare's edge proxy to flush the response headers; : heartbeat <timestamp> is sent every ~5s while the workflow is idle so dropped connections surface quickly; : stream-end is sent right before the server closes. Drop them silently — that's what the SSE spec mandates.

The stream carries no retry directive (retry:). Clients implement their own reconnect with Last-Event-Id. The SDK does not auto-retry in v1: it surfaces stream_closed_unexpectedly on disconnect and the caller decides what to do (see Reconnects and replay).
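The framing rules above are enough to write a small parser by hand. The sketch below parses an already-received text buffer into frames, skipping comment lines. The `SseFrame` type and the `parseSse` name are hypothetical, and the code relies on the constraint stated above that each event has exactly one `data:` line containing JSON.

```typescript
interface SseFrame {
  event: string;
  id: string | null;
  data: unknown;
}

// Parse a complete SSE text buffer into frames.
// Assumes one `data:` line per event with a JSON payload, as described above.
function parseSse(text: string): SseFrame[] {
  const frames: SseFrame[] = [];
  // Events are separated by a blank line; tolerate CRLF line endings.
  for (const block of text.replace(/\r\n/g, "\n").split("\n\n")) {
    let event = "message"; // SSE default when no `event:` field is present
    let id: string | null = null;
    let data: string | null = null;
    for (const line of block.split("\n")) {
      // Skip blanks and comments (connected, heartbeat, stream-end).
      if (line === "" || line.charAt(0) === ":") continue;
      const colon = line.indexOf(":");
      const field = colon === -1 ? line : line.slice(0, colon);
      let value = colon === -1 ? "" : line.slice(colon + 1);
      if (value.charAt(0) === " ") value = value.slice(1); // SSE strips one leading space
      if (field === "event") event = value;
      else if (field === "id") id = value;
      else if (field === "data") data = value;
    }
    if (data !== null) frames.push({ event, id, data: JSON.parse(data) });
  }
  return frames;
}
```

A live reader would also need to hold back the trailing partial block between network chunks; this sketch only covers the framing itself.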

Event catalog

Every event is one of these types. Listed in roughly the order they appear in a typical turn.

`turn.started`

Fires once, when the workflow's first step begins.

{
  "turn_id": "turn_01HX2K3M4N5P6Q7R8S9T0V1W2X",
  "at": "2026-05-25T10:00:00.123Z"
}

Field	Type	Notes
`turn_id`	`string`	The turn this event belongs to. Same as the URL path.
`at`	ISO-8601	When step 1 actually began (not when the POST landed).

`phase`

The cheapest, most-used event. Fires whenever a step starts, makes progress, or moves on. Render these as a stack of progress markers in your UI.

{
  "agent": "orchestrator",
  "phase": "vagueness = low",
  "detail": "specific — direct routing"
}

Field	Type	Notes
`agent`	enum	`orchestrator` · `Data Science Agent` · `Domain Expert Agent` · `Health Coach Agent` · `Hypothesis Investigator` · `validator` · `memory`.
`phase`	`string`	Short label, e.g. `"classifying query vagueness"`, `"executing python (17 lines)"`.
`detail`	`string?`	Optional sub-text rendered under the phase.

`routing`

Fires once after the orchestrator picks the collaboration pattern. The payload tells you which agents will run.

{
  "decision": {
    "main_agent": "Data Science Agent",
    "supporting_agents": "Domain Expert Agent",
    "collaboration_workflow": "DS computes the user's average HRV…",
    "investigator": false,
    "vagueness": "low"
  }
}

`rephrase`

Fires once after the orchestrator decomposes the user's query into per-agent sub-questions.

{
  "main": { "agent": "Data Science Agent", "question": "What is my average HRV…" },
  "supporting": {
    "Data Science Agent": "Compute the user's average HRV…",
    "Domain Expert Agent": "Given a user's computed average HRV…"
  }
}

`agent_start`

Fires when each sub-agent begins. Supporting agents run sequentially in v1, so you typically see one agent_start matched by its agent_end before the next pair begins.

{
  "agent": "Data Science Agent",
  "question": "Compute the user's average HRV…"
}

Field	Type	Notes
`agent`	enum	Same set as `phase`.
`question`	`string`	The rephrased sub-question for this agent (not the user's original).

`agent_end`

Fires when each sub-agent finishes.

{
  "agent": "Data Science Agent",
  "cost_usd": 0.0895,
  "duration_ms": 98400,
  "preview": "Average HRV 47.2 ms over the last 30 days; trend slightly upward…"
}

Field	Type	Notes
`agent`	enum	Same set as `phase`.
`cost_usd`	`number`	LLM spend for this agent only.
`duration_ms`	`number`	Wall time for this agent.
`preview`	`string?`	First ~160 chars of the agent's output for log display. Not load-bearing.

`cost`

Granular spend events. One per LLM call charged to the turn. Sum these to derive result.cost_usd (or just read result.cost_usd after completion).

{ "agent": "Domain Expert Agent", "usd": 0.034, "reason": "react_step" }
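If you want a running spend total before the turn completes, a client-side accumulator along these lines may help. It is a sketch: `CostEvent` mirrors the payload shown above, while `totalCost` and the choice to sum in integer micro-dollars (to limit floating-point drift) are assumptions rather than anything the server is documented to do.

```typescript
interface CostEvent {
  agent: string;
  usd: number;
  reason: string;
}

// Sum `cost` events overall and per agent, using integer micro-dollars internally.
function totalCost(events: CostEvent[]): { total: number; byAgent: Record<string, number> } {
  const microByAgent: Record<string, number> = {};
  for (const e of events) {
    microByAgent[e.agent] = (microByAgent[e.agent] ?? 0) + Math.round(e.usd * 1e6);
  }
  const byAgent: Record<string, number> = {};
  let totalMicro = 0;
  for (const agent of Object.keys(microByAgent)) {
    const micro = microByAgent[agent] ?? 0;
    byAgent[agent] = micro / 1e6;
    totalMicro += micro;
  }
  return { total: totalMicro / 1e6, byAgent };
}
```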

`validation_start` / `validation_end`

Fires once per finding entering / leaving the validator. The deterministic gates (sample size, effect-vs-noise, bootstrap, subgroup consistency, method triangulation, discriminative power, construct validity) and the LLM Critic both contribute to the verdict.

// validation_start
{ "finding_id": "ds-001", "claim": "average HRV over the last 30 days" }

// validation_end
{
  "finding_id": "ds-001",
  "verdict": "validated",
  "gates_passed": 6,
  "gates_total": 7,
  "duration_ms": 8400,
  "reason": "All hard gates passed; Critic verdict=accept."
}

Field (validation_end)	Type	Notes
`verdict`	enum	`validated` · `conditional` · `rejected`. Rejected findings do not enter the fact sheet.
`gates_passed` / `gates_total`	`number`	How many deterministic gates passed.
`reason`	`string`	One-sentence summary, safe to surface in trace UIs.

These events only fire on turns that route through the validator — direct-data lookups and conversational fallbacks skip it.
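One way to surface these events in a UI is to pair each validation_start with its validation_end by finding_id. The tracker below is an illustrative sketch: the class and method names are hypothetical, and the field types are inferred from the example payloads above.

```typescript
type Verdict = "validated" | "conditional" | "rejected";

interface ValidationStart {
  finding_id: string;
  claim: string;
}

interface ValidationEnd {
  finding_id: string;
  verdict: Verdict;
  gates_passed: number;
  gates_total: number;
  duration_ms: number;
  reason: string;
}

interface CheckedFinding extends ValidationEnd {
  claim: string | null; // null if the matching validation_start was never seen
}

class ValidationTracker {
  private pending = new Map<string, string>();
  private checked: CheckedFinding[] = [];

  onStart(e: ValidationStart): void {
    this.pending.set(e.finding_id, e.claim);
  }

  onEnd(e: ValidationEnd): void {
    const claim = this.pending.get(e.finding_id) ?? null;
    this.pending.delete(e.finding_id);
    this.checked.push({ ...e, claim });
  }

  // Finding ids still being validated.
  inFlight(): string[] {
    return Array.from(this.pending.keys());
  }

  // Findings that were not rejected, i.e. those eligible for the fact sheet.
  surviving(): CheckedFinding[] {
    return this.checked.filter((c) => c.verdict !== "rejected");
  }
}
```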

`synthesis_delta`

Token-level streaming of the final answer. Fires many times during step 8 (synthesis). Concatenate text chunks in order to reconstruct the streamed answer.

{ "text": "Your average HRV " }

Field	Type	Notes
`text`	`string`	A token chunk. Append-only — never a `replace` or `insert`. Whitespace is significant; don't trim.

synthesis_delta only fires once synthesis (step 8) begins, after validation. For the Investigator path the briefing itself is the answer, and synthesis_delta is replaced by a single agent_end.

`fact_check`

Fires at most once, after synthesis, when the post-synthesis regex scan flags a numeric token in the rendered reply that doesn't match the fact sheet (within 2% relative / 0.05 absolute tolerance).

{
  "issues": [
    { "value": "47.2", "expected_keys": ["ds-001.mean"], "severity": "warn" }
  ]
}

Fact-check issues are advisory: they're logged and surfaced in trace UIs, and the turn still completes. The orchestrator's deterministic fact-check also self-corrects synthesis once if it drifts from the fact sheet before the answer is finalized.
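To reproduce the tolerance rule in a trace UI, something like the following could work. This is a guess at the semantics, not the server's implementation: it assumes a value matches when it is within either bound (2% relative or 0.05 absolute), and the function name `withinTolerance` is hypothetical.

```typescript
// Returns true when `observed` is close enough to `expected`.
// Assumption: passing EITHER the absolute or the relative bound counts as a match.
function withinTolerance(
  observed: number,
  expected: number,
  relTol = 0.02,
  absTol = 0.05,
): boolean {
  const diff = Math.abs(observed - expected);
  return diff <= absTol || diff <= relTol * Math.abs(expected);
}
```

Under that reading, 101.5 matches a fact-sheet value of 100 (1.5% off), while 103 does not.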

`memory`

Fires once after memory extraction (step 9). Lists durable facts written to the user's memory store for use on future turns.

{ "entries": [{ "ts": "2026-05-25T10:01:40.473Z", "agent": "user", "kind": "preference", "text": "Plays cricket; season runs Mar–Sep." }] }

Each entry matches the MemoryEntry shape: kind (the category), ts, agent, text, plus an optional mem_-prefixed id and confidence.

`turn.completed`

Fires once, last, on success. After this, no more events arrive on the stream and the server closes the connection.

{
  "turn_id": "turn_01HX2K3M4N5P6Q7R8S9T0V1W2X",
  "result": {
    "answer": "Short answer: no — by the most defensible read of your own data...",
    "fact_sheet": [
      { "key": "ds-001.effect", "value": -0.374, "unit": null,
        "source": "data_science", "n": 87, "window": "last 90 days" }
    ],
    "agents_used": ["data_science", "domain_expert"],
    "cost_usd": 0.1288,
    "duration_ms": 222000
  }
}

The result payload is identical to Turn.result from GET /v1/turns/:id — exactly answer, fact_sheet, agents_used, cost_usd, duration_ms. You don't need a second fetch after the stream closes. (Validator verdicts are streamed live as validation_end events during the turn; they are not re-summarized in result.)

`turn.failed`

Fires once, last, on failure. Mutually exclusive with turn.completed.

{
  "turn_id": "turn_01HX2K3M4N5P6Q7R8S9T0V1W2X",
  "error": {
    "code": "workflow_step_failed",
    "message": "Anthropic API returned 529 after 3 retries.",
    "step": "run_turn"
  }
}

The error here is the Turn.error shape — code, message, and an optional step naming the pipeline stage that failed (e.g. run_turn). Unlike the HTTP error envelope, it does not carry request_id / docs_url. Treat turn.failed the same way you'd treat a non-2xx response. See Errors for the recovery matrix.
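Taken together, the catalog maps naturally onto a reducer that folds events into view state. The sketch below covers only four event types and passes the rest through. The `{ type, data }` event shape follows the SDK usage shown elsewhere on this page; `TurnView`, `reduceTurn`, and `initialView` are hypothetical names.

```typescript
type AmyEvent =
  | { type: "phase"; data: { agent: string; phase: string; detail?: string } }
  | { type: "synthesis_delta"; data: { text: string } }
  | { type: "turn.completed"; data: { turn_id: string; result: { answer: string } } }
  | { type: "turn.failed"; data: { turn_id: string; error: { code: string; message: string; step?: string } } }
  | {
      type:
        | "turn.started" | "routing" | "rephrase" | "agent_start" | "agent_end"
        | "cost" | "validation_start" | "validation_end" | "fact_check" | "memory";
      data: unknown;
    };

interface TurnView {
  phases: string[];
  answer: string;
  status: "running" | "completed" | "failed";
  error: string | null;
}

const initialView: TurnView = { phases: [], answer: "", status: "running", error: null };

function reduceTurn(state: TurnView, event: AmyEvent): TurnView {
  switch (event.type) {
    case "phase": {
      const { agent, phase, detail } = event.data;
      const label = `${agent}: ${phase}` + (detail ? ` (${detail})` : "");
      return { ...state, phases: [...state.phases, label] };
    }
    case "synthesis_delta":
      // Append-only; whitespace is preserved as-is.
      return { ...state, answer: state.answer + event.data.text };
    case "turn.completed":
      // The final result replaces whatever was streamed so far.
      return { ...state, answer: event.data.result.answer, status: "completed" };
    case "turn.failed":
      return { ...state, status: "failed", error: `${event.data.error.code}: ${event.data.error.message}` };
    default:
      return state; // events this view does not render
  }
}
```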

Reconnects and replay

SSE connections drop: phones lose signal, browser tabs sleep, intermediaries time out. Amy's stream is designed to survive this without losing events.

How it works

Every event is buffered to KV under stream:<turn_id>:<seq> before it's sent. The buffer is retained for 1 hour after completed_at.

On reconnect, set Last-Event-Id: <last_seq_you_saw>:

GET /v1/turns/turn_01HX.../events HTTP/1.1
Accept: text/event-stream
Authorization: Bearer …
Last-Event-Id: 43

The server replays from seq=44 forward. If the turn is still running, new events follow. If the turn is already complete, you'll get the remaining buffered events and then a graceful close.
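For clients that talk to the endpoint directly rather than through the SDK, the cursor bookkeeping might be sketched as follows. `ReplayCursor` is a hypothetical helper: it remembers the last id seen, drops anything at or below it as a defensive measure against replayed duplicates, and builds the headers for the next attempt. It treats ids as safe integers, which holds for the sentinel value mentioned in The wire format.

```typescript
class ReplayCursor {
  private last: number | null = null;

  // Returns true if the event is new, false if it is a replayed duplicate.
  accept(id: string): boolean {
    const n = Number(id);
    if (!Number.isSafeInteger(n)) throw new Error(`unexpected event id: ${id}`);
    if (this.last !== null && n <= this.last) return false;
    this.last = n;
    return true;
  }

  // Headers for the next connection attempt; Last-Event-Id is omitted on the first one.
  headers(token: string): Record<string, string> {
    const h: Record<string, string> = {
      Accept: "text/event-stream",
      Authorization: `Bearer ${token}`,
    };
    if (this.last !== null) h["Last-Event-Id"] = String(this.last);
    return h;
  }
}
```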

What the SDK does

The TypeScript SDK does not auto-reconnect in v1 — it throws AmyApiError with code: "stream_closed_unexpectedly" on disconnect and you decide what to do. The typical retry loop:

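import { AmyApiError } from "@amy/sdk";

// `amy`, `turn`, and `handle` come from the surrounding application code.
async function streamWithRetry(): Promise<void> {
  let lastId: string | number | null = null;
  let attempt = 0;

  while (attempt < 5) {
    try {
      for await (const event of amy.turns.stream(turn.id, { lastEventId: lastId })) {
        lastId = event.id ?? lastId; // remember the cursor for the next reconnect
        handle(event);
        if (event.type === "turn.completed" || event.type === "turn.failed") return;
      }
      return; // ended cleanly
    } catch (err) {
      if (err instanceof AmyApiError && err.code === "stream_closed_unexpectedly") {
        attempt++;
        // Exponential backoff (base capped at 30s) scaled by a 0.5x-1.5x jitter factor.
        const backoff = Math.min(30_000, 500 * 2 ** attempt) * (0.5 + Math.random());
        await new Promise((r) => setTimeout(r, backoff));
        continue;
      }
      throw err;
    }
  }
  // All reconnect attempts used up: surface the failure instead of exiting silently.
  throw new Error("stream did not recover after 5 reconnect attempts");
}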

Each reconnect re-resolves the bearer via apiKeyProvider, so a Clerk session that rotated during the gap picks up a fresh token. Auto-reconnect with built-in backoff is on the SDK roadmap.

When replay is no longer available

After 1 hour past completed_at, the KV buffer is garbage-collected. Reconnecting then still returns 200, but with nothing left to replay — the stream opens and closes immediately. The Turn row itself is permanent, so GET /v1/turns/:id still works forever.

If you need long-term event history, persist events client-side as they arrive. The CLI does this for the /trace command.

Backpressure

SSE has no flow control. If your client can't keep up with the event rate, events queue in the HTTP buffer and your reader falls behind; eventually the connection's TCP send buffer fills and the server blocks on write().

This rarely matters in practice: Amy emits ~50-300 events per turn, spaced over minutes. But two cases can bite:

You're rendering each synthesis_delta into a DOM mutation synchronously. Browsers can choke at 100+ DOM ops/sec. Buffer tokens for a 16ms frame and flush in requestAnimationFrame.
You're piping events into a slow downstream sink (a database write per event). Batch the writes: the SDK iterator has a built-in queue, but a downstream blocking call will still back-pressure through it.

For server-side consumers, run the event-loop on a separate async task from the sink-write loop. The SDK iterator is non-blocking between events.
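The frame-batching advice for synthesis_delta can be sketched as a small buffer with an injectable scheduler. `createDeltaBuffer` and `Schedule` are hypothetical names; in a browser the scheduler would typically wrap requestAnimationFrame, while a server-side consumer could pass a timer instead.

```typescript
type Schedule = (flush: () => void) => void;

// Coalesce many small text chunks into at most one render call per scheduled tick.
function createDeltaBuffer(
  render: (text: string) => void,
  schedule: Schedule,
): (text: string) => void {
  let pending = "";
  let scheduled = false;
  return function push(text: string): void {
    pending += text;
    if (scheduled) return; // a flush is already queued for this tick
    scheduled = true;
    schedule(() => {
      scheduled = false;
      const chunk = pending;
      pending = "";
      render(chunk);
    });
  };
}

// Browser usage (illustrative):
//   const push = createDeltaBuffer(
//     (t) => { answerEl.textContent += t; },
//     (flush) => requestAnimationFrame(() => flush()),
//   );
```

The same shape works for batching database writes: collect events in the buffer and let the scheduled flush perform a single bulk write.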

Sample raw transcript

A real turn, abridged. Newlines preserved exactly as they appear on the wire.

HTTP/1.1 200 OK
Content-Type: text/event-stream
Cache-Control: no-cache
X-Request-Id: req_01HX2K3M4N5P6Q7R8S9T0V1W2X

: connected

event: turn.started
id: 1
data: {"type":"turn.started","seq":1,"at":"2026-05-25T10:00:00.123Z","turn_id":"turn_01HX2K3M4N5P6Q7R8S9T0V1W2X"}

event: phase
id: 2
data: {"type":"phase","agent":"orchestrator","phase":"classifying query vagueness"}

event: phase
id: 3
data: {"type":"phase","agent":"orchestrator","phase":"vagueness = low","detail":"specific — direct routing"}

event: routing
id: 5
data: {"type":"routing","decision":{"main_agent":"Data Science Agent","supporting_agents":"Domain Expert Agent","collaboration_workflow":"DS computes the user's average HRV…","investigator":false,"vagueness":"low"}}

event: rephrase
id: 7
data: {"type":"rephrase","main":{"agent":"Data Science Agent","question":"What is my average HRV…"},"supporting":{"Data Science Agent":"Compute the user's average HRV…","Domain Expert Agent":"Given a user's computed average HRV…"}}

event: agent_start
id: 8
data: {"type":"agent_start","agent":"Data Science Agent","question":"Compute the user's average HRV…"}

event: agent_end
id: 19
data: {"type":"agent_end","agent":"Data Science Agent","cost_usd":0.089,"duration_ms":98400,"preview":"Average HRV 47.2 ms over last 30 days…"}

event: cost
id: 20
data: {"type":"cost","agent":"Data Science Agent","usd":0.089,"reason":"react_step"}

event: synthesis_delta
id: 56
data: {"type":"synthesis_delta","text":"Your 30-day HRV picture "}

event: turn.completed
id: 87
data: {"type":"turn.completed","turn_id":"turn_01HX2K3M4N5P6Q7R8S9T0V1W2X","result":{"answer":"Your 30-day HRV picture has a clear story…","fact_sheet":[],"agents_used":["orchestrator","data_science","domain_expert"],"cost_usd":0.364,"duration_ms":107185}}

Note: this transcript is from a turn whose computed numbers didn't pass the validator into the fact sheet, so fact_sheet is empty. Turns that route through the validator emit validation_start / validation_end between agent_end and synthesis_delta, and the resulting result includes a populated fact_sheet. The id jumps (1, 2, 3, 5, 7, …) are expected — see The wire format.

Consuming the stream

curl

curl -N \
  -H "Authorization: Bearer $AMY_API_KEY" \
  -H "Accept: text/event-stream" \
  https://amy.heyamy.xyz/v1/turns/turn_01HX.../events

-N disables curl's output buffering. Without it, curl may hold events in its buffer and print them in bursts, or only when the connection closes.

To resume from a known event ID:

curl -N \
  -H "Authorization: Bearer $AMY_API_KEY" \
  -H "Accept: text/event-stream" \
  -H "Last-Event-Id: 43" \
  https://amy.heyamy.xyz/v1/turns/turn_01HX.../events

Browser (EventSource)

Native EventSource is the simplest path. It handles Last-Event-Id automatically.

// EventSource cannot set custom headers, and Amy does NOT accept a
// query-string API key — auth is Bearer-only. So point EventSource at
// your own backend proxy, which adds the Authorization header and
// forwards the upstream SSE byte-for-byte.
const es = new EventSource(`/api/amy/turns/${turnId}/events`);

es.addEventListener("turn.started", (e) => {
  console.log("Started:", JSON.parse(e.data));
});

es.addEventListener("phase", (e) => {
  const { agent, phase, detail } = JSON.parse(e.data);
  setStatus(`${agent} → ${phase}${detail ? ` (${detail})` : ""}`);
});

es.addEventListener("synthesis_delta", (e) => {
  const { text } = JSON.parse(e.data);
  document.getElementById("answer").textContent += text;
});

es.addEventListener("turn.completed", (e) => {
  const { result } = JSON.parse(e.data);
  render(result.answer);
  es.close();
});

es.addEventListener("turn.failed", (e) => {
  const { error } = JSON.parse(e.data);
  console.error(error.code, error.message);
  es.close();
});

Important: browser EventSource cannot set the Authorization header. Two options:

Proxy through your backend; have it add the bearer token.
Cookie auth: out of scope for v1; we don't issue cookies.

For production browser use, proxy. The SDK does this transparently in React/Vue if you set baseUrl to your own proxy.

React Native

No EventSource polyfill needed — the SDK's iterator uses fetch + ReadableStream, which Hermes supports in Expo SDK 53+. The same code as browser/Node:

import { useAuth } from "@clerk/expo";
import { Amy } from "@amy/sdk";

const { getToken } = useAuth();
const amy = new Amy({ apiKeyProvider: () => getToken() });

for await (const event of amy.turns.stream(turnId)) {
  if (event.type === "synthesis_delta") {
    setAnswer((prev) => prev + event.data.text);
  }
  if (event.type === "turn.completed") {
    setAnswer(event.data.result.answer);
  }
}

Clerk's getToken() is called once when the stream opens; for very long turns that cross a Clerk session expiry, see the reconnect loop above — each reconnect re-resolves the token.

Swift (URLSession)

There's no built-in EventSource in Foundation, but the bytes-stream API is enough:

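// `turnId` and `apiKey` come from the caller. `lastEventId` is a `String?` variable
// holding the last `id:` seen (nil on the first connection).
let url = URL(string: "https://amy.heyamy.xyz/v1/turns/\(turnId)/events")!
var req = URLRequest(url: url)
req.setValue("text/event-stream", forHTTPHeaderField: "Accept")
req.setValue("Bearer \(apiKey)",  forHTTPHeaderField: "Authorization")
if let resumeFrom = lastEventId {
  // Only send the header when there is something to resume from.
  req.setValue(resumeFrom, forHTTPHeaderField: "Last-Event-Id")
}

let (bytes, _) = try await URLSession.shared.bytes(for: req)

// Split on "\n" by hand: `bytes.lines` skips empty lines, and the blank line
// is what terminates an SSE event.
var currentEvent: (id: String?, event: String?, data: String?) = (nil, nil, nil)
var lineBytes: [UInt8] = []
for try await byte in bytes {
  guard byte == 0x0A else { lineBytes.append(byte); continue }
  var line = String(decoding: lineBytes, as: UTF8.self)
  lineBytes.removeAll(keepingCapacity: true)
  if line.hasSuffix("\r") { line.removeLast() }

  if line.isEmpty {
    // Blank line: dispatch the accumulated event and remember its id.
    if let type = currentEvent.event, let data = currentEvent.data {
      handle(type: type, data: Data(data.utf8))
      if let id = currentEvent.id { lastEventId = id }
    }
    currentEvent = (nil, nil, nil)
    continue
  }
  if line.hasPrefix(":") { continue } // comment: connected, heartbeat, stream-end
  if line.hasPrefix("id: ")    { currentEvent.id    = String(line.dropFirst(4)) }
  if line.hasPrefix("event: ") { currentEvent.event = String(line.dropFirst(7)) }
  if line.hasPrefix("data: ")  { currentEvent.data  = String(line.dropFirst(6)) }
}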

The forthcoming Swift SDK wraps this with typed handlers; until then, the snippet above is what it does under the hood.

Common mistakes

Connecting without `Accept: text/event-stream`

You'll get the JSON snapshot of the Turn instead of the stream. The endpoint content-negotiates: SSE if Accept matches, JSON otherwise.

Treating `synthesis_delta` `text` as a full message

It's a chunk. Concatenate text for each synthesis_delta to reconstruct the streamed answer. Whitespace is significant; don't trim.

Reconnecting without `Last-Event-Id`

You'll receive duplicate events from the start of the buffer. Always track the last id: you observed and pass it on reconnect.

Reconnecting more than 1 hour after `completed_at`

The replay buffer is GC'd, so there's nothing left to stream (you get a 200 that opens and closes with no events). Read the terminal state with GET /v1/turns/:id instead — it's permanent.

Splitting JSON across multiple `data:` lines

We don't, and your parser shouldn't expect it. Each data: line is a complete JSON document. If you see a parse error, it's a bug; report it with the request ID.

Letting `EventSource` reconnect on its own

The browser EventSource reconnects automatically, but it retries on a fixed delay (roughly 3 seconds in common browsers) with no backoff, no jitter, and no cap on attempts, and it can't send the Authorization header. For a long-running turn over a flaky network, prefer the SDK's amy.turns.stream and drive reconnects yourself with the Last-Event-Id backoff loop shown in Reconnects and replay. The SDK surfaces stream_closed_unexpectedly rather than retrying for you.

Turns, what triggers the events you're streaming.
Errors, what each turn.failed code means.
API reference: Streaming, the endpoint signature.
SDK: TypeScript, streaming, typed iterator, reconnection, cancellation.
Internals: Runtime, how the KV-backed event buffer works underneath.
