Turns

A turn is one round-trip of the agent: the user asks, Amy answers, and every quantitative claim in the answer has been checked. It's the unit of work the backend dispatches, persists, streams, and bills.

A turn is what you POST to start, what you GET to read, and what you subscribe to with SSE. If you understand turns, you understand 80% of the API.

What a turn is
The lifecycle
The pipeline (high level)
The Turn object
Patterns
Cost expectations
What "completed" guarantees
Common mistakes

What a turn is

   user message ──► [ turn_01HX... ] ──► answer  +  fact_sheet  +  trace

A turn carries one user message into the multi-agent pipeline and emerges with:

a plain-language answer (result.answer),
a fact sheet: every number the answer cites, with provenance (result.fact_sheet),
a trace of which agents ran, which validation gates fired, and what each cost,
new memory entries extracted from the exchange.

A turn is a Cloudflare Workflow under the hood. That means:

Durable. A worker restart, an Anthropic 5xx, or a code deploy mid-turn doesn't lose the turn; it resumes after the last successful step.
Observable. Every step's input/output is visible in the CF dashboard.
Replayable. Same inputs → same step decomposition → easy debugging.

You always reference a turn by its ID:

turn_01HX2K3M4N5P6Q7R8S9T0V1W2X

That ID is stable forever. The Turn row is kept indefinitely; the SSE event replay buffer is kept for 1 hour after completion (see Streaming).

The lifecycle

A turn moves through four statuses. Two (queued, running) are transient; two (completed, failed) are terminal.

Status	Meaning	Has `result`?	Has `error`?
`queued`	The turn row exists; the workflow hasn't started its first step yet. Usually `<1s`.	no	no
`running`	One of the 9 pipeline steps is executing. May last 2-7 minutes.	no	no
`completed`	Synthesis finished and numeric verification passed. `result` is populated.	yes	no
`failed`	An unrecoverable error occurred (validation runner crashed, upstream timed out past the retry budget, the model returned a malformed plan after the cap).	no	yes

You will almost always see running for the bulk of a turn's life. Subscribe to /events to watch it progress; poll GET /v1/turns/:id if you only need the terminal state.

A turn cannot transition backwards. There is no paused state in v1; future versions may add human-in-the-loop pauses.
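The forward-only lifecycle can be sketched as a small transition table. This is a minimal illustration, not part of the SDK: the status names come from the table above, while `NEXT`, `isTerminal`, and `canTransition` are hypothetical helper names, and the `queued` to `failed` edge is an assumption (a turn that fails before its first step).

```typescript
// Status names are taken from the lifecycle table; everything else is illustrative.
type TurnStatus = "queued" | "running" | "completed" | "failed";

// Assumed forward-only transitions. The queued -> failed edge is an assumption.
const NEXT: Record<TurnStatus, readonly TurnStatus[]> = {
  queued: ["running", "failed"],
  running: ["completed", "failed"],
  completed: [],
  failed: [],
};

// A status is terminal when nothing can follow it.
function isTerminal(status: TurnStatus): boolean {
  return NEXT[status].length === 0;
}

// True only for moves that go forward through the lifecycle.
function canTransition(from: TurnStatus, to: TurnStatus): boolean {
  return NEXT[from].indexOf(to) !== -1;
}
```

A client can use `isTerminal` as its loop condition instead of comparing against individual status strings.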

The pipeline (high level)

Internally, a turn decomposes into up to 9 steps. Each step is a discrete workflow operation whose output persists: if step N fails, the turn resumes at step N, and steps 1 through N−1 do not replay.

 1  classify_vagueness     ~3s     pick the right path
 2  route                  ~2s     main + supporting agents
 3  rephrase_per_agent     ~2s     focused sub-questions
 4  run_supporting_agents  ~30–90s each (parallel)
 5  run_main_agent         ~30–120s
 6  reflection             ~5s     do we need follow-ups?
 7  validation_gates       ~10s    7 gates + Critic + Assessment
 8  synthesis              ~20s    streams to /events
 9  memory_extraction      ~3s     durable facts written
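To make the resume behaviour concrete, here is a toy in-memory model of persisted step outputs. It is not the Cloudflare Workflows API and not Amy's implementation; `NamedStep`, `StepStore`, and `runPipeline` are hypothetical names used only to show why steps 1 through N−1 do not replay.

```typescript
// Toy model of durable steps. Only the resume semantics are modelled here.
type StepFn = (previous: unknown) => unknown;

interface NamedStep {
  name: string;
  run: StepFn;
}

// Persisted outputs keyed by step name; a real workflow engine stores these durably.
type StepStore = Record<string, { output: unknown }>;

function runPipeline(steps: NamedStep[], store: StepStore): unknown {
  let previous: unknown = undefined;
  for (const step of steps) {
    const saved = store[step.name];
    if (saved !== undefined) {
      // Completed in an earlier attempt: reuse the stored output, do not re-run.
      previous = saved.output;
      continue;
    }
    // If this throws, nothing is stored for this step, so the next attempt starts here.
    previous = step.run(previous);
    store[step.name] = { output: previous };
  }
  return previous;
}
```

Calling `runPipeline` a second time with the same store after a failure skips every step that already has a stored output and re-enters at the step that threw.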

Two pre-step branches matter:

Vagueness = high ("is anything interesting in my data?") skips the standard route and runs the Hypothesis Investigator first. The Investigator scans the user's data, ranks candidate hypotheses, and feeds the top-K through the validator before answering.
Routing returns no agent, which triggers a plain conversational fallback: one cheap LLM call, no pipeline, no validator. (For "thanks!" and similar.)
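The two branches can be read as a small decision function. This is a sketch under stated assumptions: only the "high" vagueness level is named in this page, so the other levels, the `Path` labels, and the order of the checks (vagueness first, because classification is step 1) are all illustrative.

```typescript
// Only "high" appears in the docs; "low" and "medium" are assumed placeholder levels.
type Vagueness = "low" | "medium" | "high";

// Illustrative labels for the three paths described above.
type Path = "investigator" | "conversational_fallback" | "standard";

function choosePath(vagueness: Vagueness, routedAgents: string[]): Path {
  // High vagueness skips the standard route and runs the Hypothesis Investigator first.
  if (vagueness === "high") return "investigator";
  // Routing found no agent to run: one cheap LLM call, no pipeline, no validator.
  if (routedAgents.length === 0) return "conversational_fallback";
  return "standard";
}
```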

For the full step-by-step contract (what each step reads, writes, and guarantees), see Internals: Agent orchestration.

Why this matters for clients

You don't need to call any of these directly. But understanding the shape explains things you'll observe:

Why does synthesis_delta start streaming ~60 seconds in? Steps 1-3 run first (routing + rephrase), then a sub-agent has to finish (agent_end), then validation runs (step 7) before synthesis (step 8) emits the first synthesis_delta.
Why is the answer short for a vague question, then suddenly long? High vagueness routes through the Investigator first; its briefing is the answer for that turn (steps 5-8 don't run).
Why does validator output appear before the final answer? Step 7 always runs before step 8. Synthesis is constrained to the fact sheet, which doesn't exist until validation completes.

The Turn object

The full schema, lifted from the API reference:

{
  "id": "turn_01HX2K3M4N5P6Q7R8S9T0V1W2X",
  "status": "completed",
  "created_at": "2026-05-25T10:00:00Z",
  "completed_at": "2026-05-25T10:03:42Z",
  "messages": [
    { "role": "user", "content": "Is my sleep score drop meaningful?" }
  ],
  "result": {
    "answer": "Short answer: no — by the most defensible read of your own data, this isn't a clinically meaningful decline...",
    "fact_sheet": [
      { "key": "ds-001.effect", "value": -0.374, "unit": null,
        "source": "data_science", "n": 87, "window": "last 90 days" },
      { "key": "ds-001.ci_low", "value": -0.42, "unit": null,
        "source": "data_science", "n": 87, "window": "last 90 days" }
    ],
    "agents_used": ["orchestrator", "data_science", "domain_expert", "validator"],
    "cost_usd": 0.1288,
    "duration_ms": 222000
  },
  "error": null
}

Field	Type	Notes
`id`	`string`	Typed prefix `turn_…` followed by a random id. Treat it as opaque; it is not time-sortable, so order by `created_at`.
`status`	enum	`queued` · `running` · `completed` · `failed`.
`created_at`	ISO-8601	When the POST landed.
`completed_at`	ISO-8601 \| null	Set when status flips to `completed` or `failed`.
`messages`	`Message[]`	The exact conversation you sent. Last message is always from `user`.
`result.answer`	`string`	Markdown. May be multi-paragraph. Safe to render directly.
`result.fact_sheet`	`FactSheetEntry[]`	Every number the answer cites, with provenance. Each entry's `key` is the stable identifier (`<finding_id>.<key>`). See What "completed" guarantees.
`result.agents_used`	`string[]`	The agents that ran, e.g. `orchestrator`, `data_science`, `domain_expert`, `health_coach`, `investigator`, `validator`.
`result.cost_usd`	`number`	Total LLM spend across all steps.
`result.duration_ms`	`number`	Wall time from the workflow's first step to the final memory write. Excludes queue time.
`error`	object \| null	Populated when `status === "failed"`: `{ code, message, step? }`. See Errors.

result is null while status is not completed. error is non-null only when status === "failed".

The `Message` shape

{ "role": "user" | "assistant", "content": "string" }

content is plain text in v1. Future versions may add tool calls, attachments, and structured content; new fields will be additive.
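For TypeScript clients, the null rules above map naturally onto a discriminated union. The types below are a sketch inferred from the example object and the field table, not the SDK's published types: the field types of `FactSheetEntry`, the `string` type of `error.step`, and the helper `describeTurn` are assumptions.

```typescript
interface Message {
  role: "user" | "assistant";
  content: string;
}

// Field types inferred from the example payload; treat them as assumptions.
interface FactSheetEntry {
  key: string;
  value: number;
  unit: string | null;
  source: string;
  n: number;
  window: string;
}

interface TurnResult {
  answer: string;
  fact_sheet: FactSheetEntry[];
  agents_used: string[];
  cost_usd: number;
  duration_ms: number;
}

interface TurnError {
  code: string;
  message: string;
  step?: string;
}

interface TurnBase {
  id: string;
  created_at: string;
  messages: Message[];
}

// result is non-null only when completed; error is non-null only when failed.
type Turn =
  | (TurnBase & { status: "queued" | "running"; completed_at: null; result: null; error: null })
  | (TurnBase & { status: "completed"; completed_at: string; result: TurnResult; error: null })
  | (TurnBase & { status: "failed"; completed_at: string; result: null; error: TurnError });

// Narrowing on status lets the compiler prove result or error is present.
function describeTurn(turn: Turn): string {
  switch (turn.status) {
    case "completed":
      return turn.result.answer;
    case "failed":
      return "failed: " + turn.error.code;
    default:
      return "still " + turn.status;
  }
}
```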

Patterns

One-shot question

The simplest possible turn. No history, no extras.

curl -X POST https://amy.heyamy.xyz/v1/turns \
  -H "Authorization: Bearer $AMY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{ "role": "user", "content": "What is my average HRV this month?" }]
  }'

The POST returns 202 immediately with { id, status: "queued", stream_url } — the turn runs asynchronously. For a non-interactive caller, poll GET /v1/turns/:id until status is completed, then read result:

curl https://amy.heyamy.xyz/v1/turns/$TURN_ID \
  -H "Authorization: Bearer $AMY_API_KEY"

Use polling for batch jobs and scripts; subscribe to /events whenever a human is waiting. (A synchronous blocking mode is planned — see Streaming vs polling.)

Conversational

Pass the entire conversation in messages on each turn. Amy doesn't maintain server-side conversation state; you do.

const history = [
  { role: "user",      content: "How was my sleep last week?" },
  { role: "assistant", content: "Average 7.2h, REM was lower than your..." },
  { role: "user",      content: "Why might REM have dropped?" },
];

const turn = await amy.turns.create({ messages: history });

The last message must be from user. Sending an assistant message last returns 400 invalid_request.
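Because the client owns the history, a small helper can build the next messages array and check the last-message rule before the request is sent. This is a minimal sketch: `ChatMessage`, `nextTurnMessages`, and `endsWithUser` are hypothetical names, not SDK functions.

```typescript
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// Returns a new array: prior history, Amy's last answer, then the follow-up question.
function nextTurnMessages(
  history: ChatMessage[],
  lastAnswer: string,
  followUp: string,
): ChatMessage[] {
  const added: ChatMessage[] = [
    { role: "assistant", content: lastAnswer },
    { role: "user", content: followUp },
  ];
  // concat leaves the caller's history array untouched.
  return history.concat(added);
}

// Local guard for the rule the API enforces with 400 invalid_request.
function endsWithUser(messages: ChatMessage[]): boolean {
  return messages.length > 0 && messages[messages.length - 1].role === "user";
}
```

Checking `endsWithUser` before calling `amy.turns.create` turns a round-trip 400 into a local error.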

Memory (next section) is separate from conversation: memory persists across turns automatically; messages do not.

With / without memory injection

By default every turn injects the user's persistent memory (goals, preferences, prior insights) into the agent context. To opt out:

const turn = await amy.turns.create({
  messages: [...],
  context: { include_memory: false }
});

When to skip memory injection:

Reason	Example
You're running an evaluation and need isolated turns.	Regression suite over a frozen agent.
The question is intentionally context-free.	"Explain HRV in 3 sentences."
You're testing a memory-extraction change.	Want to see what Amy would remember without re-using prior memory.

A sibling flag, include_biomarkers: false, drops the latest lab snapshot from context. Both flags default to true.

Streaming vs polling

Every turn runs asynchronously: create() returns a queued turn, and you choose how to wait for it.

Stream whenever a user is watching:

const turn = await amy.turns.create({ messages: [...] });  // → 202, status "queued"

for await (const event of amy.turns.stream(turn.id)) {
  if (event.type === "synthesis_delta") render(event.data.text);
  if (event.type === "turn.completed")  finish(event.data.result);
}

Poll when nothing's watching:

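const { id } = await amy.turns.create({ messages: [...] });
let turn = await amy.turns.retrieve(id);
// Keep polling while the status is transient; "completed" and "failed" are both terminal.
while (turn.status === "queued" || turn.status === "running") {
  await new Promise((r) => setTimeout(r, 2000));
  turn = await amy.turns.retrieve(id);
}
if (turn.status === "failed") throw new Error(turn.error.message);
// turn.status === "completed"; turn.result is populated.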
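For unattended jobs it can help to bound the wait, since a turn may run for several minutes. The helper below is a sketch, not an SDK method: `waitForTerminal` and its option names are hypothetical, and the `retrieve` function is injected so the same code works with `amy.turns.retrieve` or a test double.

```typescript
type PollStatus = "queued" | "running" | "completed" | "failed";

interface PolledTurn {
  id: string;
  status: PollStatus;
}

// Polls until the turn is terminal or the deadline would be exceeded.
async function waitForTerminal<T extends PolledTurn>(
  retrieve: (id: string) => Promise<T>,
  id: string,
  opts: { intervalMs: number; timeoutMs: number },
): Promise<T> {
  const deadline = Date.now() + opts.timeoutMs;
  for (;;) {
    const turn = await retrieve(id);
    // Both terminal statuses end the loop; the caller decides how to treat "failed".
    if (turn.status === "completed" || turn.status === "failed") return turn;
    if (Date.now() + opts.intervalMs > deadline) {
      throw new Error("turn " + id + " still " + turn.status + " after " + opts.timeoutMs + "ms");
    }
    await new Promise<void>((resolve) => setTimeout(resolve, opts.intervalMs));
  }
}
```

A timeout here only stops the client from waiting; the turn itself keeps running on the backend.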

A synchronous blocking mode (POST with "stream": false that holds the connection until the turn completes) is reserved in the request schema but not yet implemented — today the field is ignored and every POST returns 202.

See Streaming for the full SSE protocol.

Cost expectations

Cost and duration vary with the route a turn takes. The figures below were measured with Claude Sonnet handling routing and reflection and Opus handling the heavy agents; expect cost_usd to land in these bands for a typical user with 60-180 days of data.

Turn shape	Typical `duration_ms`	Typical `cost_usd`
Conversational fallback ("hi", "thanks!")	10,000-15,000	$0.005-$0.015
Direct data lookup ("avg RHR?")	60,000-90,000	$0.20-$0.40
Data + domain ("is LDL 124 a concern?")	90,000-150,000	$0.30-$0.55
Coaching ("plan my deep sleep recovery")	120,000-180,000	$0.40-$0.70
Investigator (vague: "anything to worry about?")	60,000-120,000	$0.05-$0.15
Reflection follow-ups added	+30,000-90,000	+$0.05-$0.20

Ranges measured against the live amy.heyamy.xyz deployment with a real Whoop + bloodwork user; your numbers will drift if you change AMY_MODEL / AMY_FAST_MODEL or run on a fixture user.

A cost_warning event fires at AMY_COST_WARN_USD (default $0.50), but the turn does not abort; the warning is advisory. Set per-user caps server-side if you need a hard ceiling.

Concurrency cap is 20 in-flight turns per user. The queue absorbs bursts at max_concurrency: 1, so this is a gentle guard against a stuck client flooding the queue rather than a hard rate limit. Exceeding it returns 429 concurrency_limit_exceeded (see Errors).
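Since the warning is advisory, a hard ceiling has to be enforced before a turn is dispatched. One way to sketch that on your own server is a running total per user, checked before each POST. `SpendTracker` is a hypothetical class, not part of Amy; it assumes you feed it `result.cost_usd` from completed turns, so spend on failed turns (whose result is null) is not visible to it.

```typescript
// In-memory per-user spend tracker. A real deployment would persist the totals.
class SpendTracker {
  private totals: Record<string, number> = Object.create(null);
  private readonly capUsd: number;

  constructor(capUsd: number) {
    this.capUsd = capUsd;
  }

  // Call once per completed turn with result.cost_usd; returns the new running total.
  record(userId: string, costUsd: number): number {
    const total = (this.totals[userId] || 0) + costUsd;
    this.totals[userId] = total;
    return total;
  }

  // Check before creating a turn: once queued, a turn cannot be cancelled.
  canStart(userId: string): boolean {
    return (this.totals[userId] || 0) < this.capUsd;
  }
}
```

Because the check happens before dispatch, a single turn can still push a user past the cap; the cap limits how many further turns start, not the cost of the last one.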

What "completed" guarantees

When status === "completed", the following invariants hold:

result.answer is non-empty. Synthesis ran and emitted text.
agents_used is the actual set of agents the orchestrator ran. Not the candidate set, the executed set.
cost_usd is the sum of every LLM call charged to this turn, including failed retries. Use this for per-user billing or per-feature accounting.
duration_ms is wall time from the workflow's first step beginning to the final memory write completing. It excludes queue time (queue → first step is typically <1s but can spike).

When `fact_sheet` is populated

result.fact_sheet carries the numbers that survived validation — but the validator only gates findings on routes that go through it. That includes the Investigator path (high-vagueness queries) and any finding the supporting agents explicitly emit. Conversational fallbacks bypass validation entirely, and a direct-data lookup whose computed numbers don't pass the gates lands with an empty sheet, so for those turns:

result.fact_sheet is []

(Live validator verdicts stream as validation_end events during the turn — they are not summarized into result.)

When the validator did run and findings passed, these stronger invariants hold:

Every quantitative claim in result.answer traces to result.fact_sheet (within 2% relative / 0.05 absolute tolerance). A post-synthesis regex scan emits a fact_check event for any mismatch.
Every fact sheet entry survived validation. Critic-rejected findings do not enter the fact sheet. Conditional findings carry verdict: "conditional" and the answer is required to hedge.

The flip side: if status === "failed", none of these hold. result is null; error.code tells you why. See Errors.
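The tolerance rule for matching a number in the answer against the fact sheet can be sketched as follows. This illustrates the stated thresholds only, not the backend's actual scanner. It assumes a claim matches when either tolerance is satisfied (the text does not say whether both are required), and `withinTolerance` and `findSupport` are hypothetical names.

```typescript
interface SheetNumber {
  key: string;
  value: number;
}

// Assumption: a claim matches if it is within 0.05 absolute OR within 2% relative.
function withinTolerance(claimed: number, reference: number): boolean {
  const absDiff = Math.abs(claimed - reference);
  if (absDiff <= 0.05) return true;
  // Relative tolerance is undefined for a zero reference, so only the absolute rule applies.
  return reference !== 0 && absDiff / Math.abs(reference) <= 0.02;
}

// Returns the first fact sheet entry that supports the claimed number, if any.
function findSupport(claimed: number, sheet: SheetNumber[]): SheetNumber | undefined {
  for (const entry of sheet) {
    if (withinTolerance(claimed, entry.value)) return entry;
  }
  return undefined;
}
```

With the example fact sheet above, an answer that rounds -0.374 to -0.37 would still be supported by `ds-001.effect`.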

Common mistakes

Polling `GET /v1/turns/:id` instead of streaming

You'll pay for the latency twice: once in your poll loop, once in the KV-backed event buffer that exists anyway. Subscribe to /events once; the iterator handles reconnects for you. Polling is only for non-interactive use (cron jobs, retries after disconnect).

Treating `queued` or `running` as terminal

Both are transient. If you read a turn and status !== "completed" && status !== "failed", the work isn't done. result is null by design, not by bug.

Sending an assistant message last

messages[messages.length - 1].role must be "user". Sending an assistant message last looks like you're trying to make Amy continue its own thought, which isn't supported in v1. Returns 400 invalid_request.

Expecting deterministic results between turns

Two identical messages arrays will not produce byte-identical answers. The LLM is stochastic; the validator is deterministic but operates on LLM-emitted findings that vary by run. Use the fact sheet for deterministic claim-level comparisons, not the answer text.
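A claim-level comparison between two runs can be sketched by joining the fact sheets on `key`. This is illustrative only: `diffFactSheets` is a hypothetical helper, and it assumes the `<finding_id>.<key>` identifiers line up between the two turns, which may not hold if the agents emit different findings on each run.

```typescript
interface Fact {
  key: string;
  value: number;
}

interface FactDiff {
  key: string;
  a: number | null; // value in the first turn, null if the key is absent
  b: number | null; // value in the second turn, null if the key is absent
}

// Returns one row per key whose value differs or that appears in only one sheet.
function diffFactSheets(a: Fact[], b: Fact[]): FactDiff[] {
  const byKey: Record<string, FactDiff> = Object.create(null);
  for (const fact of a) {
    byKey[fact.key] = { key: fact.key, a: fact.value, b: null };
  }
  for (const fact of b) {
    const existing = byKey[fact.key];
    if (existing !== undefined) existing.b = fact.value;
    else byKey[fact.key] = { key: fact.key, a: null, b: fact.value };
  }
  return Object.keys(byKey)
    .sort()
    .map((k) => byKey[k])
    .filter((d) => d.a !== d.b);
}
```

An empty result means the two turns cite the same numbers under the same keys, even if the answer text differs.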

Forgetting that conversation state is client-side

Amy does not store conversations. The Turn row stores the messages you sent for that turn, nothing more. If you want a multi-turn chat, your client maintains the array.

Reading `result.fact_sheet.length` to gauge confidence

A short fact sheet can mean a focused, high-confidence answer (one metric, one validated number) or a low-confidence answer where nothing survived validation. Judge confidence from the answer's own hedging and the live validation_end verdicts, not from fact_sheet.length.

Assuming the SSE replay window is forever

It's 1 hour after completed_at in v1. After that, the events are GC'd from KV. The Turn row stays forever; only the live event stream expires. Persist events client-side if you need them long-term.
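Before reconnecting to an old turn's stream, a client can check whether the replay window is still open. This is a small sketch: `replayAvailable` is a hypothetical helper, the one-hour constant mirrors the v1 window stated above, and treating an in-flight turn (null `completed_at`) as replayable is an assumption.

```typescript
// v1 replay window: 1 hour after completed_at.
const REPLAY_WINDOW_MS = 60 * 60 * 1000;

function replayAvailable(completedAt: string | null, nowMs: number): boolean {
  // Assumption: while the turn is still in flight, its events are available.
  if (completedAt === null) return true;
  const completedMs = Date.parse(completedAt);
  if (isNaN(completedMs)) throw new Error("invalid completed_at: " + completedAt);
  return nowMs - completedMs <= REPLAY_WINDOW_MS;
}
```

When this returns false, read the final state from GET /v1/turns/:id instead of subscribing to /events.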

Cancelling a turn

There's no DELETE /v1/turns/:id for in-flight turns in v1. Once a turn is queued, it runs to completion or failure. You can close your SSE subscription, but the backend keeps working (and keeps charging). Use the per-user concurrency cap (20) as your back-pressure mechanism.

Where to next

Streaming, the SSE event catalog and reconnect protocol.
Memory, what Amy remembers across turns.
Errors, every error code and how to recover.
API reference: Turns, the full endpoint reference.
Internals: Agent orchestration, the per-step contract for the 9-step pipeline.
SDK: TypeScript, typed amy.turns methods.
