Streaming
The Chat API streaming endpoint (POST /chat/stream) uses Server-Sent Events (SSE) to deliver the agent's response incrementally.
SSE Event Format
Each event has the following structure:
event: new_message
id: {message_id}:{event_index}
data: {"sender": "bot", "content": "...", "content_parts": [...], ...}
retry: 15000
| Field | Description |
|---|---|
| event | new_message for message updates; stream_status and error are control events (see below) |
| id | Composite ID: {message_id}:{event_index}, where event_index is a sequential integer starting at 0 |
| data | A complete ChatMessage JSON object |
| retry | Reconnection interval hint in milliseconds |
Incremental Content Delivery
Each SSE event carries the complete message so far, not a delta. As the agent generates tokens:
- Early events have partial content and no evidences.
- Intermediate events may contain content_parts with type: "tool" showing tool execution state.
- The final event contains the fully formed response: complete content, evidences, and content_parts.
This design means reconnecting clients can resume from the latest event without needing to reassemble deltas.
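Because every event is a full snapshot, a client can render by replacing its current state instead of appending to it. The following is a minimal sketch of that idea. The latest_snapshot helper and the hand-written sample payloads are assumptions made for illustration and are not part of the API.

```python
import json


def latest_snapshot(raw_events):
    """Return the most recent message snapshot from an iterable of SSE data payloads.

    Each payload is assumed to be the JSON text of a complete ChatMessage,
    so a later snapshot simply replaces the earlier one.
    """
    message = None
    for raw in raw_events:
        message = json.loads(raw)  # replace, never concatenate
    return message


# Hand-written sample payloads (illustrative only, not real API output).
sample = [
    '{"sender": "bot", "content": "RAG is"}',
    '{"sender": "bot", "content": "RAG is retrieval-augmented generation."}',
]
final = latest_snapshot(sample)
print(final["content"])
```

A client that joins late or reconnects only needs the newest event to show the correct state, which is why no delta bookkeeping appears in the sketch.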
Basic Streaming Client
import json
import os

import requests
import sseclient

TENANT = "zetaalpha"
url = f"https://api.zeta-alpha.com/v0/service/chat/stream?tenant={TENANT}"

response = requests.post(
    url,
    headers={
        "accept": "text/event-stream",
        "Content-Type": "application/json",
        "X-Auth": os.getenv("ZETA_ALPHA_API_KEY"),
    },
    json={
        "agent_identifier": "custom_agent",
        "conversation": [
            {"sender": "user", "content": "What is RAG?"}
        ],
    },
    stream=True,
)
response.raise_for_status()

client = sseclient.SSEClient(response)
message = None
for event in client.events():
    # Control events (stream_status, error) do not carry a ChatMessage
    if event.event != "new_message":
        continue
    message = json.loads(event.data)

    # Check for tool execution status
    for part in message.get("content_parts") or []:
        if part["type"] == "tool":
            tool = part["tool"]
            print(f"[{tool['status']}] {tool['name']}: {tool.get('display_text', '')}")

    # Check for dynamically retrieved context
    for part in message.get("content_parts") or []:
        if part["type"] == "context" and part.get("context"):
            print(f"Agent found context: {part['context']}")

# Final message: the last snapshot holds the complete response
if message is not None:
    print(message["content"])
    print(message.get("evidences"))
Discovering the current turn
When a client lands on or reopens a conversation without already following its stream, probe the conversation's slot to find out what — if anything — is happening before deciding what to do:
GET /chat/stream/status?session_id={session_id}&tenant={tenant}
| Response | Meaning | Do |
|---|---|---|
| 200 · status: "running" | A turn is being produced right now (possibly by another tab or server instance). | Attach: follow GET /chat/stream/{message_id} (see Reconnection). |
| 200 · status: "dead" | A turn was in flight but its producer stopped. | Resume it (stateful) or resend (stateless); see Resuming an Interrupted Turn. |
| 200 · status: "errored" | The last turn failed. | Surface the failure; to try again, send a new turn (not resume=true). |
| 200 · status: "done" | The last turn finished and is persisted. | Load the transcript with GET /conversations/{session_id}. |
| 404 | No turn in flight and nothing recent to replay. | Load the conversation; start a new turn when the user sends one. |
Starting a turn while one is already running is refused: POST /chat/stream returns 409 Conflict with the occupying turn in the body (detail.message_id, which may be null if it has not been assigned yet). Attach to that turn instead of starting a second one.
Only dead is resumable. errored and gone are terminal — for errored the turn failed (retry with a new turn); for gone (see Stream Outcome) it succeeded (read it from the conversation). Never send resume=true for either.
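The table above can be turned into a small dispatch function. The sketch below assumes the 200 response body is a JSON object with a top-level status field. The action labels it returns ("attach", "resume_or_resend", and so on) are invented for this example and are not values defined by the API.

```python
def next_action(http_status, body=None):
    """Map a GET /chat/stream/status response to a client action label.

    The labels are hypothetical names for this sketch; the mapping itself
    follows the response table in this section.
    """
    if http_status == 404:
        # No turn in flight and nothing recent to replay.
        return "load_conversation"
    if http_status != 200:
        raise ValueError(f"unexpected status code: {http_status}")

    status = (body or {}).get("status")
    actions = {
        "running": "attach",             # follow GET /chat/stream/{message_id}
        "dead": "resume_or_resend",      # the only resumable state
        "errored": "surface_failure",    # terminal; retry with a new turn
        "done": "load_conversation",     # read the persisted transcript
    }
    if status not in actions:
        raise ValueError(f"unexpected turn status: {status!r}")
    return actions[status]


print(next_action(200, {"status": "running"}))
```

Raising on unknown values keeps the client from silently resuming a turn whose state it does not understand.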
Reconnection
If a client disconnects mid-stream (network issue, timeout), it can reconnect and resume from where it left off using the GET /chat/stream/{message_id} endpoint:
GET /chat/stream/{message_id}?tenant={tenant}&start_index={last_event_index + 1}
| Parameter | Description |
|---|---|
| message_id | The message_id from the SSE event id field (the part before the colon) |
| start_index | The event index to resume from (0-based). Pass last_received_index + 1 to avoid duplicates. |
Reconnection works across server instances: when the original in-memory buffer is gone, the stream is replayed from a durable recording of the turn — following it live while the turn is still being produced. Completed turns stay replayable for a limited time after they finish; once the recording expires, the endpoint returns 404.
Extracting the message_id
The SSE event id field has the format {message_id}:{event_index}. Parse the message_id from the first event:
event_id = event.id # e.g. "abc123:0"
message_id, event_index = event_id.rsplit(":", 1)
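Building on the id format, a client can track the last event it received and derive the reconnection path from it. This is a sketch only: parse_event_id and reconnect_path are hypothetical helper names, and the path is shown relative to the service base URL.

```python
def parse_event_id(event_id):
    """Split an SSE id of the form {message_id}:{event_index} into its parts."""
    message_id, index = event_id.rsplit(":", 1)
    return message_id, int(index)


def reconnect_path(last_event_id, tenant):
    """Build the GET path that resumes right after the last received event."""
    message_id, last_index = parse_event_id(last_event_id)
    # start_index is 0-based; last index + 1 avoids replaying a duplicate.
    return f"/chat/stream/{message_id}?tenant={tenant}&start_index={last_index + 1}"


print(reconnect_path("abc123:4", "zetaalpha"))
```

Splitting from the right with rsplit keeps the helper correct even if a message_id were to contain a colon itself.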
Reconnection Example
import json
import os

import requests
import sseclient

TENANT = "zetaalpha"
BASE_URL = "https://api.zeta-alpha.com/v0/service/chat/stream"

headers = {
    "accept": "text/event-stream",
    "Content-Type": "application/json",
    "X-Auth": os.getenv("ZETA_ALPHA_API_KEY"),
}

# Resume from event index 5
message_id = "previously-received-message-id"
start_index = 5

response = requests.get(
    f"{BASE_URL}/{message_id}?tenant={TENANT}&start_index={start_index}",
    headers=headers,
    stream=True,
)
response.raise_for_status()

client = sseclient.SSEClient(response)
for event in client.events():
    if event.event == "stream_status":
        # Replayed streams end with a control event, not a ChatMessage
        print("Turn ended:", json.loads(event.data)["reason"])
        break
    if event.event == "error":
        # Connection-level failure; the data is plain text, not JSON
        print("Stream error:", event.data)
        break
    message = json.loads(event.data)
    print(message["content"])
Stream Outcome
A reconnected stream served from the recording ends with a final stream_status event telling the client how the turn ended:
event: stream_status
data: {"reason": "done"}
| Reason | Meaning | Client action |
|---|---|---|
| done | The turn completed; the last new_message event carried the full response. | Render and finish. |
| errored | The agent failed mid-turn; the events received so far are all there is. The turn is terminal and not resumable. | Surface the failure. To try again, send a new turn (a fresh POST /chat/stream), not resume=true. |
| dead | The server instance producing the turn stopped mid-flight; the turn may be recoverable. | Resume the turn (stateful) or resend the request (stateless). |
| gone | The turn finished and its recording was superseded or expired, so it cannot be replayed from here. | Refetch the conversation (GET /conversations/{session_id}) to read the final answer. Do not resume. |
Streams served live — the original POST, or a GET that attaches to the live buffer — do not emit stream_status: the connection closing after the final new_message already means the turn is done.
A 404 from GET /chat/stream/{message_id} carries the same meaning as dead when the turn was recently in flight: the turn cannot be reattached because the producing instance died, the recording expired, or the message_id is unknown. A 410 carries the same meaning as gone: the turn finished and its slot was superseded or reaped — refetch the conversation (GET /conversations/{session_id}) instead of reconnecting, and never resume.
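One way to apply these rules is to reduce a finished stream to a pair of last snapshot and outcome reason. The sketch below assumes events are already parsed into (event_type, data) pairs. The consume helper and the REASON_FOR_HTTP_STATUS table are hypothetical names for this example, and a stream that closes cleanly without stream_status is treated as done, as described above for live streams.

```python
import json

# HTTP statuses from GET /chat/stream/{message_id} and the reason they stand for.
REASON_FOR_HTTP_STATUS = {404: "dead", 410: "gone"}


def consume(events):
    """Return (last_message, reason) for an iterable of (event_type, data) pairs."""
    message = None
    reason = None
    for event_type, data in events:
        if event_type == "new_message":
            message = json.loads(data)  # full snapshot, replaces the previous one
        elif event_type == "stream_status":
            reason = json.loads(data)["reason"]
            break  # stream_status is the final event of a replayed stream
    # Assumption: a clean close without stream_status is a live stream that finished.
    return message, reason or "done"


# Hand-written events standing in for a replayed stream (illustrative only).
replayed = [
    ("new_message", '{"content": "partial answer"}'),
    ("stream_status", '{"reason": "dead"}'),
]
print(consume(replayed))
```

A caller would resume or resend only when the reason is dead, and refetch the conversation when it is gone.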
Resuming an Interrupted Turn
In a stateful session, a turn whose producer died — stream_status with reason dead, or 404 on reconnect — can be resumed rather than restarted. POST to the streaming endpoint with resume=true and the turn's message_id in the query, and the session in the body:
POST /chat/stream?tenant={tenant}&resume=true&message_id={message_id}
{
  "agent_identifier": "custom_agent",
  "stateful": true,
  "session_id": "the-session-id",
  "conversation": [{ "sender": "user", "content": "the message that started the turn" }]
}
The service takes the turn over under the same message_id — fencing the previous producer in case it is still running — and re-drives it from the session's last checkpoint: agent work completed before the crash (tool calls, retrieved context, generated text) is restored rather than recomputed, and the body conversation is ignored. If the turn crashed before the first checkpoint was saved, the service falls back to running a fresh turn with the messages from the body — so include the user message that started the turn.
Resending the conversation does not duplicate the user message. When a checkpoint exists, the body conversation is dropped: the user turn is already recorded in the checkpoint, and the agent continues from there. The conversation is used only as a seed when the turn died before its first checkpoint, and in that case nothing was persisted yet, so there is still exactly one copy. You send it because the client cannot know whether a checkpoint was saved before the crash; the service decides, and it never appends the message twice.
The response is a normal SSE stream under the same message_id, so the reconnection and cancellation endpoints keep working unchanged.
import os

import requests

TENANT = "zetaalpha"
BASE_URL = "https://api.zeta-alpha.com/v0/service/chat/stream"

# Placeholders: use the interrupted turn's message_id and its session
message_id = "the-message-id-from-stream"
session_id = "the-session-id"

response = requests.post(
    f"{BASE_URL}?tenant={TENANT}&resume=true&message_id={message_id}",
    headers={
        "accept": "text/event-stream",
        "Content-Type": "application/json",
        "X-Auth": os.getenv("ZETA_ALPHA_API_KEY"),
    },
    json={
        "agent_identifier": "custom_agent",
        "stateful": True,
        "session_id": session_id,
        "conversation": [
            {"sender": "user", "content": "What is RAG?"}
        ],
    },
    stream=True,
)
response.raise_for_status()
In stateless mode there is nothing to resume from — the service holds no state for the conversation. Resend the original request instead; it runs as a new turn with a new message_id.
Cancellation
To cancel a running stream before the agent finishes:
DELETE /chat/stream/{message_id}?tenant={tenant}
This stops the agent's generation and cleans up server-side resources. The endpoint returns 204 No Content on success.
import os

import requests

TENANT = "zetaalpha"
message_id = "the-message-id-from-stream"

requests.delete(
    f"https://api.zeta-alpha.com/v0/service/chat/stream/{message_id}?tenant={TENANT}",
    headers={"X-Auth": os.getenv("ZETA_ALPHA_API_KEY")},
)
Turn lifetime
A turn runs to completion on the server even if no client is connected — disconnecting or closing the tab does not cancel it. Reconnect at any time (see Reconnection, or GET /chat/stream/status to discover the conversation's current turn) to keep watching. To stop a turn before it finishes, use the cancellation endpoint.
Error Handling
If the connection serving the stream fails, an error event is sent:
event: error
data: Internal streaming error
Clients should handle this event by closing the connection and reconnecting via GET /chat/stream/{message_id} — the turn may still be producing on the server. Note the difference from stream_status: error reports a failure of this connection, while stream_status reports the outcome of the turn itself.
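The handling described above can be sketched as a small reconnect loop. The sketch assumes a hypothetical open_stream(message_id, start_index) callable that wraps GET /chat/stream/{message_id} and yields (event_type, event_id, data) tuples. Both the callable and the tuple shape are assumptions for illustration, and a fake stream stands in for the network.

```python
def follow_turn(open_stream, message_id, max_attempts=3):
    """Follow a turn to its end, reconnecting after connection-level error events.

    Returns the data of the last new_message event that was received.
    """
    next_index = 0
    last_data = None
    for _ in range(max_attempts):
        interrupted = False
        for event_type, event_id, data in open_stream(message_id, next_index):
            if event_type == "error":
                interrupted = True  # this connection failed; the turn may still run
                break
            if event_type == "new_message":
                last_data = data
                next_index = int(event_id.rsplit(":", 1)[1]) + 1
            elif event_type == "stream_status":
                return last_data  # the turn itself reported its outcome
        if not interrupted:
            return last_data  # stream closed cleanly
    raise RuntimeError("stream kept failing; giving up")


calls = []


def fake_open(message_id, start_index):
    """Fake stream: fails after two events, then replays from the requested index."""
    calls.append(start_index)
    if start_index == 0:
        return [
            ("new_message", "m1:0", "a"),
            ("new_message", "m1:1", "ab"),
            ("error", None, "Internal streaming error"),
        ]
    return [("new_message", "m1:2", "abc"), ("stream_status", None, '{"reason": "done"}')]


result = follow_turn(fake_open, "m1")
print(result, calls)
```

A real client would likely wait for the interval given in the retry field before each reconnect; the sketch omits the delay to stay short.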