Conversations
The Chat API supports two ways to run a multi-turn conversation:
- Stateless (default) — the service keeps nothing between turns. Your client owns the conversation history and sends all of it on every request.
- Stateful — the service persists the agent's working state (transcript, tool results, memory) under a session. Your client sends only the new user message on each turn.
Both modes share the same two identifiers, and both work on the streaming and non-streaming endpoints alike.
Identifiers
| Identifier | Names | Lifetime |
|---|---|---|
| session_id | The conversation | Stable across all turns of one conversation |
| message_id | One turn | Unique per request |
Every agent response message carries both. The rules:
- message_id is always assigned by the service. The request body never carries one: every POST is a new turn. Read the message_id from the response and use it to reconnect to a stream, cancel it, or submit feedback.
- session_id is assigned on the first turn and echoed by the client afterwards. Omit it on the first request and the service mints one. Send it back on every later request of the same conversation, in either mode. In stateless mode it groups the conversation's turns for tracing and analytics; in stateful mode it is additionally the key under which the session's state is persisted.
Reading the identifiers from a response:
# Non-streaming
agent_message = response.json()["conversation"][-1]
session_id = agent_message["session_id"]
message_id = agent_message["message_id"]
# Streaming: every SSE event's data carries both fields
message = json.loads(event.data)
session_id = message["session_id"]
message_id = message["message_id"]
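Because every SSE event carries both identifiers, a client can capture them from the first event and keep the message_id for a later reconnect or cancel. The sketch below is one way to do that from raw SSE lines, assuming each event's payload is a single-line JSON object on a `data:` line. The helper name `first_ids_from_sse_lines` and the sample values are hypothetical, not part of the API.

```python
import json
from typing import Iterable, Optional, Tuple


def first_ids_from_sse_lines(lines: Iterable[str]) -> Optional[Tuple[str, str]]:
    """Return (session_id, message_id) from the first SSE data line, or None.

    Hypothetical helper: assumes one JSON object per "data:" line.
    """
    for line in lines:
        # Skip comments, blank separators, and non-data fields.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if not payload:
            continue
        message = json.loads(payload)
        return message["session_id"], message["message_id"]
    return None


# Made-up sample lines standing in for a real stream.
sample = [
    ": keep-alive",
    "",
    'data: {"session_id": "s-1", "message_id": "m-1", "content": "BERT is"}',
    'data: {"session_id": "s-1", "message_id": "m-1", "content": "BERT is a"}',
]
ids = first_ids_from_sse_lines(sample)
```

An SSE client library that already exposes `event.data` makes the line handling unnecessary; only the `json.loads` step and the two field reads remain.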
Stateless Conversations
The default. Each request is self-contained: conversation must carry the full history, with the new user message last. The service derives everything from the request, answers, and forgets.
import os
import requests

TENANT = "zetaalpha"
url = f"https://api.zeta-alpha.com/v0/service/chat/response?tenant={TENANT}"
headers = {
    "Content-Type": "application/json",
    "X-Auth": os.getenv("ZETA_ALPHA_API_KEY"),
}

# Turn 1
response = requests.post(url, headers=headers, json={
    "agent_identifier": "custom_agent",
    "conversation": [
        {"sender": "user", "content": "What is BERT?"}
    ],
})
first_answer = response.json()["conversation"][-1]
session_id = first_answer["session_id"]

# Turn 2: full history, plus the session_id from the previous answer
response = requests.post(url, headers=headers, json={
    "agent_identifier": "custom_agent",
    "session_id": session_id,
    "conversation": [
        {"sender": "user", "content": "What is BERT?"},
        {"sender": "bot", "content": first_answer["content"]},
        {"sender": "user", "content": "How does it differ from GPT?"}
    ],
})
Echoing the session_id in stateless mode is optional but recommended: it costs nothing and keeps the conversation's turns grouped in traces and analytics. It is never used to load or save state — if you forget it, the only consequence is that each turn is traced as its own conversation.
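The bookkeeping a stateless client does can be wrapped in a small helper. The following is a minimal sketch, not part of the API: the class `StatelessConversation` and its methods are hypothetical, and it only builds request bodies (no network calls), using the field names shown above.

```python
from typing import Dict, List, Optional


class StatelessConversation:
    """Client-side history for stateless mode (hypothetical helper)."""

    def __init__(self, agent_identifier: str) -> None:
        self.agent_identifier = agent_identifier
        self.session_id: Optional[str] = None
        self.history: List[Dict[str, str]] = []

    def build_request(self, user_text: str) -> dict:
        """Return a request body: full history with the new user message last.

        Does not mutate the history, so the same body can be rebuilt on retry.
        """
        body = {
            "agent_identifier": self.agent_identifier,
            "conversation": self.history + [{"sender": "user", "content": user_text}],
        }
        if self.session_id is not None:
            # Optional in stateless mode; keeps turns grouped in traces.
            body["session_id"] = self.session_id
        return body

    def record_turn(self, user_text: str, agent_message: dict) -> None:
        """Store a completed turn and remember the session_id for later turns."""
        self.session_id = agent_message["session_id"]
        self.history.append({"sender": "user", "content": user_text})
        self.history.append({"sender": "bot", "content": agent_message["content"]})


conv = StatelessConversation("custom_agent")
first = conv.build_request("What is BERT?")
# Stand-in for response.json()["conversation"][-1]; the values are made up.
conv.record_turn("What is BERT?", {
    "session_id": "s-1", "message_id": "m-1",
    "sender": "bot", "content": "BERT is an encoder model.",
})
second = conv.build_request("How does it differ from GPT?")
```

Recording a turn only after the answer arrives keeps a retry safe: a failed request leaves the history unchanged, so rebuilding the body produces the same payload.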
Stateful Sessions
Send "stateful": true and the service persists the agent's working state under the session. After the first turn, the client no longer sends history.
First turn — omit session_id; the service mints one:
response = requests.post(url, headers=headers, json={
    "agent_identifier": "custom_agent",
    "stateful": True,
    "conversation": [
        {"sender": "user", "content": "What is BERT?"}
    ],
})
session_id = response.json()["conversation"][-1]["session_id"]
Follow-up turns — send the session_id and only the new user message:
response = requests.post(url, headers=headers, json={
    "agent_identifier": "custom_agent",
    "stateful": True,
    "session_id": session_id,
    "conversation": [
        {"sender": "user", "content": "How does it differ from GPT?"}
    ],
})
The service restores the session's transcript and appends the incoming messages to it. If you send the full history on a follow-up turn, every prior message is duplicated in the model's context — degrading answers and inflating cost.
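One way to avoid accidentally resending history is to build stateful request bodies through a function that only ever accepts the new user message. This is a sketch under that assumption; the function name `stateful_request` is hypothetical, and the body fields are the ones shown in the examples above.

```python
from typing import Optional


def stateful_request(agent_identifier: str, user_text: str,
                     session_id: Optional[str] = None) -> dict:
    """Build a stateful-mode request body carrying only the new user message.

    Hypothetical helper: pass session_id=None on the first turn so the
    service mints one, then pass the returned session_id on follow-ups.
    """
    body = {
        "agent_identifier": agent_identifier,
        "stateful": True,
        "conversation": [{"sender": "user", "content": user_text}],
    }
    if session_id is not None:
        body["session_id"] = session_id
    return body


first_body = stateful_request("custom_agent", "What is BERT?")
# "s-1" is a made-up value standing in for the session_id from the first response.
follow_up = stateful_request("custom_agent", "How does it differ from GPT?", "s-1")
```

Since the function has no parameter for prior messages, a follow-up body always contains exactly one conversation entry.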
What Each Side Stores
| Mode | Client keeps | Service keeps |
|---|---|---|
| Stateless | Full conversation history, session_id (optional) | Nothing |
| Stateful | session_id, rendered messages for display | Transcript, tool results, agent working state |
In both modes, keep the message_id of the turn currently in flight if you intend to reconnect to or cancel its stream.
Choosing a Mode
Stateless fits clients that already manage conversation history — it gives full control over what the model sees each turn and requires no recovery logic beyond retrying a request.
Stateful fits long-running agent turns and conversations with heavy tool use: requests stay small, the agent's intermediate work (tool results, retrieved context) survives between turns without round-tripping through the client, and interrupted turns can be resumed from the last checkpoint instead of restarted.
Recovery
When a client reopens a conversation or drops mid-turn, first find out what the conversation's turn is doing with GET /chat/stream/status — see Discovering the current turn, which maps every state to the right action. In short:
- Either mode: reconnect to a live turn's stream with GET /chat/stream/{message_id} (see Reconnection).
- Stateless: if the turn itself was lost (the stream returns 404), resend the request. The history is on your side, so nothing is lost.
- Stateful: if the turn was lost server-side (dead, e.g. the producing pod died), resume it: the service re-drives the turn from the session's last checkpoint under the same message_id. See Resuming an Interrupted Turn.
- Terminal turns are not resumable: an errored turn failed, so retry by sending a new turn; a gone turn already finished, so read its answer from GET /conversations/{session_id}. Never use resume=true for either.
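The recovery rules above can be condensed into a small decision function. The sketch below is a simplification, not the service's state machine: the enum `TurnState`, its `LIVE` and `LOST` labels, and the returned action strings are hypothetical. `LOST` stands for both the stateless 404 case and the stateful dead case. The authoritative mapping is the one in Discovering the current turn.

```python
from enum import Enum


class TurnState(Enum):
    """Simplified, hypothetical labels for what the current turn is doing."""
    LIVE = "live"        # stream still being produced
    LOST = "lost"        # stateless: stream returns 404; stateful: dead
    ERRORED = "errored"  # terminal: the turn failed
    GONE = "gone"        # terminal: the turn already finished


def recovery_action(stateful: bool, state: TurnState) -> str:
    """Map (mode, turn state) to the client action described in the list above."""
    if state is TurnState.LIVE:
        return "reconnect"          # GET /chat/stream/{message_id}
    if state is TurnState.ERRORED:
        return "send_new_turn"      # terminal, never resume
    if state is TurnState.GONE:
        return "read_conversation"  # GET /conversations/{session_id}
    if state is TurnState.LOST:
        # Stateful sessions resume from the last checkpoint;
        # stateless clients hold the history and simply resend.
        return "resume" if stateful else "resend_request"
    raise ValueError(f"unknown turn state: {state!r}")


action = recovery_action(stateful=True, state=TurnState.LOST)
```

Keeping the terminal states ahead of the mode check makes it harder to resume an errored or gone turn by mistake.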