How to Stream Tool Progress

This guide covers how to stream real-time feedback to users when your agent executes tools. Tool streaming displays status messages like "Searching for documents..." that update to "Found 15 results" when complete.

Prerequisites

Before you begin, make sure you are familiar with Creating Custom Sources.

Step 1: Mark Tools as Streamable

Use the @streamable decorator to define display text templates for your tools:

from zav.agents_sdk import streamable

@streamable(
    running_text="Searching for {{ query }}...",
    completed_text="Found {{ result_count }} results for '{{ query }}'"
)
async def search_documents(query: str) -> dict:
    """Search for documents matching the query."""
    # perform_search stands in for your own search backend (not defined here)
    results = await perform_search(query)
    return {"result_count": len(results), "documents": results}

Templates use Jinja2 syntax: {{ variable }} refers to a tool parameter or a field of the return value. Nested access is supported (e.g. {{ data.count }}, {{ results[0].title }}).
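To make the template context concrete, here is a minimal sketch of placeholder substitution. It is a simplified stand-in written with the standard library, not Jinja2 itself and not the SDK's renderer; the helper names lookup and render are hypothetical. It assumes the running template sees only the tool parameters, while the completed template sees both the parameters and the returned fields, as the search_documents example suggests.

```python
import re

# Matches "{{ path }}" placeholders; a path may use dots and [index] access.
_PLACEHOLDER = re.compile(r"\{\{\s*(.+?)\s*\}\}")
_STEP = re.compile(r"([A-Za-z_]\w*)|\[(\d+)\]")


def lookup(context, path):
    """Resolve a path such as 'documents[0].title' against nested dicts and lists."""
    value = context
    for name, index in _STEP.findall(path):
        value = value[int(index)] if index else value[name]
    return value


def render(template, context):
    """Replace every placeholder with the value it points to in context."""
    return _PLACEHOLDER.sub(lambda m: str(lookup(context, m.group(1))), template)


params = {"query": "solar panels"}
result = {"result_count": 2, "documents": [{"title": "Intro"}, {"title": "FAQ"}]}

# While the tool runs, only its input parameters are known.
running = render("Searching for {{ query }}...", params)

# Once it completes, parameters and returned fields are both available.
completed = render(
    "Found {{ result_count }} results for '{{ query }}', first: {{ documents[0].title }}",
    {**params, **result},
)

print(running)
print(completed)
```

Real Jinja2 supports far more (filters, conditionals, attribute lookup on objects); the sketch only covers the variable paths mentioned above.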

Alternatively, pass the streaming config explicitly when registering:

from zav.agents_sdk import ToolStreamingConfig

tools_registry.add(
    executable=search_documents,
    streaming_config=ToolStreamingConfig(
        running_text="Searching for {{ query }}...",
        completed_text="Found {{ result_count }} results"
    )
)

Step 2: Enable Tool Progress in Your Agent

Set stream_tool_progress=True in the complete() call and use chat_response.to_chat_message() to convert responses:

from zav.agents_sdk import ChatAgentClassRegistry, ChatMessage, StreamableChatAgent
from zav.agents_sdk.adapters import ZAVChatCompletionClient

@ChatAgentClassRegistry.register()
class SearchAgent(StreamableChatAgent):
    agent_name = "search_agent"

    def __init__(self, client: ZAVChatCompletionClient):
        self.client = client
        self.tools_registry.add(search_documents)

    async def execute_streaming(self, conversation):
        response_stream = await self.client.complete(
            bot_setup_description="You are a helpful search assistant.",
            messages=conversation,
            tools=self.tools_registry,
            stream=True,
            execute_tools=True,
            stream_tool_progress=True,
        )

        async for chat_response in response_stream:
            if chat_response.error:
                raise chat_response.error

            message = chat_response.to_chat_message()
            if message:
                yield message

The to_chat_message() method includes tool progress in content_parts. Each tool progress entry is a ContentPart with type="tool" and a ContentPartTool payload that carries the following fields:

name: Tool name
tool_call_id: Unique ID for this tool call
status: "running", "completed", or "error"
display_text: The interpolated template text
params: Tool input parameters
response: Tool output (when completed)
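
A consumer typically receives several updates for the same tool call, first with status "running" and later with "completed" or "error". The sketch below shows one way a client could keep only the latest update per tool_call_id. ToolProgress is a hypothetical dataclass that mirrors the fields listed above; it is not the SDK's ContentPartTool class, and latest_by_call is an illustrative helper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToolProgress:
    """Hypothetical stand-in mirroring the ContentPartTool fields."""
    name: str
    tool_call_id: str
    status: str  # "running", "completed", or "error"
    display_text: str
    params: dict
    response: Optional[dict] = None


def latest_by_call(updates):
    """Keep the most recent update for each tool_call_id, in first-seen order."""
    latest = {}
    for update in updates:
        latest[update.tool_call_id] = update  # later updates overwrite earlier ones
    return list(latest.values())


updates = [
    ToolProgress("search_documents", "call_1", "running",
                 "Searching for cats...", {"query": "cats"}),
    ToolProgress("search_documents", "call_1", "completed",
                 "Found 15 results for 'cats'", {"query": "cats"},
                 {"result_count": 15}),
]

for entry in latest_by_call(updates):
    print(f"[{entry.status}] {entry.display_text}")
```

Keying on tool_call_id rather than name keeps two concurrent calls to the same tool from overwriting each other.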

By default, streamed agent responses keep generated text in the message content field and accumulated tool snapshots in content_parts. If your UI needs ordered text and tool entries together in content_parts, pass preserve_streamed_content_parts=True to complete().

Note: stream_tool_events=True is still accepted as a deprecated alias for backwards compatibility. New code should use stream_tool_progress=True.

Step 3: Filter Data Sent to Frontend (Optional)

Tool responses may contain internal data or sensitive information not needed by the frontend. Use transform callbacks to filter what gets included:

from zav.agents_sdk import streamable, include_fields, exclude_fields, hide

# Only include specific fields in the response
@streamable(
    running_text="Searching...",
    completed_text="Found {{ result_count }} results",
    response_transform=include_fields("result_count", "summary"),
)
async def search(query: str) -> dict:
    return {
        "result_count": 10,
        "summary": "Found documents about...",
        "internal_scores": [0.9, 0.8],  # Excluded from frontend
    }

# Exclude specific fields
@streamable(
    running_text="Fetching document...",
    response_transform=exclude_fields("raw_content", "metadata"),
)
async def fetch_document(doc_id: str) -> dict:
    ...

# Hide the entire response
@streamable(
    running_text="Processing...",
    response_transform=hide,
)
async def internal_operation(data: str) -> dict:
    ...
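
For intuition, the field filters above can be thought of as small factories that return a dict-to-dict function. The following is a sketch of that idea only. keep_only, drop, and hide_everything are hypothetical names, not the SDK's include_fields, exclude_fields, or hide, and the assumption that hiding yields None is mine rather than documented behavior.

```python
from typing import Callable, Optional

Transform = Callable[[Optional[dict]], Optional[dict]]


def keep_only(*fields: str) -> Transform:
    """Build a transform that keeps only the named top-level keys."""
    def transform(response: Optional[dict]) -> Optional[dict]:
        if response is None:
            return None
        return {key: value for key, value in response.items() if key in fields}
    return transform


def drop(*fields: str) -> Transform:
    """Build a transform that removes the named top-level keys."""
    def transform(response: Optional[dict]) -> Optional[dict]:
        if response is None:
            return None
        return {key: value for key, value in response.items() if key not in fields}
    return transform


def hide_everything(response: Optional[dict]) -> Optional[dict]:
    """Assumption: hiding a response means sending nothing at all."""
    return None


raw = {
    "result_count": 10,
    "summary": "Found documents about...",
    "internal_scores": [0.9, 0.8],
}

print(keep_only("result_count", "summary")(raw))
print(drop("internal_scores")(raw))
print(hide_everything(raw))
```

An allow-list (keeping named fields) is usually the safer default for sensitive data, because fields added to the tool response later stay hidden until they are listed explicitly.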

You can also write custom transform functions:

def redact_documents(response: dict | None) -> dict | None:
    if response is None:
        return None
    return {
        "result_count": response.get("result_count"),
        "documents": [
            {"id": doc["id"], "title": doc["title"]}
            for doc in response.get("documents", [])
        ]
    }

@streamable(
    running_text="Searching...",
    response_transform=redact_documents,
)
async def search(query: str) -> dict:
    ...
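
Because a transform is a plain function, it can be checked in isolation before it is attached to a tool. The sketch below repeats redact_documents from above (with Optional in place of the union syntax so it also runs on older Python versions) and exercises it with made-up sample data.

```python
from typing import Optional


def redact_documents(response: Optional[dict]) -> Optional[dict]:
    if response is None:
        return None
    return {
        "result_count": response.get("result_count"),
        "documents": [
            {"id": doc["id"], "title": doc["title"]}
            for doc in response.get("documents", [])
        ],
    }


# Made-up sample response; "body" and "debug" must not reach the frontend.
sample = {
    "result_count": 1,
    "documents": [{"id": "d1", "title": "Intro", "body": "full text..."}],
    "debug": {"latency_ms": 12},
}

redacted = redact_documents(sample)
assert redacted == {
    "result_count": 1,
    "documents": [{"id": "d1", "title": "Intro"}],
}
assert redact_documents(None) is None
# A response without "documents" yields an empty list rather than an error.
assert redact_documents({"result_count": 0}) == {"result_count": 0, "documents": []}
```

Note that a document missing "id" or "title" would raise KeyError with this implementation; use doc.get(...) if your tool can return partial documents.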

Info: params_transform and response_transform only affect the params and response fields of the ContentPartTool sent to the frontend, and llm_response_transform only changes what is sent back to the LLM context after tool execution. None of them affects display_text interpolation, which still uses the full tool result.

Step 4: Filter Data Sent Back to the LLM (Optional)

Some tool responses need to stay available to the frontend but should not be sent back to the LLM in full. For example, a tool might return a large raw payload that a custom frontend renderer needs, while the LLM only needs a compact summary. Use llm_response_transform to filter the tool result that is appended to the LLM conversation after execution:

from zav.agents_sdk import streamable, exclude_fields

@streamable(
    running_text="Rendering molecule...",
    completed_text="Rendered molecule",
    # Frontend still receives the full response, including "data".
    llm_response_transform=exclude_fields("data"),
)
async def render_molecule(identifier: str) -> dict:
    return {
        "format": "pdb",
        "data": "large structure block...",
        "summary": "3D structure loaded from PDB",
    }

response_transform and llm_response_transform are independent:

Transform	Affects	Use when
`response_transform`	`ContentPartTool.response` sent to the frontend	The UI does not need the full response, or a field should not be exposed to clients
`llm_response_transform`	Tool result sent back to the LLM context	The UI needs the full response, but the LLM should receive a smaller or redacted version

If llm_response_transform is not set, the LLM receives the full tool result.
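
The independence of the two transforms can be pictured as two separate views derived from the same full tool result. This is an illustrative model, not the SDK's internals: route_result and without are hypothetical helpers, and the sketch assumes each transform receives the unmodified result.

```python
def without(*fields):
    """Hypothetical helper: build a transform that drops the named keys."""
    return lambda response: {k: v for k, v in response.items() if k not in fields}


def route_result(result, response_transform=None, llm_response_transform=None):
    """Derive the frontend view and the LLM view from one tool result.

    Each transform is applied to the full result, so neither one
    influences the other. A missing transform passes the result through.
    """
    frontend_view = response_transform(result) if response_transform else result
    llm_view = llm_response_transform(result) if llm_response_transform else result
    return frontend_view, llm_view


result = {
    "format": "pdb",
    "data": "large structure block...",
    "summary": "3D structure loaded from PDB",
}

frontend_view, llm_view = route_result(
    result, llm_response_transform=without("data")
)

print(sorted(frontend_view))  # frontend keeps "data" for its renderer
print(sorted(llm_view))       # LLM context gets the compact version
```

Setting both transforms at once is equally valid in this model, for example hiding a field from clients while also trimming a different field from the LLM context.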

MCP Tools

MCP tools discovered at runtime automatically get streaming enabled. To customize the display text, configure tool_streaming in your agent_setups.json:

agent_setups.json
{
  "agent_identifier": "mcp_agent",
  "agent_name": "mcp_agent",
  "agent_configuration": {
    "mcp_tools_provider_configuration": {
      "servers": [...],
      "tool_streaming": {
        "read_file": {
          "running_text": "Reading file {{ path }}...",
          "completed_text": "Loaded file content"
        },
        "list_directory": {
          "running_text": "Listing directory {{ path }}...",
          "completed_text": "Found {{ count }} items"
        }
      }
    }
  }
}

Behavior:

If tool_streaming is not set (default): all MCP tools get auto-generated streaming text
If tool_streaming is set: only tools explicitly listed will stream
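
The two rules above amount to a simple lookup. The sketch below models them under stated assumptions: streaming_text_for is a hypothetical helper, and the auto-generated wording ("Running ...", "Finished ...") is invented for illustration, since the actual default text is not specified here.

```python
from typing import Optional


def streaming_text_for(tool_name: str, tool_streaming: Optional[dict]) -> Optional[dict]:
    """Return the display templates for an MCP tool, or None if it should not stream."""
    if tool_streaming is None:
        # Default: every tool streams with generated text (wording is made up).
        return {
            "running_text": f"Running {tool_name}...",
            "completed_text": f"Finished {tool_name}",
        }
    # Explicit configuration: only listed tools stream.
    return tool_streaming.get(tool_name)


configured = {
    "read_file": {
        "running_text": "Reading file {{ path }}...",
        "completed_text": "Loaded file content",
    },
}

print(streaming_text_for("read_file", configured))   # uses the configured templates
print(streaming_text_for("write_file", configured))  # not listed, so no streaming
print(streaming_text_for("write_file", None))        # default, generated text
```

In practice this means that adding a tool_streaming block for one tool silences progress for every other MCP tool, so list each tool whose progress you still want to show.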
