Skip to main content

How to Create Agents that Use Tools

In this guide, we will explore how to create agents that can use external tools or services to enhance their functionality. This is done by declaring the tools' specifications and prompting the LLM to use them in certain situations. In some cases, a tool may be a simple stateless function that can be added to the agent's logic directly. In other cases, a tool may be a complex service that requires its own configuration and initialization. This guide will cover both scenarios.

Prerequisites

Before you begin, make sure you have completed the Getting Started with the Agents SDK tutorial and you are familiar with How to Configure Agents and How to Create Injectable Dependencies.

Step 1: Define the Tool Executor

First, you need to define the tool executor. This can be any python executable code that can be called by the agent to perform a specific task. The tool can be a simple function or an injectable dependency (for cases that require runtime initialization and configuration).

The following is a simple example of a tool that crawls a URL and returns the HTML content:

from typing import Dict, Optional
import httpx

async def url_crawler(url: str) -> str:
"""Crawl the given URL and return the HTML content."""
async with httpx.AsyncClient() as client:
response = await client.get(url)
return response.text

Step 3: Prompting the LLM to Use the Tool

The way to prompt the LLM to use the tool depends on the framework you are using. In this example, we will use the built-in completion client since it's the simplest to use. Refer to the Building and Running Your First RAG Agent tutorial for an introduction to this completion client.

Both tool executor options have an almost identical usage pattern. The only difference is origin of the tool. In the function example, the tool is a simple function that can be added to the agent's logic directly. In the dependency example, the tool is an injectable dependency that will appear in the agent's constructor.

from typing import AsyncGenerator, List

from zav.agents_sdk import ChatAgentFactory, ChatMessage, StreamableChatAgent
from zav.agents_sdk.adapters import ZAVChatCompletionClient
from typing import Dict, Optional
import httpx

async def url_crawler(url: str) -> str:
"""Crawl the given URL and return the HTML content."""
async with httpx.AsyncClient() as client:
response = await client.get(url)
return response.text

# Agent Code
@ChatAgentFactory.register()
class ChatAgent(StreamableChatAgent):
agent_name = "chat_agent"

def __init__(
self, client: ZAVChatCompletionClient
):
self.client = client
self.tools_registry.add(
url_crawler,
description="""Zeta Alpha Docs crawler. To get to the sitemap of the \
documentation you can crawl https://docs.zeta-alpha.com/sitemap.xml.
From there you can navigate to the correct page as needed.""",
)

async def execute_streaming(
self, conversation: List[ChatMessage]
) -> AsyncGenerator[ChatMessage, None]:
response = await self.client.complete(
bot_setup_description="""You are an agent that can crawl websites \
and navigate to the right location to crawl the correct page to answer \
the user's question.""",
messages=conversation,
tools=self.tools_registry,
max_tokens=2048,
stream=True,
execute_tools=True,
)
async for chat_client_response in response:
if chat_client_response.error is not None:
raise chat_client_response.error
if chat_client_response.chat_completion is None:
raise Exception("No response from chat completion client")

yield ChatMessage.from_orm(chat_client_response.chat_completion)

The ChatAgent class provides a ToolsRegistry object for registering the tools that the LLM can use. The add method of the ToolsRegistry can infer the tool's signature and description. In this case we override the description to provide more context about how we want the LLM to use the tool.

In the execute_streaming method, we pass a system prompt via the bot_setup_description parameter. This prompt is used to dictate the behavior of the LLM when interacting with the user. We then pass the ToolsRegistry into the tools parameter and set the execute_tools flag to True. This will automatically handle the tool execution flow for you. Therefore the next response from the completion client will contain the final answer of the LLM, after having seen the tool's output.

info

In this modality, the LLM is fully responsible for deciding if a tool needs to be executed and when. If you want more control, you can check out the ZAVChatCompletionClient reference.

Step 4: Configuring the Tool

When using a dependency as a tool, you may need to add extra configuration from the agent setup. Refer to the How to Configure Agents guide for more information on how to configure agents.

Here is an example of an dev/agent_setups.json file with the secret credentials for the UrlCrawler:

{
"search_agent": {
"agent_identifier": "search_agent",
"agent_name": "search_agent",
"agent_configuration": {
"crawler_headers": {
"Authorization": "Bearer <api key>"
}
}
}
}

Since the api key is sensitive information, it is stored in the dev/agent_setups.json file. The SDK will merge this configuration with the agent setup from agent_setups.json to create the final agent configuration. The SDK will pass this configuration to the create method of the WebSearchToolFactory. when the agent is initialized.