How to Create Agents that Use Tools
In this guide, we will explore how to create agents that can use external tools or services to enhance their functionality. This is done by declaring the tools' specifications and prompting the LLM to use them in certain situations. In some cases, a tool may be a simple stateless function that can be added to the agent's logic directly. In other cases, a tool may be a complex service that requires its own configuration and initialization. This guide will cover both scenarios.
Prerequisites
Before you begin, make sure you have completed the Getting Started with the Agents SDK tutorial and you are familiar with How to Configure Agents and How to Create Injectable Dependencies.
Step 1: Define the Tool Executor
First, you need to define the tool executor. This can be any python executable code that can be called by the agent to perform a specific task. The tool can be a simple function or an injectable dependency (for cases that require runtime initialization and configuration).
The following is a simple example of a tool that crawls a URL and returns the HTML content:
- Function
- Dependency
from typing import Dict, Optional
import httpx
async def url_crawler(url: str) -> str:
"""Crawl the given URL and return the HTML content."""
async with httpx.AsyncClient() as client:
response = await client.get(url)
return response.text
For the dependency example we can add a configurable parameter to pass HTTP headers, such as User-Agent, Authorization, etc. Scroll to the end to see how API keys can be passed securely.
from typing import Dict, Optional
import httpx
from zav.agents_sdk import AgentDependencyFactory, AgentDependencyRegistry
class UrlCrawler:
def __init__(self, headers: Optional[Dict[str, str]] = None):
self.headers = headers
async def crawl(self, url: str) -> str:
"""Crawl the given URL and return the HTML content."""
async with httpx.AsyncClient() as client:
response = await client.get(url, headers=self.headers)
return response.text
class UrlCrawlerFactory(AgentDependencyFactory):
@classmethod
def create(cls, crawler_headers: Optional[Dict[str, str]] = None) -> UrlCrawler:
return UrlCrawler(headers=crawler_headers)
AgentDependencyRegistry.register(UrlCrawlerFactory)
Step 3: Prompting the LLM to Use the Tool
The way to prompt the LLM to use the tool depends on the framework you are using. In this example, we will use the built-in completion client since it's the simplest to use. Refer to the Building and Running Your First RAG Agent tutorial for an introduction to this completion client.
Both tool executor options have an almost identical usage pattern. The only difference is origin of the tool. In the function example, the tool is a simple function that can be added to the agent's logic directly. In the dependency example, the tool is an injectable dependency that will appear in the agent's constructor.
- Function
- Dependency
from typing import AsyncGenerator, List
from zav.agents_sdk import ChatAgentFactory, ChatMessage, StreamableChatAgent
from zav.agents_sdk.adapters import ZAVChatCompletionClient
from typing import Dict, Optional
import httpx
async def url_crawler(url: str) -> str:
"""Crawl the given URL and return the HTML content."""
async with httpx.AsyncClient() as client:
response = await client.get(url)
return response.text
# Agent Code
@ChatAgentFactory.register()
class ChatAgent(StreamableChatAgent):
agent_name = "chat_agent"
def __init__(
self, client: ZAVChatCompletionClient
):
self.client = client
self.tools_registry.add(
url_crawler,
description="""Zeta Alpha Docs crawler. To get to the sitemap of the \
documentation you can crawl https://docs.zeta-alpha.com/sitemap.xml.
From there you can navigate to the correct page as needed.""",
)
async def execute_streaming(
self, conversation: List[ChatMessage]
) -> AsyncGenerator[ChatMessage, None]:
response = await self.client.complete(
bot_setup_description="""You are an agent that can crawl websites \
and navigate to the right location to crawl the correct page to answer \
the user's question.""",
messages=conversation,
tools=self.tools_registry,
max_tokens=2048,
stream=True,
execute_tools=True,
)
async for chat_client_response in response:
if chat_client_response.error is not None:
raise chat_client_response.error
if chat_client_response.chat_completion is None:
raise Exception("No response from chat completion client")
yield ChatMessage.from_orm(chat_client_response.chat_completion)
from typing import AsyncGenerator, List
from zav.agents_sdk import ChatAgentFactory, ChatMessage, StreamableChatAgent
from zav.agents_sdk.adapters import ZAVChatCompletionClient
from typing import Dict, Optional
import httpx
from zav.agents_sdk import AgentDependencyFactory, AgentDependencyRegistry
class UrlCrawler:
def __init__(self, headers: Optional[Dict[str, str]] = None):
self.headers = headers
async def crawl(self, url: str) -> str:
"""Crawl the given URL and return the HTML content."""
async with httpx.AsyncClient() as client:
response = await client.get(url, headers=self.headers)
return response.text
class UrlCrawlerFactory(AgentDependencyFactory):
@classmethod
def create(cls, crawler_headers: Optional[Dict[str, str]] = None) -> UrlCrawler:
return UrlCrawler(headers=crawler_headers)
AgentDependencyRegistry.register(UrlCrawlerFactory)
# Agent Code
@ChatAgentFactory.register()
class ChatAgent(StreamableChatAgent):
agent_name = "chat_agent"
def __init__(self, client: ZAVChatCompletionClient, url_crawler: UrlCrawler):
self.client = client
self.tools_registry.add(
url_crawler.crawl,
description="""Zeta Alpha Docs crawler. To get to the sitemap of the \
documentation you can crawl https://docs.zeta-alpha.com/sitemap.xml.
From there you can navigate to the correct page as needed.""",
)
async def execute_streaming(
self, conversation: List[ChatMessage]
) -> AsyncGenerator[ChatMessage, None]:
response = await self.client.complete(
bot_setup_description="""You are an agent that can crawl websites \
and navigate to the right location to crawl the correct page to answer \
the user's question.""",
messages=conversation,
tools=self.tools_registry,
max_tokens=2048,
stream=True,
execute_tools=True,
)
async for chat_client_response in response:
if chat_client_response.error is not None:
raise chat_client_response.error
if chat_client_response.chat_completion is None:
raise Exception("No response from chat completion client")
yield ChatMessage.from_orm(chat_client_response.chat_completion)
The ChatAgent
class provides a ToolsRegistry
object for registering the tools that the LLM can use. The add
method of the ToolsRegistry
can infer the tool's signature and description. In this case we override the description to provide more context about how we want the LLM to use the tool.
In the execute_streaming
method, we pass a system prompt via the bot_setup_description
parameter. This prompt is used to dictate the behavior of the LLM when interacting with the user. We then pass the ToolsRegistry
into the tools
parameter and set the execute_tools
flag to True
. This will automatically handle the tool execution flow for you. Therefore the next response from the completion client will contain the final answer of the LLM, after having seen the tool's output.
In this modality, the LLM is fully responsible for deciding if a tool needs to be executed and when. If you want more control, you can check out the ZAVChatCompletionClient reference.
Step 4: Configuring the Tool
When using a dependency as a tool, you may need to add extra configuration from the agent setup. Refer to the How to Configure Agents guide for more information on how to configure agents.
Here is an example of an dev/agent_setups.json
file with the secret credentials for the UrlCrawler
:
{
"search_agent": {
"agent_identifier": "search_agent",
"agent_name": "search_agent",
"agent_configuration": {
"crawler_headers": {
"Authorization": "Bearer <api key>"
}
}
}
}
Since the api key is sensitive information, it is stored in the dev/agent_setups.json
file. The SDK will merge this configuration with the agent setup from agent_setups.json
to create the final agent configuration. The SDK will pass this configuration to the create
method of the WebSearchToolFactory
. when the agent is initialized.