Agents SDK API Reference
This reference provides comprehensive documentation of the Agents SDK's API, including base classes, methods, parameters, domain models, and built-in implementations.
Base Classes
ChatAgent
The `ChatAgent` class is the main interface that developers need to implement in order to build an agent. It has a single abstract method `execute` that takes a conversation (a list of `ChatMessage`) and returns a response (a single `ChatMessage`).
Attributes
- agent_name: A class variable that should be set to the name of the agent.
- debug_backend: An optional callable that can be used for debugging.
- tool_registry: An instance of the `ToolsRegistry` class that can be used for registering and retrieving tools.
Methods
- `debug(msg: Any) -> Any`: A method for logging debug messages. When called, it will send the message to the `debug_backend` if it is set.
- `execute(conversation: List[ChatMessage]) -> Optional[ChatMessage]`: This abstract method must be implemented by the agent. It takes a list of `ChatMessage` objects representing the conversation and returns a single `ChatMessage` object as the response. The code can use any dependencies injected into the `__init__` method of the concrete agent class.
Detailed information on what arguments are passed to the `__init__` method of the concrete agent class can be found in the How to Configure Agents guide.
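As a sketch of this interface, the following minimal agent echoes the last user message back. The `ChatMessage` stand-in and the `GreetingAgent` class here are simplified illustrations: the real `ChatMessage` model has more fields, and a real agent would subclass `ChatAgent` and be registered with `ChatAgentFactory`.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class ChatMessage:
    # Simplified stand-in for the SDK's ChatMessage model.
    sender: str
    content: str

class GreetingAgent:
    # Sketch of a concrete agent; in the real SDK this would subclass
    # ChatAgent and could receive injected dependencies via __init__.
    agent_name = "greeting_agent"

    def execute(self, conversation: List[ChatMessage]) -> Optional[ChatMessage]:
        # Respond to the most recent message in the conversation.
        last = conversation[-1]
        return ChatMessage(sender="BOT", content=f"You said: {last.content}")

reply = GreetingAgent().execute([ChatMessage(sender="USER", content="hello")])
# reply.content == "You said: hello"
```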
StreamableChatAgent
The `StreamableChatAgent` class extends `ChatAgent` and adds support for streaming responses. It has an additional abstract method `execute_streaming` that returns an asynchronous generator of `ChatMessage` objects.
Methods
- `execute_streaming(conversation: List[ChatMessage]) -> AsyncGenerator[ChatMessage, None]`: This abstract method must be implemented by the agent. It takes a list of `ChatMessage` objects representing the conversation and returns an asynchronous generator of `ChatMessage` objects as the response. The code can use any dependencies injected into the `__init__` method of the concrete agent class.
The `StreamableChatAgent` class includes an implementation of the `execute` method that internally calls `execute_streaming` and collects the results into a list before returning them. This allows agents that only need to implement `execute_streaming` to work as non-streaming agents as well.
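The fallback behavior can be illustrated with a self-contained sketch. This is a simplified re-implementation of the idea, not the SDK's source; plain strings stand in for `ChatMessage` objects, and the chunks are joined into one string for brevity.

```python
import asyncio
from typing import AsyncGenerator, List

class StreamingEchoAgent:
    # Sketch: execute_streaming yields the response in chunks.
    async def execute_streaming(self, conversation: List[str]) -> AsyncGenerator[str, None]:
        for chunk in ["Hel", "lo, ", "world"]:
            yield chunk

    async def execute(self, conversation: List[str]) -> str:
        # Non-streaming fallback: drain the generator, then combine the chunks.
        chunks = [chunk async for chunk in self.execute_streaming(conversation)]
        return "".join(chunks)

result = asyncio.run(StreamingEchoAgent().execute([]))
# result == "Hello, world"
```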
AgentDependencyFactory
The `AgentDependencyFactory` class is a base class for factories that create injectable dependencies for agents. It provides a `create` method that should be implemented by subclasses to create the required dependency object.
Methods
- `create(*args, **kwargs) -> T`: This abstract method must be implemented by subclasses to create the required dependency object based on the provided configuration parameters. The return type `T` should be the type of the dependency object. The method can take any number of positional and keyword arguments as needed.
Detailed information on what arguments are passed to the `create` method can be found in the How to Create Injectable Dependencies guide.
Class Registration
ChatAgentFactory
The `ChatAgentFactory` class is a factory for registering implementations of `ChatAgent` or `StreamableChatAgent` classes and creating instances of them based on configuration parameters. The `create` method is called inside the REST API handlers to create the required agent instance.
Methods
- `register()`: A decorator for registering agent class implementations with the factory.
All agent classes must be registered with the factory using the `register` decorator in order to be available in the REST API. For example:
```python
@ChatAgentFactory.register()
class MyAgent(ChatAgent):
    ...
```
AgentDependencyRegistry
The `AgentDependencyRegistry` class is used for registering implementations of `AgentDependencyFactory` classes or instances via its `register` method. It is used by the dependency injection system to create and inject dependencies into agent instances.
Methods
- `register(inst_or_cls: Union[Type[AgentDependencyFactory], AgentDependencyFactory])`: A method for registering implementations of `AgentDependencyFactory` classes or instances. It takes either a class or an instance of a factory and registers it for creating dependencies.
Example usage:
```python
class MyDependencyFactory(AgentDependencyFactory):
    def create(self, param1: str, param2: int) -> MyDependency:
        return MyDependency(param1, param2)

AgentDependencyRegistry.register(MyDependencyFactory)
```
Domain Models
ToolsRegistry
The `ToolsRegistry` class is used for registering and managing tools (functions) that can be used by agents and completion clients. This class provides methods to add tools and retrieve them by name.
- `tools_index`: A dictionary that stores the registered tools. The keys are the names of the tools, and the values are the `Tool` objects.
- `add(executable: Callable, name: Optional[str] = None, description: Optional[str] = None)`: Registers a tool with the registry.
  - `executable`: The callable (function) to register as a tool.
  - `name`: An optional name for the tool. If not provided, the qualified name of the callable will be used.
  - `description`: An optional description for the tool. If not provided, the docstring of the callable will be used.
Example usage:
```python
def my_tool(param1: str, param2: int) -> str:
    """
    This is an example tool that takes two parameters and returns a string.
    """
    return f"param1: {param1}, param2: {param2}"

tools_registry = ToolsRegistry()
tools_registry.add(executable=my_tool, name="example_tool", description="An example tool.")
```
Tool
The `Tool` class represents a tool that can be used by agents and completion clients. It contains the following fields:
- `name`: The name of the tool.
- `description`: A description of the tool.
- `executable`: The callable (function) that implements the tool.
- `get_parameters_spec() -> Dict[str, Any]`: Returns a JSON schema of the parameters of the tool. The schema is extracted from the parameters and type hints of the `executable` function. It supports primitive types as well as Pydantic `BaseModel` subclasses.
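The kind of schema that can be derived from type hints is sketched below with the standard library. This helper is an illustration, not the SDK's implementation: it handles only a few primitive types and ignores Pydantic models.

```python
import inspect
from typing import Any, Callable, Dict

# Minimal mapping from Python annotations to JSON-schema types.
TYPE_MAP = {str: "string", int: "integer", float: "number", bool: "boolean"}

def parameters_spec(executable: Callable) -> Dict[str, Any]:
    # Build an object schema from the callable's signature and type hints.
    properties, required = {}, []
    for name, param in inspect.signature(executable).parameters.items():
        properties[name] = {"type": TYPE_MAP.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {"type": "object", "properties": properties, "required": required}

def my_tool(param1: str, param2: int = 0) -> str:
    return f"{param1}:{param2}"

spec = parameters_spec(my_tool)
# spec["properties"] == {"param1": {"type": "string"}, "param2": {"type": "integer"}}
# spec["required"] == ["param1"]
```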
ChatMessage
The `ChatMessage` class represents a message in a conversation. It contains the following fields:
- sender: The sender of the message (`ChatMessageSender`).
- content: The content of the message.
- image_uri: An optional URI of an image associated with the message. This requires the agent to use a multimodal model.
- evidences: An optional list of `ChatMessageEvidence` objects. Agents can use this field to provide evidence for their responses.
- function_call_request: An optional `FunctionCallRequest` object. Some agents may need the client to call a function on their behalf. This object can be used to specify the function to call and its parameters.
- function_specs: An optional `FunctionSpec` object. It contains the specification of the function call returned in `function_call_request`.
ChatMessageSender
The `ChatMessageSender` enum represents the sender of a message. It has the following values:
- USER: The message was sent by the user.
- BOT: The message was sent by the bot.
ChatMessageEvidence
The `ChatMessageEvidence` class represents evidence associated with a message. It contains the following fields:
- document_hit_url: The URL of the document that contains the evidence. Calling this URL should return the document or chunk along with its metadata, content, bounding boxes (for the case of document chunks), etc.
- text_extract: An optional text extract from the document that contains the evidence.
- anchor_text: An optional anchor text that can be used to link the evidence to the agent message.
FunctionCallRequest
The `FunctionCallRequest` class represents a request to call a function. It contains the following fields:
- name: The name of the function to call.
- params: An optional dictionary of parameters to pass to the function.
FunctionSpec
The `FunctionSpec` class represents the specification of a function. The parameters of the function are described using the OpenAPI 3.0 schema. It contains the following fields:
- name: The name of the function.
- description: A description of the function.
- parameters: A dictionary of parameters for the function.
The following is an example of a `FunctionSpec` object:
```json
{
    "name": "get_weather",
    "description": "Get the current weather for a location.",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "The location for which to get the weather."
            }
        },
        "required": ["location"]
    }
}
```
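To illustrate how such a spec relates to an incoming function call, here is a hypothetical helper that checks a call's parameters against the spec's schema. This helper is not part of the SDK; it only demonstrates the shape of the data.

```python
from typing import Any, Dict

def validate_params(spec: Dict[str, Any], params: Dict[str, Any]) -> bool:
    # Reject calls missing a required parameter or passing an undeclared one.
    schema = spec["parameters"]
    if any(key not in params for key in schema.get("required", [])):
        return False
    return all(key in schema["properties"] for key in params)

get_weather_spec = {
    "name": "get_weather",
    "description": "Get the current weather for a location.",
    "parameters": {
        "type": "object",
        "properties": {"location": {"type": "string"}},
        "required": ["location"],
    },
}

validate_params(get_weather_spec, {"location": "Amsterdam"})  # True
validate_params(get_weather_spec, {})                         # False: missing "location"
```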
AgentSetup
The `AgentSetup` class represents the setup configuration for an agent. During local development, this is read from a JSON file. For more details on this file and its relation to the `AgentSetup` class, see the How to Configure Agents guide. The class contains the following fields:
- agent_identifier: A unique identifier for the agent. This is the value that is used in the API to select the agent.
- agent_name: The name of the agent. This should match the `agent_name` attribute in the `ChatAgent` class.
- llm_client_configuration: An optional configuration for the language model client used by the agent (`LLMClientConfiguration`).
- agent_configuration: An optional custom configuration for the agent. This can include any key-value pairs that the agent needs.
- sub_agent_mapping: An optional mapping of sub-agent names to sub-agent identifiers. This is used to assign different configurations to the nested agents.
LLMClientConfiguration
The `LLMClientConfiguration` class represents the configuration for the language model client used by an agent. It contains the following fields:
- vendor: The vendor of the language model (`LLMProviderName`).
- vendor_configuration: Vendor-specific configuration options (`LLMVendorConfiguration`).
- model_configuration: Configuration for the language model (`LLMModelConfiguration`).
LLMProviderName
The `LLMProviderName` enum represents the vendor of the language model. It has the following values:
- OPENAI: The language model is provided by OpenAI.
- ANTHROPIC: The language model is provided by Anthropic.
LLMVendorConfiguration
The `LLMVendorConfiguration` class represents vendor-specific configuration selectors for the language model. It mirrors `LLMProviderName` and contains the following fields:
- openai: An optional configuration for OpenAI (`OpenAIConfiguration`).
- anthropic: An optional configuration for Anthropic (`AnthropicConfiguration`).
OpenAIConfiguration
The `OpenAIConfiguration` class represents the configuration for the OpenAI language model. It contains the following fields:
- openai_api_key: The API key for the OpenAI language model (`EncryptedStr`).
- openai_org: The organization ID for the OpenAI language model (`EncryptedStr`).
- openai_api_type: The type of the OpenAI API (optional).
- openai_api_base: The base URL for the OpenAI API (optional).
- openai_api_version: The version of the OpenAI API (optional).
AnthropicConfiguration
The `AnthropicConfiguration` class represents the configuration for the Anthropic language model. It contains the following fields:
- anthropic_api_key: The API key for the Anthropic language model (`EncryptedStr`).
- anthropic_api_type: The type of the Anthropic API (optional). For example, "bedrock" to use the model via AWS.
- anthropic_api_base: The base URL for the Anthropic API (optional).
- aws_secret_key: The AWS secret key for the Anthropic language model (`EncryptedStr`). This is required if `anthropic_api_type` is set to "bedrock".
- aws_access_key: The AWS access key for the Anthropic language model (`EncryptedStr`). This is required if `anthropic_api_type` is set to "bedrock".
- aws_region: The AWS region for the Anthropic language model (optional). This is required if `anthropic_api_type` is set to "bedrock".
LLMModelConfiguration
The `LLMModelConfiguration` class represents the configuration for the language model. It contains the following fields:
- name: The name of the language model.
- type: The type of the language model (`LLMModelType`).
- temperature: The temperature for sampling from the language model.
- json_output: Whether to output the response in JSON format.
- max_tokens: The maximum number of tokens to generate in the response.
LLMModelType
The `LLMModelType` enum represents the type of the language model. It has the following values:
- CHAT: A chat-based language model.
- PROMPT: A language model that generates the completion of a prompt.
- PROMPT_WITH_LOGITS: A language model that generates the completion of a prompt with logits.
EncryptedStr
The `EncryptedStr` class is a mechanism for preventing accidental leakage of secrets through logs or exception dumps. It is a subclass of `str` that checks whether the string is an encrypted cipher text or a plain secret text. If it is a cipher text, the string object is initialized with it; this is normally used when storing secrets in a database. If the string is a plain-text secret, the string object is initialized with an empty value and the secret is stored as a private value.
Once the `EncryptedStr` object reaches its destination where the plain secret is needed, the `get_unencrypted_secret` method needs to be called. This method will return the plain secret text.
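The pattern can be sketched as follows. This simplified stand-in treats any value starting with a hypothetical "enc:" prefix as cipher text; the prefix, the class, and the absence of real decryption are all illustrative assumptions, not the SDK's actual detection or crypto logic.

```python
class EncryptedStrSketch(str):
    # Simplified illustration of the EncryptedStr idea: plain secrets are
    # hidden from str()/repr() by storing them in a private attribute.
    def __new__(cls, value: str):
        if value.startswith("enc:"):  # hypothetical cipher-text marker
            obj = super().__new__(cls, value)
            obj._secret = None
        else:
            obj = super().__new__(cls, "")  # the visible string stays empty
            obj._secret = value
        return obj

    def get_unencrypted_secret(self) -> str:
        # The real SDK would decrypt cipher text here; this sketch only
        # returns the stored plain secret (or the cipher text as-is).
        return self._secret if self._secret is not None else str(self)

token = EncryptedStrSketch("my-plain-secret")
str(token)                      # "" -- nothing leaks through logs
token.get_unencrypted_secret()  # "my-plain-secret"
```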
Built-in Implementations
ZAVRetriever
The `ZAVRetriever` class is a wrapper around the Zeta Alpha Search API. It provides methods for searching and retrieving documents.
Methods
- `search(...) -> Dict`: Searches for documents based on the provided parameters. The full parameter list can be found in the Document Search API documentation. A simple example of its usage is in the Building and Running Your First RAG Agent tutorial.
- `retrieve(document_hit_url: str) -> Optional[Dict]`: Retrieves a document based on the provided document hit URL. The document hit URL is one of the parameters returned by the `search` method. It is also returned by the agent in the `ChatMessageEvidence` object.
- `list(retrieval_unit: Literal["document", "chunk"] = "document", property_name: str = "id", property_values: List[str] = [], index_id: Optional[str] = None) -> Dict`: Lists documents or chunks based on the provided parameters. For example, to find a chunk by its ID, you could use `list(retrieval_unit="chunk", property_name="id", property_values=["the_chunk_id"])`.
ZAVLangchainStore
The `ZAVLangchainStore` class implements a LangChain document store that provides a LangChain document retriever, `ZAVLangchainRetriever`.
Methods
- `as_retriever(...) -> ZAVLangchainRetriever`: The parameters are the same as those of the `search` method of the `ZAVRetriever` class, except for the `query_string` parameter. The `query_string` parameter is instead passed into the `ZAVLangchainRetriever` via the usual LangChain retriever methods.
Extra arguments passed to `as_retriever` are propagated to LangChain's `BaseRetriever` class.
ZAVLangchainRetriever
The `ZAVLangchainRetriever` class is an implementation of LangChain's `BaseRetriever` class. It provides the ability to retrieve content from the Zeta Alpha platform. Refer to LangChain's documentation for more information on how to use the `BaseRetriever` class.
ZAVChatCompletionClient
The `ZAVChatCompletionClient` class is an implementation of a chat completion client that interacts with the Zeta Alpha API. It provides methods for generating chat completions.
Methods
- `complete(messages: Optional[List[ChatMessage]] = None, completions: Optional[List[ChatCompletion]] = None, max_tokens: int = 2048, bot_setup_description: Optional[str] = None, functions: Optional[List[Dict]] = None, tools: Optional[List[Dict]] = None, tool_choice: Optional[str] = None, stream: Literal[True] = True) -> Union[AsyncIterator[ChatResponse], ChatResponse]`: Generates chat completions based on the provided parameters.
  - `messages`: A list of chat messages that have been exchanged between the user and the bot. This is the same type as the `conversation` parameter in the `ChatAgent.execute` method.
  - `completions`: The history of completions that have been exchanged between the agent and the underlying language model. You should only use one of `messages` or `completions` at a time. `completions` has a type that is closer to what the language model expects. Sometimes, you may want the agent to have an "internal" conversation with the language model, and in that case, you can use `completions`. For example, you may want the language model to decide which internal function the agent should execute next, without communicating this to the user.
  - `max_tokens`: The maximum number of tokens that the language model should generate.
  - `bot_setup_description`: This is mapped to the "system message" or equivalent in the language model. It is a message that is sent to the language model at the beginning of the conversation. It can be used to set up the language model's internal state and behavior.
  - `functions`: A list of functions that the language model could choose for the agent to execute. The schema of the dictionary is the same as the `FunctionSpec` type. You should only pass one of `functions` or `tools` at a time.
  - `tools`: An instance of `ToolsRegistry` which informs the LLM what tools it can use. The LLM will decide which tools to use based on the context. Under the hood, the `ToolsRegistry` converts the tools into a list of schemas that the LLM understands. You can also pass the tools directly as a list of dictionaries, where each element has the schema `{"type": str, "function": FunctionSpec}` ("function" is the only supported type).
  - `tool_choice`: Controls how the language model chooses between the tools provided in the `tools` parameter. The possible values are:
    - `"auto"`: The language model will automatically choose which tool to use.
    - `"required"`: The language model will always choose one of the tools provided in the `tools` parameter.
    - `"none"`: The language model will not choose any tool and will only generate a user-facing message.
    - `{"type": "function", "function": {"name": "<tool name>"}}`: The language model will always choose the tool with the specified name `"<tool name>"`.
  - `stream`: If `True`, the method returns an asynchronous iterator that yields `ChatResponse` objects. Otherwise, the method returns a single `ChatResponse` object.
  - `execute_tools`: (Only used when a `ToolsRegistry` is passed to `tools`.) A boolean flag that indicates whether the client should execute the LLM-selected tools right away. After the tools are executed, the LLM will be called again with the tools' outputs and finally return the response to the agent. The default value is `False`. When disabled, the agent is responsible for executing the tools and returning the response back to the client.
When streaming is enabled, the `ChatResponse` object contains the accumulated completions generated by the language model. The agent code can use these directly without having to implement the accumulation logic.
Types
ChatCompletion
A data class that represents a completion that the language model has generated. It has the following attributes:
- `sender: ChatCompletionSender`: The sender of the completion. The possible values are:
  - `"user"`: The user has sent the completion.
  - `"bot"`: The bot has sent the completion.
  - `"function"`: The completion is a function that the agent has executed.
  - `"tool"`: The completion is a tool that the agent has used.
- `content: str`: The content of the completion.
- `image_url: Optional[str]`: The Data URL of an image that is associated with the completion.
- `function_call_request: Optional[FunctionCallRequest]`: The function call that the language model has chosen for the agent to execute.
- `function_call_response: Optional[FunctionCallResponse]`: The function call response that the agent has received after executing the function call request.
- `tool_call_requests: Optional[List[ToolCallRequest]]`: The tool calls that the language model has chosen for the agent to use.
- `tool_call_responses: Optional[List[ToolCallResponse]]`: The tool call responses that the agent has received after using the tools.
FunctionCallResponse
A data class that represents a function call response. It has the following attributes:
- `name: str`: The name of the function that was called.
- `response: Optional[str]`: The response of the function call. This is typically a stringified JSON object.
ToolCallRequest
A data class that represents a tool call request. It has the following attributes:
- `id: str`: The ID of the tool.
- `function_call_request: FunctionCallRequest`: The function call that the agent should execute using the tool.
ToolCallResponse
A data class that represents a tool call response. It has the following attributes:
- `id: str`: The ID of the tool.
- `response: Optional[str]`: The response of the tool call. This is typically a stringified JSON object.
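To illustrate how these request/response types relate, here is a sketch of an agent-side loop (relevant when `execute_tools` is disabled) that runs each requested tool and pairs the result with the request `id`. The dataclasses are simplified stand-ins: the real `ToolCallRequest` nests the name and params in a `FunctionCallRequest`.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

@dataclass
class ToolCallRequest:
    id: str
    name: str      # simplified: the SDK nests this in a FunctionCallRequest
    params: Dict

@dataclass
class ToolCallResponse:
    id: str
    response: Optional[str]

def execute_tool_calls(requests: List[ToolCallRequest],
                       tools: Dict[str, Callable]) -> List[ToolCallResponse]:
    # Run each requested tool and return a response keyed by the request id,
    # serializing the result as a stringified JSON object as described above.
    responses = []
    for request in requests:
        result = tools[request.name](**request.params)
        responses.append(ToolCallResponse(id=request.id, response=json.dumps(result)))
    return responses

tools = {"add": lambda a, b: {"sum": a + b}}
responses = execute_tool_calls([ToolCallRequest("call_1", "add", {"a": 2, "b": 3})], tools)
# responses[0].response == '{"sum": 5}'
```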
ChatResponse
A data class that represents the response of the `complete` method. It has the following attributes:
- `error: Optional[Exception]`: An exception that occurred during the completion generation process.
- `chat_completion: Optional[ChatCompletion]`: The completion that the language model has generated.