Agents SDK API Reference
This reference provides comprehensive documentation of the Agents SDK's API, including base classes, methods, parameters, domain models, and built-in implementations.
Base Classes
ChatAgent
The ChatAgent class is the main interface that developers need to implement in order to build an agent. It has a single abstract method execute that takes a conversation (list of ChatMessage) and returns a response (single ChatMessage).
Attributes
- agent_name: A class variable that should be set to the name of the agent.
- debug_backend: An optional callable that can be used for debugging.
- tool_registry: An instance of the ToolsRegistry class that can be used for registering and retrieving tools.
Methods
- debug(msg: Any) -> Any: A method for logging debug messages. When called, it sends the message to the debug_backend if it is set.
- execute(conversation: List[ChatMessage]) -> Optional[ChatMessage]: This abstract method must be implemented by the agent. It takes a list of ChatMessage objects representing the conversation and returns a single ChatMessage object as the response. The code can use any dependencies injected into the __init__ method of the concrete agent class.
Detailed information on what arguments are passed to the __init__ method of the concrete agent class can be found in the How to Configure Agents guide.
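The contract above can be sketched as follows. Note that ChatMessage and ChatAgent here are simplified stand-ins so the example is self-contained; the SDK's actual import paths and field sets may differ.

```python
# Self-contained sketch of the ChatAgent contract described above.
# ChatMessage and ChatAgent are simplified stand-ins, not the SDK's
# actual definitions.
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ChatMessage:
    sender: str  # stand-in for ChatMessageSender
    content: str


class ChatAgent(ABC):
    agent_name: str = ""

    @abstractmethod
    def execute(self, conversation: List[ChatMessage]) -> Optional[ChatMessage]:
        ...


class EchoAgent(ChatAgent):
    agent_name = "echo_agent"

    def execute(self, conversation: List[ChatMessage]) -> Optional[ChatMessage]:
        # Respond by echoing the last message in the conversation.
        last = conversation[-1]
        return ChatMessage(sender="BOT", content=f"You said: {last.content}")


agent = EchoAgent()
reply = agent.execute([ChatMessage(sender="USER", content="hello")])
print(reply.content)  # You said: hello
```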
StreamableChatAgent
The StreamableChatAgent class extends ChatAgent and adds support for streaming responses. It has an additional abstract method execute_streaming that returns an asynchronous generator of ChatMessage objects.
Methods
- execute_streaming(conversation: List[ChatMessage]) -> AsyncGenerator[ChatMessage, None]: This abstract method must be implemented by the agent. It takes a list of ChatMessage objects representing the conversation and returns an asynchronous generator of ChatMessage objects as the response. The code can use any dependencies injected into the __init__ method of the concrete agent class.
The StreamableChatAgent class includes an implementation of the execute method that internally calls execute_streaming and collects the results into a list before returning them. This allows agents that only need to implement execute_streaming to work as non-streaming agents as well.
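The wrapping behavior described above can be illustrated with a simplified stand-in (not the SDK's actual classes); for brevity, messages are plain strings and the collected chunks are returned as a list:

```python
# Sketch of how a default execute can wrap execute_streaming and collect
# the streamed chunks, as described above. Simplified stand-ins only.
import asyncio
from typing import AsyncGenerator, List


class StreamableChatAgent:
    async def execute_streaming(self, conversation: List[str]) -> AsyncGenerator[str, None]:
        raise NotImplementedError
        yield  # makes this an async generator function

    async def execute(self, conversation: List[str]) -> List[str]:
        # Collect the streamed chunks so a streaming-only agent also
        # works as a non-streaming agent.
        chunks = []
        async for chunk in self.execute_streaming(conversation):
            chunks.append(chunk)
        return chunks


class WordAgent(StreamableChatAgent):
    async def execute_streaming(self, conversation):
        for word in ["Hello", "world"]:
            yield word


result = asyncio.run(WordAgent().execute(["hi"]))
print(result)  # ['Hello', 'world']
```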
AgentDependencyFactory
The AgentDependencyFactory class is a base class for factories that create injectable dependencies for agents. It provides a create method that should be implemented by subclasses to create the required dependency object.
Methods
- create(*args, **kwargs) -> T: This abstract method must be implemented by subclasses to create the required dependency object based on the provided configuration parameters. The return type T should be the type of the dependency object. The method can take any number of positional and keyword arguments as needed.
Detailed information on what arguments are passed to the create method can be found in the How to Create Injectable Dependencies guide.
Class Registration
ChatAgentClassRegistry
The ChatAgentClassRegistry class is a factory for registering implementations of ChatAgent or StreamableChatAgent classes and creating instances of them based on configuration parameters. The create method is called inside the REST API handlers to create the required agent instance.
Methods
register(): A decorator for registering agent class implementations with the factory.
All agent classes must be registered with the factory using the register decorator in order to be available in the REST API. For example:
```python
@ChatAgentClassRegistry.register()
class MyAgent(ChatAgent):
    ...
```
AgentDependencyRegistry
The AgentDependencyRegistry class is used for registering implementations of AgentDependencyFactory classes or instances via its register method. It is used by the dependency injection system to create and inject dependencies into agent instances.
Methods
- register(inst_or_cls: Union[Type[AgentDependencyFactory], AgentDependencyFactory]): A method for registering implementations of AgentDependencyFactory classes or instances. It takes either a class or an instance of a factory and registers it for creating dependencies.
Example usage:
```python
class MyDependencyFactory(AgentDependencyFactory):
    def create(self, param1: str, param2: int) -> MyDependency:
        return MyDependency(param1, param2)

AgentDependencyRegistry.register(MyDependencyFactory)
```
Domain Models
ToolsRegistry
The ToolsRegistry class is used for registering and managing tools (functions) that can be used by agents and completion clients. This class provides methods to add tools and retrieve them by name.
- tools_index: A dictionary that stores the registered tools. The keys are the names of the tools, and the values are the Tool objects.
- add(executable: Callable, name: Optional[str] = None, description: Optional[str] = None): Registers a tool with the registry.
  - executable: The callable (function) to register as a tool.
  - name: An optional name for the tool. If not provided, the qualified name of the callable will be used.
  - description: An optional description for the tool. If not provided, the docstring of the callable will be used.
Example usage:
```python
def my_tool(param1: str, param2: int) -> str:
    """
    This is an example tool that takes two parameters and returns a string.
    """
    return f"param1: {param1}, param2: {param2}"

tools_registry = ToolsRegistry()
tools_registry.add(executable=my_tool, name="example_tool", description="An example tool.")
```
Tool
The Tool class represents a tool that can be used by agents and completion clients. It contains the following fields:
- name: The name of the tool.
- description: A description of the tool.
- executable: The callable (function) that implements the tool.
- get_parameters_spec() -> Dict[str, Any]: Returns a JSON schema of the parameters of the tool. The schema is extracted from the parameters and type hints of the executable function. It supports primitive types as well as Pydantic BaseModel subclasses.
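The extraction of a parameters schema from type hints can be sketched as follows. This is an illustrative stand-in, not the SDK's get_parameters_spec implementation (which additionally handles Pydantic BaseModel subclasses):

```python
# Illustrative sketch: derive a JSON-schema-style parameters spec from a
# function's signature and type hints, in the spirit of get_parameters_spec.
import inspect
from typing import Any, Dict

# Mapping from primitive Python types to JSON schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def parameters_spec(func) -> Dict[str, Any]:
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        json_type = _JSON_TYPES.get(param.annotation, "string")
        properties[name] = {"type": json_type}
        if param.default is inspect.Parameter.empty:
            # Parameters without defaults are required.
            required.append(name)
    return {"type": "object", "properties": properties, "required": required}


def my_tool(param1: str, param2: int) -> str:
    return f"{param1}: {param2}"


print(parameters_spec(my_tool))
```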
ChatMessage
The ChatMessage class represents a message in a conversation. It contains the following fields:
- sender: The sender of the message (ChatMessageSender).
- content: The content of the message.
- image_uri: An optional URI of an image associated with the message. This requires the agent to use a multimodal model.
- evidences: An optional list of ChatMessageEvidence objects. Agents can use this field to provide evidence for their responses.
- function_call_request: An optional FunctionCallRequest object. Some agents may need the client to call a function on their behalf. This object can be used to specify the function to call and its parameters.
- function_specs: An optional FunctionSpec object. It contains the specification of the function call returned in function_call_request.
ChatMessageSender
The ChatMessageSender enum represents the sender of a message. It has the following values:
- USER: The message was sent by the user.
- BOT: The message was sent by the bot.
ChatMessageEvidence
The ChatMessageEvidence class represents evidence associated with a message. It contains the following fields:
- document_hit_url: The URL of the document that contains the evidence. Calling this URL should return the document or chunk along with its metadata, content, bounding boxes (for the case of document chunks), etc.
- text_extract: An optional text extract from the document that contains the evidence.
- anchor_text: An optional anchor text that can be used to link the evidence to the agent message.
FunctionCallRequest
The FunctionCallRequest class represents a request to call a function. It contains the following fields:
- name: The name of the function to call.
- params: An optional dictionary of parameters to pass to the function.
FunctionSpec
The FunctionSpec class represents the specification of a function. The parameters of the function are described using the OpenAPI 3.0 schema. It contains the following fields:
- name: The name of the function.
- description: A description of the function.
- parameters: A dictionary of parameters for the function.
The following is an example of a FunctionSpec object:
```json
{
    "name": "get_weather",
    "description": "Get the current weather for a location.",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "The location for which to get the weather."
            }
        },
        "required": ["location"]
    }
}
```
AgentSetup
The AgentSetup class represents the setup configuration for an agent. During local development, this is read from a JSON file. For more details on this file and its relation to the AgentSetup class, see the How to Configure Agents guide. The class contains the following fields:
- agent_identifier: A unique identifier for the agent. This is the value that is used in the API to select the agent.
- agent_name: The name of the agent. This should match the agent_name attribute in the ChatAgent class.
- llm_client_configuration: An optional configuration for the language model client used by the agent (LLMClientConfiguration).
- agent_configuration: An optional custom configuration for the agent. This can include any key-value pairs that the agent needs.
- sub_agent_mapping: An optional mapping of sub-agent names to sub-agent identifiers. This is used to assign different configurations to the nested agents.
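Based on the fields above, a local setup file might look like the following. All values, the exact nesting, and the lowercase enum spellings ("openai", "chat") are illustrative assumptions; see the How to Configure Agents guide for the authoritative file format.

```json
{
    "agent_identifier": "my_agent_prod",
    "agent_name": "my_agent",
    "llm_client_configuration": {
        "vendor": "openai",
        "model_configuration": {
            "name": "gpt-4o",
            "type": "chat",
            "temperature": 0.0,
            "max_tokens": 1024
        }
    },
    "agent_configuration": {
        "top_k": 5
    }
}
```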
LLMClientConfiguration
The LLMClientConfiguration class represents the configuration for the language model client used by an agent. It contains the following fields:
- vendor: The vendor of the language model (LLMProviderName).
- vendor_configuration: Vendor-specific configuration options (LLMVendorConfiguration).
- model_configuration: Configuration for the language model (LLMModelConfiguration).
LLMProviderName
The LLMProviderName enum represents the vendor of the language model. It has the following values:
- OPENAI: The language model is provided by OpenAI.
- ANTHROPIC: The language model is provided by Anthropic.
LLMVendorConfiguration
The LLMVendorConfiguration class represents vendor-specific configuration selectors for the language model. It mirrors LLMProviderName and contains the following fields:
- openai: An optional configuration for OpenAI (OpenAIConfiguration).
- anthropic: An optional configuration for Anthropic (AnthropicConfiguration).
OpenAIConfiguration
The OpenAIConfiguration class represents the configuration for the OpenAI language model. It contains the following fields:
- openai_api_key: The API key for the OpenAI language model (EncryptedStr).
- openai_org: The organization ID for the OpenAI language model (EncryptedStr).
- openai_api_type: The type of the OpenAI API (optional).
- openai_api_base: The base URL for the OpenAI API (optional).
- openai_api_version: The version of the OpenAI API (optional).
AnthropicConfiguration
The AnthropicConfiguration class represents the configuration for the Anthropic language model. It contains the following fields:
- anthropic_api_key: The API key for the Anthropic language model (EncryptedStr).
- anthropic_api_type: The type of the Anthropic API (optional). For example, "bedrock" to use the model via AWS Bedrock.
- anthropic_api_base: The base URL for the Anthropic API (optional).
- aws_secret_key: The AWS secret key for the Anthropic language model (EncryptedStr). This is required if anthropic_api_type is set to "bedrock".
- aws_access_key: The AWS access key for the Anthropic language model (EncryptedStr). This is required if anthropic_api_type is set to "bedrock".
- aws_region: The AWS region for the Anthropic language model (optional). This is required if anthropic_api_type is set to "bedrock".
LLMModelConfiguration
The LLMModelConfiguration class represents the configuration for the language model. It contains the following fields:
- name: The name of the language model.
- type: The type of the language model (LLMModelType).
- temperature: The temperature for sampling from the language model.
- json_output: Whether to output the response in JSON format.
- max_tokens: The maximum number of tokens to generate in the response.
LLMModelType
The LLMModelType enum represents the type of the language model. It has the following values:
- CHAT: A chat-based language model.
- PROMPT: A language model that generates the completion of a prompt.
- PROMPT_WITH_LOGITS: A language model that generates the completion of a prompt with logits.
EncryptedStr
The EncryptedStr class is a mechanism for preventing accidental leakage of secrets through logs or exception dumps. It is a subclass of str that checks whether the string is encrypted cipher text or a plain secret. If it is cipher text, the string object is initialized with it; this is normally the case when storing secrets in a database. If it is a plain-text secret, the string object is initialized with an empty value and the secret is stored as a private value.
Once the EncryptedStr object reaches the destination where the plain secret is needed, call the get_unencrypted_secret method, which returns the plain secret text.
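The pattern can be illustrated with a minimal stand-in (not the SDK's implementation). The cipher-text check here is a naive prefix convention chosen only for the sketch, and decryption is omitted:

```python
# Minimal stand-in for the EncryptedStr pattern described above.
class EncryptedStr(str):
    def __new__(cls, value: str):
        if value.startswith("enc:"):  # assumed cipher-text marker (illustrative)
            # Cipher text is safe to expose, e.g. when stored in a database.
            obj = super().__new__(cls, value)
            obj._secret = None
        else:
            # Plain secrets are hidden: the visible string value is empty,
            # so logs and exception dumps never show the secret.
            obj = super().__new__(cls, "")
            obj._secret = value
        return obj

    def get_unencrypted_secret(self) -> str:
        # Return the plain secret where it is actually needed.
        # (Decryption of cipher text is omitted in this sketch.)
        return self._secret if self._secret is not None else str(self)


secret = EncryptedStr("hunter2")
print(repr(str(secret)))                # ''  -> nothing leaks into logs
print(secret.get_unencrypted_secret())  # hunter2
```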
Built-in Implementations
ZAVRetriever
The ZAVRetriever class is a wrapper around the Zeta Alpha Search API. It provides methods for searching and retrieving documents.
Methods
- search(...) -> Dict: Searches for documents based on the provided parameters. The full parameter list documentation can be found in the Document Search API documentation. A simple example of its usage is in the Building and Running Your First RAG Agent tutorial.
- retrieve(document_hit_url: str) -> Optional[Dict]: Retrieves a document based on the provided document hit URL. The document hit URL is one of the parameters returned by the search method. It is also returned by the agent in the ChatMessageEvidence object.
- list(retrieval_unit: Literal["document", "chunk"] = "document", property_name: str = "id", property_values: List[str] = [], index_id: Optional[str] = None) -> Dict: Lists documents or chunks based on the provided parameters. For example, to find a chunk by its ID, you could use list(retrieval_unit="chunk", property_name="id", property_values=["the_chunk_id"]).
ZAVLangchainStore
The ZAVLangchainStore class implements a LangChain document store and provides the LangChain document retriever ZAVLangchainRetriever.
Methods
- as_retriever(...) -> ZAVLangchainRetriever: The parameters are the same as the search method of the ZAVRetriever class, except for the query_string parameter. The query_string parameter is instead passed into the ZAVLangchainRetriever via the usual LangChain retriever methods.
Extra arguments passed to as_retriever are propagated to LangChain's BaseRetriever class.
ZAVLangchainRetriever
The ZAVLangchainRetriever class is an implementation of LangChain's BaseRetriever class. It provides the ability to retrieve content from the Zeta Alpha platform. Refer to LangChain's documentation for more information on how to use the BaseRetriever class.
ZAVChatCompletionClient
The ZAVChatCompletionClient class is an implementation of a chat completion client that interacts with the Zeta Alpha API. It provides methods for generating chat completions.
Methods
- complete(messages: Optional[List[ChatMessage]] = None, completions: Optional[List[ChatCompletion]] = None, max_tokens: int = 2048, bot_setup_description: Optional[str] = None, functions: Optional[List[Dict]] = None, tools: Optional[List[Dict]] = None, tool_choice: Optional[str] = None, stream: Literal[True] = True) -> Union[AsyncIterator[ChatResponse], ChatResponse]: Generates chat completions based on the provided parameters.
- messages: A list of chat messages that have been exchanged between the user and the bot. This is the same type as the conversation parameter in the ChatAgent.execute method.
- completions: The history of completions that have been exchanged between the agent and the underlying language model. You should only use one of messages or completions at a time. completions has a type that is closer to what the language model expects. Sometimes, you may want the agent to have an "internal" conversation with the language model, and in that case, you can use completions. For example, you may want the language model to decide which internal function the agent should execute next, without communicating this to the user.
- max_tokens: The maximum number of tokens that the language model should generate.
- bot_setup_description: This is mapped to the "system message" or equivalent in the language model. It is a message that is sent to the language model at the beginning of the conversation. It can be used to set up the language model's internal state and behavior.
- functions: A list of functions that the language model could choose for the agent to execute. The schema of the dictionary is the same as the FunctionSpec type. You should only pass one of functions or tools at a time.
- tools: An instance of ToolsRegistry which informs the LLM what tools it can use. The LLM will decide which tools to use based on the context. Under the hood, the ToolsRegistry converts the tools into a list of schemas that the LLM understands. You can also pass the tools directly as a list of dictionaries. The schema of each element is:

```
{
    "type": str,  # "function" is the only supported type
    "function": FunctionSpec
}
```

- tool_choice: This parameter is used to control how the language model chooses between the tools that are provided in the tools parameter. The possible values are:
  - "auto": The language model will automatically choose which tool to use.
  - "required": The language model will always choose one of the tools provided in the tools parameter.
  - "none": The language model will not choose any tool and will only generate a user-facing message.
  - {"type": "function", "function": {"name": "<tool name>"}}: The language model will always choose the tool with the specified name "<tool name>".
- stream: If True, the method returns an asynchronous iterator that yields ChatResponse objects. Otherwise, the method returns a single ChatResponse object.
- execute_tools: (Only used when a ToolsRegistry is passed to tools) A boolean flag that indicates whether the client should execute the LLM-selected tools right away. After the tools are executed, the LLM will be called again with the tools' outputs and finally return the response to the agent. The default value is False. When disabled, the agent is responsible for executing the tools and returning the responses back to the client.
When streaming is enabled, the ChatResponse object contains the accumulated completions generated by the language model. The agent code can use these directly without having to implement the accumulation logic.
Types
ChatCompletion
A data class that represents a completion that the language model has generated. It has the following attributes:
- sender: ChatCompletionSender: The sender of the completion. The possible values are:
  - "user": The user has sent the completion.
  - "bot": The bot has sent the completion.
  - "function": The completion is a function that the agent has executed.
  - "tool": The completion is a tool that the agent has used.
- content: str: The content of the completion.
- image_url: Optional[str]: The Data URL of an image that is associated with the completion.
- function_call_request: Optional[FunctionCallRequest]: The function call that the language model has chosen for the agent to execute.
- function_call_response: Optional[FunctionCallResponse]: The function call response that the agent has received after executing the function call request.
- tool_call_requests: Optional[List[ToolCallRequest]]: The tool calls that the language model has chosen for the agent to use.
- tool_call_responses: Optional[List[ToolCallResponse]]: The tool call responses that the agent has received after using the tools.
FunctionCallResponse
A data class that represents a function call response. It has the following attributes:
- name: str: The name of the function that was called.
- response: Optional[str]: The response of the function call. This is typically a stringified JSON object.
ToolCallRequest
A data class that represents a tool call request. It has the following attributes:
- id: str: The ID of the tool.
- function_call_request: FunctionCallRequest: The function call that the agent should execute using the tool.
ToolCallResponse
A data class that represents a tool call response. It has the following attributes:
- id: str: The ID of the tool.
- response: Optional[str]: The response of the tool call. This is typically a stringified JSON object.
ChatResponse
A data class that represents the response of the complete method. It has the following attributes:
- error: Optional[Exception]: An exception that occurred during the completion generation process.
- chat_completion: Optional[ChatCompletion]: The completion that the language model has generated.
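Since each streamed ChatResponse carries the accumulated completion, a consumer only needs to keep the latest one. The sketch below illustrates this with simplified stand-in types and a fake stream in place of an actual complete() call (not the SDK's implementation):

```python
# Sketch of consuming a streamed completion: stand-in ChatCompletion /
# ChatResponse types and a fake stream simulating an LLM. Each yielded
# response holds the accumulated text, so the last one is the full reply.
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, Optional


@dataclass
class ChatCompletion:
    content: str


@dataclass
class ChatResponse:
    error: Optional[Exception] = None
    chat_completion: Optional[ChatCompletion] = None


async def fake_complete_stream() -> AsyncIterator[ChatResponse]:
    # Simulates a model streaming "Hello world" with accumulation built in.
    accumulated = ""
    for token in ["Hello", " world"]:
        accumulated += token
        yield ChatResponse(chat_completion=ChatCompletion(content=accumulated))


async def main() -> str:
    final = ""
    async for response in fake_complete_stream():
        if response.error is not None:
            raise response.error
        final = response.chat_completion.content  # already accumulated
    return final


print(asyncio.run(main()))  # Hello world
```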