Agents SDK API Reference

This reference provides comprehensive documentation of the Agents SDK's API, including base classes, methods, parameters, domain models, and built-in implementations.

Base Classes

ChatAgent

Source

The ChatAgent class is the main interface that developers need to implement in order to build an agent. It has a single abstract method execute that takes a conversation (list of ChatMessage) and returns a response (single ChatMessage).

Attributes

  • agent_name: A class variable that should be set to the name of the agent.
  • debug_backend: An optional callable that can be used for debugging.
  • tool_registry: An instance of the ToolsRegistry class that can be used for registering and retrieving tools.

Methods

  • debug(msg: Any) -> Any: A method for logging debug messages. When called, it will send the message to the debug_backend if it is set.
  • execute(conversation: List[ChatMessage]) -> Optional[ChatMessage]: This abstract method must be implemented by the agent. It takes a list of ChatMessage objects representing the conversation and returns a single ChatMessage object as the response. The code can use any dependencies injected into the __init__ method of the concrete agent class.
tip

Detailed information on what arguments are passed to the __init__ method of the concrete agent class can be found in the How to Configure Agents guide.
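
To make the contract concrete, here is a self-contained sketch that mirrors the documented interface with simplified stand-ins (the real ChatAgent and ChatMessage live in the SDK and carry more attributes; the import paths and the EchoAgent class are illustrative, not part of the SDK):

```python
import abc
from dataclasses import dataclass
from typing import List, Optional

# Simplified stand-ins for the SDK classes, for illustration only.
@dataclass
class ChatMessage:
    sender: str
    content: str

class ChatAgent(abc.ABC):
    agent_name: str = ""

    @abc.abstractmethod
    def execute(self, conversation: List[ChatMessage]) -> Optional[ChatMessage]:
        ...

# A concrete agent sets agent_name and implements execute.
class EchoAgent(ChatAgent):
    agent_name = "echo_agent"

    def execute(self, conversation: List[ChatMessage]) -> Optional[ChatMessage]:
        if not conversation:
            return None
        # Echo the latest user message back as the bot response.
        return ChatMessage(sender="BOT", content=conversation[-1].content)

reply = EchoAgent().execute([ChatMessage(sender="USER", content="hi")])
```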

StreamableChatAgent

Source

The StreamableChatAgent class extends ChatAgent and adds support for streaming responses. It has an additional abstract method execute_streaming that returns an asynchronous generator of ChatMessage objects.

Methods

  • execute_streaming(conversation: List[ChatMessage]) -> AsyncGenerator[ChatMessage, None]: This abstract method must be implemented by the agent. It takes a list of ChatMessage objects representing the conversation and returns an asynchronous generator of ChatMessage objects as the response. The code can use any dependencies injected into the __init__ method of the concrete agent class.
tip

The StreamableChatAgent class includes an implementation of the execute method that internally calls execute_streaming and collects the results into a list before returning them. This allows agents that only need to implement execute_streaming to work as non-streaming agents as well.
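
The collection behavior described in the tip can be pictured with a small self-contained sketch (the ChatMessage stand-in and function names are illustrative, not the SDK's actual implementation):

```python
import asyncio
from typing import AsyncGenerator, List

# Stand-in for the SDK's ChatMessage (the real model has more fields).
class ChatMessage:
    def __init__(self, sender: str, content: str):
        self.sender = sender
        self.content = content

# A streaming implementation yields partial messages as they are produced...
async def execute_streaming(
    conversation: List[ChatMessage],
) -> AsyncGenerator[ChatMessage, None]:
    for word in ("Hello", "world"):
        yield ChatMessage(sender="BOT", content=word)

# ...and a non-streaming caller can still consume it by draining the
# generator, which is what StreamableChatAgent.execute does internally.
async def execute(conversation: List[ChatMessage]) -> List[ChatMessage]:
    return [chunk async for chunk in execute_streaming(conversation)]

collected = asyncio.run(execute([]))
```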

AgentDependencyFactory

Source

The AgentDependencyFactory class is a base class for factories that create injectable dependencies for agents. It provides a create method that should be implemented by subclasses to create the required dependency object.

Methods

  • create(*args, **kwargs) -> T: This abstract method must be implemented by subclasses to create the required dependency object based on the provided configuration parameters. The return type T should be the type of the dependency object. The method can take any number of positional and keyword arguments as needed.
info

Detailed information on what arguments are passed to the create method can be found in the How to Create Injectable Dependencies guide.

Class Registration

ChatAgentFactory

Source

The ChatAgentFactory class is a factory for registering implementations of ChatAgent or StreamableChatAgent classes and creating instances of them based on configuration parameters. The create method is called inside the REST API handlers to create the required agent instance.

Methods

  • register(): A decorator for registering agent class implementations with the factory.
warning

All agent classes must be registered with the factory using the register decorator in order for them to be available in the REST API. For example:

@ChatAgentFactory.register()
class MyAgent(ChatAgent):
    ...

AgentDependencyRegistry

Source

The AgentDependencyRegistry class is used for registering implementations of AgentDependencyFactory classes or instances via its register method. It is used by the dependency injection system to create and inject dependencies into agent instances.

Methods

  • register(inst_or_cls: Union[Type[AgentDependencyFactory], AgentDependencyFactory]): A method for registering implementations of AgentDependencyFactory classes or instances. It takes either a class or an instance of a factory and registers it for creating dependencies.
tip

Example usage:

class MyDependencyFactory(AgentDependencyFactory):
    def create(self, param1: str, param2: int) -> MyDependency:
        return MyDependency(param1, param2)

AgentDependencyRegistry.register(MyDependencyFactory)

Domain Models

ToolsRegistry

Source

The ToolsRegistry class is used for registering and managing tools (functions) that can be used by agents and completion clients. This class provides methods to add tools and retrieve them by name.

  • tools_index: A dictionary that stores the registered tools. The keys are the names of the tools, and the values are the Tool objects.
  • add(executable: Callable, name: Optional[str] = None, description: Optional[str] = None): Registers a tool with the registry.
    • executable: The callable (function) to register as a tool.
    • name: An optional name for the tool. If not provided, the qualified name of the callable will be used.
    • description: An optional description for the tool. If not provided, the docstring of the callable will be used.
info

Example usage:

def my_tool(param1: str, param2: int) -> str:
    """
    This is an example tool that takes two parameters and returns a string.
    """
    return f"param1: {param1}, param2: {param2}"

tools_registry = ToolsRegistry()
tools_registry.add(executable=my_tool, name="example_tool", description="An example tool.")

Tool

Source

The Tool class represents a tool that can be used by agents and completion clients. It contains the following fields:

  • name: The name of the tool.
  • description: A description of the tool.
  • executable: The callable (function) that implements the tool.
  • get_parameters_spec() -> Dict[str, Any]: Returns a JSON schema of the parameters of the tool. The schema is extracted from the parameters and type hints of the executable function. It supports primitive types as well as Pydantic BaseModel subclasses.
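
To give a rough idea of the kind of schema get_parameters_spec produces, here is a simplified, hypothetical sketch of such a derivation from type hints (the SDK's actual implementation also understands Pydantic BaseModel annotations):

```python
import inspect
from typing import Any, Callable, Dict

# Minimal mapping from Python primitives to JSON-schema type names.
PRIMITIVE_TO_JSON = {str: "string", int: "integer", float: "number", bool: "boolean"}

def parameters_spec(fn: Callable) -> Dict[str, Any]:
    properties: Dict[str, Any] = {}
    required = []
    for name, param in inspect.signature(fn).parameters.items():
        # Map each annotated parameter to a JSON-schema property.
        properties[name] = {"type": PRIMITIVE_TO_JSON.get(param.annotation, "object")}
        # Parameters without a default value are required.
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {"type": "object", "properties": properties, "required": required}

def my_tool(param1: str, param2: int = 0) -> str:
    return f"param1: {param1}, param2: {param2}"

spec = parameters_spec(my_tool)
# spec == {"type": "object",
#          "properties": {"param1": {"type": "string"},
#                         "param2": {"type": "integer"}},
#          "required": ["param1"]}
```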

ChatMessage

Source

The ChatMessage class represents a message in a conversation. It contains the following fields:

  • sender: The sender of the message (ChatMessageSender).
  • content: The content of the message.
  • image_uri: An optional URI of an image associated with the message. This requires the agent to use a multimodal model.
  • evidences: An optional list of ChatMessageEvidence objects. Agents can use this field to provide evidence for their responses.
  • function_call_request: An optional FunctionCallRequest object. Some agents may need the client to call a function on their behalf. This object can be used to specify the function to call and its parameters.
  • function_specs: An optional FunctionSpec object. It contains the specification of the function call returned in function_call_request.

ChatMessageSender

The ChatMessageSender enum represents the sender of a message. It has the following values:

  • USER: The message was sent by the user.
  • BOT: The message was sent by the bot.

ChatMessageEvidence

The ChatMessageEvidence class represents evidence associated with a message. It contains the following fields:

  • document_hit_url: The URL of the document that contains the evidence. Calling this URL should return the document or chunk along with its metadata, content, bounding boxes (for the case of document chunks), etc.
  • text_extract: An optional text extract from the document that contains the evidence.
  • anchor_text: An optional anchor text that can be used to link the evidence to the agent message.

FunctionCallRequest

The FunctionCallRequest class represents a request to call a function. It contains the following fields:

  • name: The name of the function to call.
  • params: An optional dictionary of parameters to pass to the function.

FunctionSpec

The FunctionSpec class represents the specification of a function. The parameters of the function are described using the OpenAPI 3.0 schema. It contains the following fields:

  • name: The name of the function.
  • description: A description of the function.
  • parameters: A dictionary of parameters for the function.

The following is an example of a FunctionSpec object:

{
  "name": "get_weather",
  "description": "Get the current weather for a location.",
  "parameters": {
    "type": "object",
    "properties": {
      "location": {
        "type": "string",
        "description": "The location for which to get the weather."
      }
    },
    "required": ["location"]
  }
}

AgentSetup

Source

The AgentSetup class represents the setup configuration for an agent. During local development, this is read from a JSON file. For more details on this file and its relation to the AgentSetup class, see the How to Configure Agents guide. The class contains the following fields:

  • agent_identifier: A unique identifier for the agent. This is the value that is used in the API to select the agent.
  • agent_name: The name of the agent. This should match the agent_name attribute in the ChatAgent class.
  • llm_client_configuration: An optional configuration for the language model client used by the agent (LLMClientConfiguration).
  • agent_configuration: An optional custom configuration for the agent. This can include any key-value pairs that the agent needs.
  • sub_agent_mapping: An optional mapping of sub-agent names to sub-agent identifiers. This is used to assign different configurations to the nested agents.
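
For orientation, a minimal setup entry might look like the following. All values here are illustrative, and the agent_configuration keys are hypothetical; see the How to Configure Agents guide for the authoritative file format:

```json
{
  "agent_identifier": "qa_agent_openai",
  "agent_name": "qa_agent",
  "agent_configuration": {
    "max_evidence": 5
  }
}
```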

LLMClientConfiguration

Source

The LLMClientConfiguration class represents the configuration for the language model client used by an agent. It contains the following fields:

  • vendor: The vendor of the language model (LLMProviderName).
  • vendor_configuration: Vendor-specific configuration options (LLMVendorConfiguration).
  • model_configuration: Configuration for the language model (LLMModelConfiguration).

LLMProviderName

The LLMProviderName enum represents the vendor of the language model. It has the following values:

  • OPENAI: The language model is provided by OpenAI.
  • ANTHROPIC: The language model is provided by Anthropic.

LLMVendorConfiguration

The LLMVendorConfiguration class represents vendor-specific configuration selectors for the language model. It mirrors LLMProviderName and contains the following fields:

  • openai: An optional configuration for OpenAI (OpenAIConfiguration).
  • anthropic: An optional configuration for Anthropic (AnthropicConfiguration).

OpenAIConfiguration

The OpenAIConfiguration class represents the configuration for the OpenAI language model. It contains the following fields:

  • openai_api_key: The API key for the OpenAI language model (EncryptedStr).
  • openai_org: The organization ID for the OpenAI language model (EncryptedStr).
  • openai_api_type: The type of the OpenAI API (optional).
  • openai_api_base: The base URL for the OpenAI API (optional).
  • openai_api_version: The version of the OpenAI API (optional).

AnthropicConfiguration

The AnthropicConfiguration class represents the configuration for the Anthropic language model. It contains the following fields:

  • anthropic_api_key: The API key for the Anthropic language model (EncryptedStr).
  • anthropic_api_type: The type of the Anthropic API (optional). For example, "bedrock" to use the model via AWS Bedrock.
  • anthropic_api_base: The base URL for the Anthropic API (optional).
  • aws_secret_key: The AWS secret key for the Anthropic language model (EncryptedStr). This is required if anthropic_api_type is set to "bedrock".
  • aws_access_key: The AWS access key for the Anthropic language model (EncryptedStr). This is required if anthropic_api_type is set to "bedrock".
  • aws_region: The AWS region for the Anthropic language model (optional). This is required if anthropic_api_type is set to "bedrock".

LLMModelConfiguration

The LLMModelConfiguration class represents the configuration for the language model. It contains the following fields:

  • name: The name of the language model.
  • type: The type of the language model (LLMModelType).
  • temperature: The temperature for sampling from the language model.
  • json_output: Whether to output the response in JSON format.
  • max_tokens: The maximum number of tokens to generate in the response.

LLMModelType

The LLMModelType enum represents the type of the language model. It has the following values:

  • CHAT: A chat-based language model.
  • PROMPT: A language model that generates the completion of a prompt.
  • PROMPT_WITH_LOGITS: A language model that generates the completion of a prompt with logits.

EncryptedStr

The EncryptedStr class is a mechanism for preventing accidental leakage of secrets through logs or exception dumps. It is a subclass of str that checks whether the given string is encrypted cipher text or a plain-text secret. If it is cipher text, the string object is initialized with it; this is normally the case when secrets are stored in a database. If it is a plain-text secret, the string object is initialized with an empty value and the secret is stored as a private attribute.

Once the EncryptedStr object reaches the point where the plain secret is needed, call the get_unencrypted_secret method, which returns the plain-text secret.
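
The leak-prevention idea can be pictured with a simplified stand-in (illustrative only; the real class also detects and decrypts cipher text):

```python
# Hypothetical stand-in for the SDK's EncryptedStr, for illustration.
class EncryptedStr(str):
    def __new__(cls, value: str, *, is_cipher: bool = False):
        # Cipher text is safe to expose; a plain secret is replaced by "".
        obj = super().__new__(cls, value if is_cipher else "")
        obj._secret = None if is_cipher else value
        return obj

    def get_unencrypted_secret(self) -> str:
        # The real method would decrypt stored cipher text here.
        return self._secret if self._secret is not None else str(self)

key = EncryptedStr("sk-plain-secret")
log_line = f"API key: {key}"           # the secret does not leak into logs
secret = key.get_unencrypted_secret()  # the plain secret, on demand
```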

Built-in Implementations

ZAVRetriever

Source

The ZAVRetriever class is a wrapper around the Zeta Alpha Search API. It provides methods for searching and retrieving documents.

Methods

  • search(...) -> Dict: Searches for documents based on the provided parameters. The full parameter list documentation can be found in the Document Search API documentation. A simple example of its usage is in the Building and Running Your First RAG Agent tutorial.
  • retrieve(document_hit_url: str) -> Optional[Dict]: Retrieves a document based on the provided document hit URL. The document hit URL is one of the parameters returned by the search method. It is also returned by the agent in the ChatMessageEvidence object.
  • list(retrieval_unit: Literal["document", "chunk"] = "document", property_name: str = "id", property_values: List[str] = [], index_id: Optional[str] = None) -> Dict: Lists documents or chunks based on the provided parameters. For example, to find a chunk by its ID, you could use list(retrieval_unit="chunk", property_name="id", property_values=["the_chunk_id"]).

ZAVLangchainStore

Source

The ZAVLangchainStore class implements a LangChain document store that provides a LangChain document retriever ZAVLangchainRetriever.

Methods

  • as_retriever(...) -> ZAVLangchainRetriever: The parameters are the same as the search method of the ZAVRetriever class, except for the query_string parameter. The query_string parameter is instead passed into the ZAVLangchainRetriever via the usual LangChain retriever methods.
note

Extra arguments passed to as_retriever are propagated to LangChain's BaseRetriever class.

ZAVLangchainRetriever

Source

The ZAVLangchainRetriever class is an implementation of LangChain's BaseRetriever class. It provides the ability to retrieve content from the Zeta Alpha platform. Refer to LangChain's documentation for more information on how to use the BaseRetriever class.

ZAVChatCompletionClient

Source

The ZAVChatCompletionClient class is an implementation of a chat completion client that interacts with the Zeta Alpha API. It provides methods for generating chat completions.

Methods

  • complete(messages: Optional[List[ChatMessage]] = None, completions: Optional[List[ChatCompletion]] = None, max_tokens: int = 2048, bot_setup_description: Optional[str] = None, functions: Optional[List[Dict]] = None, tools: Optional[List[Dict]] = None, tool_choice: Optional[str] = None, stream: Literal[True] = True) -> Union[AsyncIterator[ChatResponse], ChatResponse]: Generates chat completions based on the provided parameters.
    • messages: A list of chat messages that have been exchanged between the user and the bot. This is the same type as the conversation parameter in the ChatAgent.execute method.
    • completions: The history of completions that have been exchanged between the agent and the underlying language model. You should only use one of messages or completions at a time. completions has a type that is closer to what the language model expects. Sometimes, you may want the agent to have an "internal" conversation with the language model, and in that case, you can use completions. For example, you may want the language model to decide which internal function the agent should execute next, without communicating this to the user.
    • max_tokens: The maximum number of tokens that the language model should generate.
    • bot_setup_description: This is mapped to the "system message" or equivalent in the language model. It is a message that is sent to the language model at the beginning of the conversation. It can be used to set up the language model's internal state and behavior.
    • functions: A list of functions that the language model could choose for the agent to execute. The schema of the dictionary is the same as the FunctionSpec type. You should only pass one of functions or tools at a time.
    • tools: An instance of ToolsRegistry which informs the LLM what tools it can use. The LLM will decide which tools to use based on the context. Under the hood, the ToolsRegistry converts the tools into a list of schemas that the LLM understands. You can also pass the tools directly as a list of dictionaries. The schema of each element is
      {
          "type": str,       # "function" is the only supported type
          "function": FunctionSpec
      }
    • tool_choice: This parameter is used to control how the language model chooses between the tools that are provided in the tools parameter. The possible values are:
      • "auto": The language model will automatically choose which tool to use.
      • "required": The language model will always choose one of the tools provided in the tools parameter.
      • "none": The language model will not choose any tool and will only generate a user-facing message.
      • {"type": "function", "function": {"name": "<tool name>"}}: The language model will always choose the tool with the specified name "<tool name>".
    • stream: If True, the method returns an asynchronous iterator that yields ChatResponse objects. Otherwise, the method returns a single ChatResponse object.
    • execute_tools: (Only used when ToolsRegistry is passed to tools) A boolean flag that indicates whether the client should execute the LLM-selected tools right away. After the tools are executed, the LLM will be called again with the tools' outputs and finally return the response to the agent. The default value is False. When disabled, the agent is responsible for executing the tools and returning the response back to the client.
info

When streaming is enabled, the ChatResponse object contains the accumulated completions generated by the language model. The agent code can use these directly without having to implement the accumulation logic.
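
For illustration, the tools payload can also be built by hand following the element schema above (this reuses the get_weather spec from the FunctionSpec section; most agents would pass a ToolsRegistry instead):

```python
# Hand-built `tools` payload following the documented element schema:
# {"type": "function", "function": FunctionSpec}.
get_weather_spec = {
    "name": "get_weather",
    "description": "Get the current weather for a location.",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {"type": "string"},
        },
        "required": ["location"],
    },
}

tools = [{"type": "function", "function": get_weather_spec}]

# Force the model to always call get_weather (one of the documented
# tool_choice values).
tool_choice = {"type": "function", "function": {"name": "get_weather"}}
```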

Types

ChatCompletion

A data class that represents a completion that the language model has generated. It has the following attributes:

  • sender: ChatCompletionSender: The sender of the completion. The possible values are:
    • "user": The completion was sent by the user.
    • "bot": The completion was generated by the bot.
    • "function": The completion is the response of a function call that the agent has executed.
    • "tool": The completion is the response of a tool call that the agent has executed.
  • content: str: The content of the completion.
  • image_url: Optional[str]: The Data URL of an image that is associated with the completion.
  • function_call_request: Optional[FunctionCallRequest]: The function call that the language model has chosen for the agent to execute.
  • function_call_response: Optional[FunctionCallResponse]: The function call response that the agent has received after executing the function call request.
  • tool_call_requests: Optional[List[ToolCallRequest]]: The tool calls that the language model has chosen for the agent to use.
  • tool_call_responses: Optional[List[ToolCallResponse]]: The tool call responses that the agent has received after using the tools.

FunctionCallResponse

A data class that represents a function call response. It has the following attributes:

  • name: str: The name of the function that was called.
  • response: Optional[str]: The response of the function call. This is typically a stringified JSON object.

ToolCallRequest

A data class that represents a tool call request. It has the following attributes:

  • id: str: The ID of the tool call.
  • function_call_request: FunctionCallRequest: The function call that the agent should execute using the tool.

ToolCallResponse

A data class that represents a tool call response. It has the following attributes:

  • id: str: The ID of the tool call, matching the ID of the corresponding ToolCallRequest.
  • response: Optional[str]: The response of the tool call. This is typically a stringified JSON object.

ChatResponse

A data class that represents the response of the complete method. It has the following attributes:

  • error: Optional[Exception]: An exception that occurred during the completion generation process.
  • chat_completion: Optional[ChatCompletion]: The completion that the language model has generated.