Function call request
Prerequisites
Make sure you have completed the Getting Started with the Chat API tutorial.
Handling a function call request
Some agents may return a function call request that must be executed by the API client. For example, the built-in agent verbose_qa_with_dynamic_retrieval, which can be enabled for your tenant upon request, can suggest a search request for the API client to perform in order to retrieve new context before answering the user's question.
For example, this functionality could support the following use case:
- The user asks a follow-up question about CLIP.
- The agent realizes that the current context does not have any information about CLIP and that a new search should be performed.
- The agent suggests a search function call request to be performed by the API client, in order to retrieve a new set of documents.
- The API client performs the search request to retrieve the new context, displays it to the user, and calls the chat API again with the updated context.
- The agent now responds to the user's question based on the updated context.
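The steps above can be sketched as a small client-side loop. In this sketch, `call_chat_api` and `perform_search` are hypothetical placeholders standing in for the real API calls shown later in this tutorial; only the control flow mirrors the steps, not any actual endpoint:

```python
def answer_with_dynamic_retrieval(conversation, document_ids,
                                  call_chat_api, perform_search):
    """Call the chat API; if the agent suggests a search, run it and call again.

    `call_chat_api` and `perform_search` are caller-supplied functions
    (hypothetical placeholders for the real HTTP calls).
    """
    # First attempt with the current document context.
    message = call_chat_api(conversation, document_ids)

    request = message.get("function_call_request")
    if request and request["name"] == "document_search":
        # The agent wants fresh context: execute the suggested search ...
        new_document_ids = perform_search(request["params"])
        # ... and ask again with the updated document context.
        message = call_chat_api(conversation, new_document_ids)

    return message
```

The single round of re-retrieval matches the flow in this tutorial; a client could also loop until no further function call is requested.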
Below you can see how a search function call request is returned. The example uses the streaming API, but the functionality is the same for the REST endpoint. Note that the full function call request is streamed in a single event, so the client never receives partial JSON params.
import json
import os

import requests
import sseclient

TENANT = "zetaalpha"
CHAT_STREAMING_ENDPOINT = (
    f"https://api.zeta-alpha.com/v0/service/chat/stream?tenant={TENANT}"
)

headers = {
    "accept": "text/event-stream",
    "Content-Type": "application/json",
    "x-auth": os.getenv("ZETA_ALPHA_API_KEY"),
}

response = requests.post(
    CHAT_STREAMING_ENDPOINT,
    headers=headers,
    json={
        "conversation_context": {
            "document_context": {
                "document_ids": [
                    "73effa5a188b69d32b5889a5ed564db5b66aeeb6_0",
                    "6277484c482c12e80166b3388ef8069b088dfcb6_0",
                    "6c42c17b131d886f0ccf4897d055e42580574240_0",
                    "df40f22694ea7515ef8cd321d877e54c30d336ca_0",
                    "bea6364917260019f43a72a3906d9a417029b9be_0",
                ],
                "retrieval_unit": "document",
            }
        },
        "conversation": [
            {
                "sender": "user",
                "content": "What is BERT?",
            },
            {
                "sender": "bot",
                "content": "BERT stands for Bidirectional Encoder Representations from Transformers. It is a transformer-based language model developed by Google and released in late 2018. BERT represents a significant advancement in natural language processing (NLP) due to its ability to understand the context of words in relation to all other words in a sentence, rather than just the words that come before or after them. This bidirectional approach allows BERT to capture the full context of a word, making it particularly effective for various language tasks.\n\nBERT is pre-trained on a large corpus of text and can be fine-tuned for specific tasks such as sentiment analysis, question answering, and named entity recognition. The model consists of multiple encoder layers and self-attention heads, which enable it to process and generate contextualized word embeddings. Fine-tuning BERT for specific tasks typically involves adding a small classification layer on top of the model, allowing it to adapt to the requirements of the task at hand <sup>4</sup><sup>5</sup>.",
                "evidences": [
                    {
                        "document_hit_url": "/documents/document/list?tenant=zetaalpha&index_cluster=default:None&property_name=id&property_values=df40f22694ea7515ef8cd321d877e54c30d336ca_0",
                        "text_extract": "What is BERT? <b>In 2018 Google developed a transformer-based NLP pretraining model called BERT or Bidirectional Encoder Representations from Transformers.</b> It is nothing but a Transformer language model with multiple encoder layers and self-attention heads.",
                        "anchor_text": "<sup>4</sup>",
                    },
                    {
                        "document_hit_url": "/documents/document/list?tenant=zetaalpha&index_cluster=default:None&property_name=id&property_values=bea6364917260019f43a72a3906d9a417029b9be_0",
                        "text_extract": "Breaking BERT Down\nShreya Ghelani\nBreaking BERT Down BERT is short for Bidirectional Encoder Representations from Transformers. <b>It is a new type of language model developed and released by Google in late 2018.</b> Pre-trained language models like BERT play…\nAt the output, the token representations are fed into an output layer for token level tasks, such as sequence tagging or question answering, and the [CLS] representation is fed into an output layer for classification, such as entailment or sentiment analysis.",
                        "anchor_text": "<sup>5</sup>",
                    },
                ],
            },
            {
                "sender": "user",
                "content": "What is CLIP?",
            },
        ],
        "agent_identifier": "verbose_qa_with_dynamic_retrieval",
    },
    stream=True,
)
response.raise_for_status()

client = sseclient.SSEClient(response)
for event in client.events():
    try:
        streamed_data = json.loads(event.data)
        print(f"Data stream: {streamed_data}")
    except Exception:
        print(f"Data stream error: {event.data}")
        streamed_data = None

# The last streamed event contains the complete message.
if streamed_data:
    print("\n---------------- COMPLETE MESSAGE ----------------")
    print(f"Message:\n{streamed_data['content']}\n")
    print(f"Evidences:\n{streamed_data['evidences']}\n")
    print(f"Function Call:\n{streamed_data['function_call_request']}\n")
    print("--------------------------------------------------")
Sample final output:
...
---------------- COMPLETE MESSAGE ----------------
Message:
Evidences:
None
Function Call:
{'name': 'document_search', 'params': {'search_engine': 'zeta_alpha', 'retrieval_method': 'mixed', 'query_string': 'What is CLIP?', 'document_type': ['document']}}
--------------------------------------------------
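The returned function call is a plain dict, so the client can validate it before issuing the follow-up search. Below is a minimal sketch of such a check; the field names match the sample output above, but the helper itself (`parse_search_request`) and its defaults are illustrative, not part of the API:

```python
def parse_search_request(function_call_request):
    """Sanity-check a function call request and extract search parameters.

    Returns None when there is nothing to execute, the extracted search
    parameters for a document_search request, and raises for any other
    (unsupported) function name.
    """
    if function_call_request is None:
        return None  # the agent answered directly; nothing to execute
    if function_call_request.get("name") != "document_search":
        raise ValueError(
            f"unsupported function call: {function_call_request.get('name')}"
        )
    params = function_call_request["params"]
    return {
        "query_string": params["query_string"],
        "retrieval_method": params.get("retrieval_method", "mixed"),
        "document_type": params.get("document_type", ["document"]),
    }


# The function call request from the sample output above.
example = {
    "name": "document_search",
    "params": {
        "search_engine": "zeta_alpha",
        "retrieval_method": "mixed",
        "query_string": "What is CLIP?",
        "document_type": ["document"],
    },
}
print(parse_search_request(example)["query_string"])  # What is CLIP?
```

After extracting the parameters, the client would run the search, collect the resulting document IDs, and call the chat API again with the updated document_context, as described in the steps above.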