Building and Running Your First LangChain RAG Agent
In this tutorial, you will build and run your first Retrieval-Augmented Generation (RAG) agent using the LangChain framework. LangChain is a popular high-level framework for building language model applications, making it easier to compose components such as retrievers, prompts, and models.
Prerequisites
Before you begin, make sure you have completed the Getting Started with the Agents SDK tutorial.
Next, you will need to install the extra requirements for the LangChain framework. Run the following command in your terminal:
pip install zetaalpha.rag-agents[langchain]
Step 1: Creating a New Agent
Change to the <agents project> directory that you created in the Getting Started tutorial and run the following command to create a new agent:
rag_agents new "my_langchain_rag_agent"
This command will create a new agent file my_langchain_rag_agent.py in the <agents project> directory. The directory structure should now look like this:
<agents project>/
├── .gitignore
├── __init__.py
├── my_langchain_rag_agent.py
├── agent_setups.json
└── env/
    └── agent_setups.json
Step 2: Writing Your LangChain RAG Agent
Open the my_langchain_rag_agent.py file in your project directory and replace its content with the following code:
from typing import List, Optional

from langchain import hub
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI
from zav.agents_sdk import (
    ChatAgent,
    ChatAgentFactory,
    ChatMessage,
)
from zav.agents_sdk.adapters import ZAVLangchainStore


@ChatAgentFactory.register()
class LangChainRAG(ChatAgent):
    agent_name = "my_langchain_rag_agent"

    def __init__(
        self,
        zav_langchain_store: ZAVLangchainStore,
        client: ChatOpenAI,
    ):
        # Expose the Zeta Alpha document store as a LangChain retriever.
        self.retriever = zav_langchain_store.as_retriever()
        self.llm = client

    async def execute(
        self, conversation: List[ChatMessage]
    ) -> Optional[ChatMessage]:
        # Pull a standard RAG prompt from the LangChain Hub.
        prompt = hub.pull("rlm/rag-prompt")

        def format_docs(docs):
            return "\n\n".join(doc.page_content for doc in docs)

        # Compose retrieval, prompting, generation, and parsing with LCEL.
        rag_chain = (
            {
                "context": self.retriever | format_docs,
                "question": RunnablePassthrough(),
            }
            | prompt
            | self.llm
            | StrOutputParser()
        )
        # Answer the most recent user message in the conversation.
        answer_msg = await rag_chain.ainvoke(conversation[-1].content)
        return ChatMessage(
            sender="bot",
            content=answer_msg,
        )
The agent composes a chain of runnables: a retriever, a prompt, a language model, and an output parser. The retriever fetches relevant documents, the prompt assembles them with the user's question, the language model generates the answer, and the output parser extracts the response as plain text.
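To see how this LCEL composition works in isolation, here is a minimal sketch that wires up the same pattern with hypothetical stand-ins (fake_retrieve and fake_llm are placeholders for illustration, not part of the SDK or LangChain), so it runs without a vector store or an API key:

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnableLambda, RunnablePassthrough

def fake_retrieve(question: str) -> list:
    # Stand-in for the retriever: returns raw document strings.
    return ["Transformers use self-attention.", "They process tokens in parallel."]

def format_docs(docs):
    # Join the retrieved documents into a single context string.
    return "\n\n".join(docs)

def fake_llm(prompt_value):
    # Stand-in for the chat model: echo the fully rendered prompt.
    return prompt_value.to_string()

prompt = ChatPromptTemplate.from_template(
    "Answer using this context:\n{context}\n\nQuestion: {question}"
)

chain = (
    {"context": RunnableLambda(fake_retrieve) | format_docs, "question": RunnablePassthrough()}
    | prompt
    | RunnableLambda(fake_llm)
    | StrOutputParser()
)

print(chain.invoke("What is a transformer?"))

The dict at the head of the chain runs both branches on the incoming question: the retriever branch builds the context while RunnablePassthrough forwards the question unchanged into the prompt.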
Step 3: Running Your Agent Locally
Running in the UI
You can run and test your agent in the UI with the following command:
rag_agents dev --reload
Here you can chat with your agent and test its functionality.
Running as an API
Alternatively, you can serve your agent as an API by running the following command:
rag_agents serve --reload
To test your agent, you can use the Swagger UI available at http://localhost:8000/docs or send a POST request to the /v1/chats/responses endpoint. Here's an example using curl:
curl -X POST "http://localhost:8000/v1/chats/responses?tenant=zetaalpha" \
-H "Content-Type: application/json" \
-d '{
"agent_identifier": "my_langchain_rag_agent",
"conversation": [
{"sender": "user", "content": "What is a transformer?"}
]
}'
You should receive a response from your agent with the answer to your query.
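If you prefer to call the API from Python, here is an equivalent sketch using the requests library, assuming the server from the previous command is running locally:

import requests

# POST the conversation to the locally running agent API,
# equivalent to the curl command above.
response = requests.post(
    "http://localhost:8000/v1/chats/responses",
    params={"tenant": "zetaalpha"},
    json={
        "agent_identifier": "my_langchain_rag_agent",
        "conversation": [{"sender": "user", "content": "What is a transformer?"}],
    },
)
response.raise_for_status()
print(response.json())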