Building and Running Your First LangChain RAG Agent
In this tutorial, you will build and run your first Retrieval-Augmented Generation (RAG) agent using the LangChain framework. LangChain is a popular high-level framework for building language model applications, making it easier to compose components such as retrievers, prompts, and models.
Prerequisites
Before you begin, make sure you have completed the Getting Started with the Agents SDK tutorial.
Next, you will need to install the extra requirements for the LangChain framework. Run the following command in your terminal:
pip install zetaalpha.rag-agents[langchain]
Step 1: Creating a New Agent
Change to the <agents project> directory that you created in the Getting Started tutorial and run the following command to create a new agent:
rag_agents new "my_langchain_rag_agent"
This command will create a new agent file my_langchain_rag_agent.py in the <agents project> directory. The directory structure should now look like this:
<agents project>/
├── .gitignore
├── __init__.py
├── my_langchain_rag_agent.py
├── agent_setups.json
└── env/
    └── agent_setups.json
Step 2: Writing Your LangChain RAG Agent
Open the my_langchain_rag_agent.py file in your project directory and replace its content with the following code:
from typing import List, Optional

from langchain import hub
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI
from zav.agents_sdk import (
    ChatAgent,
    ChatAgentFactory,
    ChatMessage,
)
from zav.agents_sdk.adapters import ZAVLangchainStore


@ChatAgentFactory.register()
class LangChainRAG(ChatAgent):
    agent_name = "my_langchain_rag_agent"

    def __init__(
        self,
        zav_langchain_store: ZAVLangchainStore,
        client: ChatOpenAI,
    ):
        # Expose the Zeta Alpha document store as a LangChain retriever.
        self.retriever = zav_langchain_store.as_retriever()
        self.llm = client

    async def execute(
        self, conversation: List[ChatMessage]
    ) -> Optional[ChatMessage]:
        # Pull a standard RAG prompt from the LangChain Hub.
        prompt = hub.pull("rlm/rag-prompt")

        def format_docs(docs):
            return "\n\n".join(doc.page_content for doc in docs)

        # Compose retrieval, prompting, generation, and parsing with LCEL.
        rag_chain = (
            {
                "context": self.retriever | format_docs,
                "question": RunnablePassthrough(),
            }
            | prompt
            | self.llm
            | StrOutputParser()
        )
        # Answer the most recent user message in the conversation.
        answer_msg = await rag_chain.ainvoke(conversation[-1].content)
        return ChatMessage(
            sender="bot",
            content=answer_msg,
        )
The agent composes a chain of runnables: a retriever, a prompt, a language model, and an output parser. The retriever fetches relevant documents, the prompt assembles them with the user's question, the language model generates the answer, and the output parser extracts the response as plain text.
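To see how this LCEL composition works in isolation, here is a minimal sketch that wires up the same pattern with hypothetical stand-ins (fake_retrieve and fake_llm are placeholders for illustration, not part of the SDK or LangChain), so it runs without a vector store or an API key:

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnableLambda, RunnablePassthrough

def fake_retrieve(question: str) -> list:
    # Stand-in for the retriever: returns raw document strings.
    return ["Transformers use self-attention.", "They process tokens in parallel."]

def format_docs(docs):
    # Join the retrieved documents into a single context string.
    return "\n\n".join(docs)

def fake_llm(prompt_value):
    # Stand-in for the chat model: echo the fully rendered prompt.
    return prompt_value.to_string()

prompt = ChatPromptTemplate.from_template(
    "Answer using this context:\n{context}\n\nQuestion: {question}"
)

chain = (
    {"context": RunnableLambda(fake_retrieve) | format_docs, "question": RunnablePassthrough()}
    | prompt
    | RunnableLambda(fake_llm)
    | StrOutputParser()
)

print(chain.invoke("What is a transformer?"))

The dict at the head of the chain runs both branches on the incoming question: the retriever branch builds the context while RunnablePassthrough forwards the question unchanged into the prompt.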
Step 3: Running Your Agent Locally
Running in the UI
You can run and test your agent in the UI with the following command:
rag_agents dev --reload
Here you can chat with your agent and test its functionality.
Running as an API
Alternatively, you can serve your agent as an API by running the following command:
rag_agents serve --reload
To test your agent, you can use the Swagger UI available at http://localhost:8000/docs or send a POST request to the /v1/chats/responses endpoint. Here's an example using curl:
curl -X POST "http://localhost:8000/v1/chats/responses?tenant=zetaalpha" \
-H "Content-Type: application/json" \
-d '{
"agent_identifier": "my_langchain_rag_agent",
"conversation": [
{"sender": "user", "content": "What is a transformer?"}
]
}'
You should receive a response from your agent with the answer to your query.
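If you prefer to call the API from Python, here is an equivalent sketch using the requests library, assuming the server from the previous command is running locally:

import requests

# POST the conversation to the locally running agent API,
# equivalent to the curl command above.
response = requests.post(
    "http://localhost:8000/v1/chats/responses",
    params={"tenant": "zetaalpha"},
    json={
        "agent_identifier": "my_langchain_rag_agent",
        "conversation": [{"sender": "user", "content": "What is a transformer?"}],
    },
)
response.raise_for_status()
print(response.json())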