Welcome back!
In Chapter 3: Runnables & Chains, we learned to build Chains. A Chain is like a train on a track: it goes from Station A to Station B to Station C. It always does the same steps in the same order.
In Chapter 5: Retrieval (Documents & VectorStores), we gave our application access to external data ("Reading a book").
But what if we don't know the path ahead of time? Suppose one user asks "What is the weather in NY?" (which requires a weather API) and another asks "What is 5 times 5?" (which requires a calculator).
A hardcoded Chain cannot handle both. We need a "Decision Maker." We need an Agent.
An Agent uses the Language Model as a reasoning engine. Instead of just generating text, the model generates Actions.
Think of it like this:
To build an Agent, we need two main parts:
A Tool is just a Python function that connects to the real world. LangChain provides a simple decorator @tool to convert any function into a tool the AI can understand.
Let's create a simple tool that calculates the length of a word.
from langchain_core.tools import tool
# The docstring ("Calculates...") is CRITICAL.
# The AI reads it to know WHEN to use this tool.
@tool
def get_word_length(word: str) -> int:
    """Calculates the length of a word."""
    return len(word)
Explanation:
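- Decorator: We wrap the function with @tool so it can be offered to the model as a tool.
- Type hints: We declare the input type (word: str) so the AI knows what data to send.
Let's test the tool manually: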
# A tool is a Runnable, so we call it with .invoke() and a dict of arguments
print(get_word_length.invoke({"word": "LangChain"}))
# Output: 9
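To see why the docstring and type hints matter, here is a library-free sketch of the kind of information a decorator like @tool can read off a function. The helper describe_tool is hypothetical and is not LangChain's implementation; it only illustrates that the name, description, and argument types all come from the function itself.

```python
import inspect
from typing import get_type_hints


def describe_tool(func):
    """Collect the name, description, and argument types a model would be shown."""
    hints = get_type_hints(func)
    hints.pop("return", None)  # only the inputs matter to the caller
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "args": {name: hint.__name__ for name, hint in hints.items()},
    }


def get_word_length(word: str) -> int:
    """Calculates the length of a word."""
    return len(word)


spec = describe_tool(get_word_length)
print(spec)
```

If the docstring were missing, the description would be empty and the model would have little to go on when deciding whether the tool fits the question.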
Now that we have a tool, we need to tell our Chat Model about it. We use the .bind_tools() method.
Note: This requires a model that supports Tool Calling (like OpenAI GPT-3.5/4, Anthropic Claude, etc.).
from langchain_openai import ChatOpenAI
model = ChatOpenAI()
tools = [get_word_length]
# This creates a new version of the model that "knows" about the tools
model_with_tools = model.bind_tools(tools)
Explanation: The model now has a "virtual manual" of the tools available to it. It hasn't called them yet; it just knows they exist.
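As a rough picture of that "virtual manual", the sketch below writes out by hand a tool description in the JSON style used by OpenAI-compatible function calling. The exact payload depends on the provider and library version, so treat the dictionary as an approximation rather than what bind_tools literally sends.

```python
import json

# Hand-written approximation of the tool description sent alongside a request.
word_length_schema = {
    "type": "function",
    "function": {
        "name": "get_word_length",
        "description": "Calculates the length of a word.",
        "parameters": {
            "type": "object",
            "properties": {"word": {"type": "string"}},
            "required": ["word"],
        },
    },
}

print(json.dumps(word_length_schema, indent=2))
```

Nothing in this description is executable. It only tells the model which name to ask for and which arguments to supply.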
We have the Brain (Model) and the Hands (Tools). Now we need the Runtime to coordinate them.
If the model says, "I need to run get_word_length with input 'Hello'", something needs to actually run that Python function and feed the result back to the model. This is the job of the AgentExecutor.
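Done by hand, that coordination step might look like the sketch below. It uses plain Python with no LangChain imports; the tool_call dictionary is a hypothetical request shaped like the name/args/id records a tool-calling model returns, and run_tool_call is an illustrative helper, not a library function.

```python
def get_word_length(word: str) -> int:
    """Calculates the length of a word."""
    return len(word)


# Registry mapping tool names to the functions that implement them.
TOOLS = {"get_word_length": get_word_length}

# Hypothetical tool request, as a model might emit it.
tool_call = {"name": "get_word_length", "args": {"word": "Hello"}, "id": "call_1"}


def run_tool_call(call, registry):
    """Look up the requested tool, run it, and package the result for the model."""
    func = registry[call["name"]]
    result = func(**call["args"])
    # The id lets the model match this result to the request it made.
    return {"tool_call_id": call["id"], "content": str(result)}


tool_result = run_tool_call(tool_call, TOOLS)
print(tool_result)  # {'tool_call_id': 'call_1', 'content': '5'}
```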
LangChain provides pre-built functions to set this up.
# Note: on LangChain 1.x these two names are imported from langchain_classic.agents instead
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_core.prompts import ChatPromptTemplate
# 1. Define the Prompt (Standard format for agents)
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),  # Important! Memory for intermediate steps
])
# 2. Create the Agent (The logic)
# We pass the plain model: create_tool_calling_agent binds the tools itself
agent = create_tool_calling_agent(model, tools, prompt)
# 3. Create the Executor (The runtime)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
Explanation:
- {agent_scratchpad}: This is a special area in the prompt where the Agent writes down its previous actions (e.g., "I tried step 1, and the result was X. Now I need step 2").
- verbose=True: This lets us see the Agent's "thought process" in the console.
Let's ask a question that requires the tool.
response = agent_executor.invoke({"input": "How many letters in the word 'Supercalifragilistic'?"})
print(response["output"])
What happens in the console (Verbose mode):
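1. Thought: The question needs a word length, and there is a tool named get_word_length. I should call it.
2. Action: Call get_word_length("Supercalifragilistic").
3. Observation: 20 (hidden from user, seen by AI).
4. Final Answer: The model uses the observation to write its reply to the user.
If we ask a question without needing tools: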
agent_executor.invoke({"input": "Hi, how are you?"})
# The Agent skips the tools and answers directly.
How does the Agent know when to stop? How does it move between "Thinking" and "Acting"?
The AgentExecutor runs a loop until the Agent produces a "Final Answer" or a safety limit such as max_iterations is reached.
AgentAction
When the model decides to do something, it doesn't execute code immediately. It returns an object representing its intent.
File Reference: libs/core/langchain_core/agents.py
class AgentAction(Serializable):
    tool: str  # Name of tool to use (e.g., "get_word_length")
    tool_input: str | dict  # Input arguments (e.g., "Hello" or {"word": "Hello"})
    log: str  # The LLM's reasoning text
AgentFinish
When the model has gathered enough information, it returns an AgentFinish object instead of an Action.
File Reference: libs/core/langchain_core/agents.py
class AgentFinish(Serializable):
    return_values: dict  # The final answer for the user
    log: str  # Final reasoning log
AgentExecutor
The heart of the system is the _take_next_step method inside the executor, which delegates to the generator _iter_next_step. That generator checks if the model returned an Action or a Finish.
File Reference: libs/langchain/langchain_classic/agents/agent.py
# Simplified logic of AgentExecutor
def _iter_next_step(self, ...):
    # 1. Ask the agent/model what to do next
    output = self._action_agent.plan(intermediate_steps, ...)
    # 2. If it's a Final Answer, stop.
    if isinstance(output, AgentFinish):
        yield output
        return
    # 3. If it's an Action, execute the tool
    if isinstance(output, AgentAction):
        # Find the tool in our list
        tool = name_to_tool_map[output.tool]
        # Run it!
        observation = tool.run(output.tool_input)
        # Yield the result so it can be added to the next prompt
        yield AgentStep(action=output, observation=observation)
This loop allows the agent to be dynamic. It might take 1 step, or it might take 5 steps (e.g., "Search Google" -> "Read Link" -> "Summarize" -> "Translate" -> "Final Answer").
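The whole cycle can be imitated in plain Python without calling a real model. In this sketch, Action and Finish are simplified stand-ins for AgentAction and AgentFinish, scripted_planner is a hypothetical fake "model" that requests the tool once and then answers, and run_agent plays the role of the executor, including an iteration cap. None of these names come from LangChain.

```python
from dataclasses import dataclass


@dataclass
class Action:  # simplified stand-in for AgentAction
    tool: str
    tool_input: dict


@dataclass
class Finish:  # simplified stand-in for AgentFinish
    output: str


def get_word_length(word: str) -> int:
    return len(word)


def scripted_planner(question, steps):
    """Fake 'model': request the tool once, then answer from the observation."""
    if not steps:
        return Action("get_word_length", {"word": question})
    _, observation = steps[-1]
    return Finish(f"The word has {observation} letters.")


def run_agent(planner, tools, question, max_iterations=5):
    steps = []  # (action, observation) pairs, i.e. the scratchpad
    for _ in range(max_iterations):
        decision = planner(question, steps)
        if isinstance(decision, Finish):
            return decision.output, steps
        observation = tools[decision.tool](**decision.tool_input)
        steps.append((decision, observation))
    raise RuntimeError("Agent stopped: iteration limit reached")


answer, steps = run_agent(
    scripted_planner, {"get_word_length": get_word_length}, "Supercalifragilistic"
)
print(answer)  # The word has 20 letters.
```

The steps list is what a prompt's scratchpad would be filled from: each pass through the loop appends one action and its observation, and the planner sees the full history on the next call.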
In this chapter, we learned:
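- Agents use the Language Model as a reasoning engine that chooses Actions instead of following a fixed path.
- Tools are functions (like get_word_length) that we give to the Agent to interact with the world.
- The AgentExecutor is the runtime that runs the requested tools and loops until there is a final answer.
- Under the hood, the loop works with AgentAction (to request a tool run) and AgentFinish (to return the final answer).
The Missing Piece: We have built powerful applications now. But when an Agent is running a complex loop, or a Chain is processing thousands of documents, how do we know what is happening inside? How do we debug errors or stream tokens to the user in real-time?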
We need a monitoring system. We need Callbacks.