An AI Agent with Memory & Tools, in LangChain
The smallest complete example of a real agent: an LLM that can call your tools and remembers the conversation across turns. The three pieces, why each matters, and the LangChain code that wires them together.
Strip away the hype and an “AI agent” is three parts bolted together: a language model that decides what to do, a set of tools it's allowed to call, and a memory so it doesn't forget what just happened. Here is the smallest version that actually works — built with LangChain, end to end.
The three pieces
Before any code, the mental model. An agent loop is just this: the LLM reads your message and decides whether it can answer directly or needs help. If it needs help, it emits a tool call — a structured request like get_weather("Tokyo"). Your code runs that tool, hands the result back, and the model continues. Wrap the whole exchange in memory and the agent can reason across turns instead of starting cold every time.
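To make "structured request" concrete, here is a minimal plain-Python sketch of what running one tool call involves. The dictionary shape (a name plus args), the TOOLS registry, and the run_tool_call helper are all assumptions made for illustration; they are not LangChain's types, and each model provider has its own wire format.

```python
# Illustrative only: a hand-rolled tool registry, not LangChain's internals.

def get_weather(city: str) -> str:
    """Get the current weather for a city."""
    return f"It's 22°C and sunny in {city}."  # canned result, no real API call

# Hypothetical registry mapping tool names to the functions they run.
TOOLS = {"get_weather": get_weather}

def run_tool_call(tool_call: dict) -> str:
    """Look up the requested tool by name and call it with the given arguments."""
    fn = TOOLS[tool_call["name"]]
    return str(fn(**tool_call["args"]))

# What the model emits is data, not a function call: a name plus arguments.
tool_call = {"name": "get_weather", "args": {"city": "Tokyo"}}
result = run_tool_call(tool_call)
print(result)
```

The point of the sketch is the split of responsibilities: the model only names a tool and supplies arguments, and your code is what actually executes it.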
1. The LLM and the tools
Start with the model and the tools it's allowed to use. In LangChain a tool is just a function with the @tool decorator — the docstring matters, because that's how the model learns what the tool does and when to reach for it.
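One way to see why the docstring carries so much weight: the description the model receives is derived from the function itself. The sketch below imitates that idea in plain Python with inspect. The describe_tool helper is hypothetical, written only for this illustration, and the schema LangChain actually builds is richer than this.

```python
import inspect

def describe_tool(fn) -> dict:
    """Build a rough tool description from a function's name, docstring, and type hints."""
    params = {
        name: getattr(p.annotation, "__name__", str(p.annotation))
        for name, p in inspect.signature(fn).parameters.items()
    }
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "parameters": params,
    }

def add(a: int, b: int) -> int:
    """Add two numbers together."""
    return a + b

spec = describe_tool(add)
print(spec)
```

Everything the model would know about add is in that dictionary, so a vague docstring leaves it guessing. The LangChain version of the model and tools follows.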
from langchain_openai import ChatOpenAI
from langchain_core.tools import tool

# The reasoner
llm = ChatOpenAI(model="gpt-4o", temperature=0)

# The hands: the docstring is the model's instruction manual
@tool
def get_weather(city: str) -> str:
    """Get the current weather for a city."""
    return f"It's 22°C and sunny in {city}."

@tool
def add(a: int, b: int) -> int:
    """Add two numbers together."""
    return a + b

tools = [get_weather, add]

2. Wiring the agent
Next, give the model a prompt with two special slots: one for chat_history (memory drops in here) and one for agent_scratchpad (where the model's in-progress tool calls live). Then create_tool_calling_agent binds the LLM, the tools, and the prompt together, and AgentExecutor runs the call-a-tool-then-continue loop for you.
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.agents import create_tool_calling_agent, AgentExecutor

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant. Use tools when they help."),
    MessagesPlaceholder("chat_history"),      # <- memory goes here
    ("human", "{input}"),
    MessagesPlaceholder("agent_scratchpad"),  # <- tool calls go here
])

agent = create_tool_calling_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

3. Adding memory
The executor above is stateless — it forgets everything between calls. To give it memory, wrap it in RunnableWithMessageHistory. You supply a function that returns a message history for a given session_id, and LangChain automatically reads past turns in and writes new ones out. Here the store is an in-memory dict; in production you'd back it with Redis or a database.
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

store = {}  # swap for Redis/DB in production

def get_history(session_id: str) -> ChatMessageHistory:
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]

agent = RunnableWithMessageHistory(
    executor,
    get_history,
    input_messages_key="input",
    history_messages_key="chat_history",
)

Running it
Now invoke it with a session_id. Reuse the same id and the agent remembers — the third message below works only because the first two are still in memory.
config = {"configurable": {"session_id": "user-1"}}

agent.invoke({"input": "What's the weather in Tokyo?"}, config)
# -> calls get_weather("Tokyo"), answers with the result

agent.invoke({"input": "Now add 21 and 21."}, config)
# -> calls add(21, 21), answers "42"

agent.invoke({"input": "What was the first thing I asked you?"}, config)
# -> "You asked about the weather in Tokyo." (only works because of memory)

An agent is just an LLM that's allowed to call your functions, with the conversation handed back to it each turn. Everything else is plumbing.
How a single turn actually flows
When you call agent.invoke, here's what happens under the hood. The history wrapper handles the memory steps and the executor runs the tool loop in between:
- Memory loads past messages into chat_history; your input is appended.
- The LLM responds either with a final answer or with one or more tool calls.
- If it's a tool call, the executor runs the matching function and feeds the result back into agent_scratchpad.
- The model sees the tool result and loops again, until it produces a final answer.
- That answer (and the input) get written back to memory for next time.
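The steps above can be sketched as a toy loop in plain Python. Nothing here is LangChain code: fake_model is a scripted stand-in for the LLM, HISTORY is a plain list standing in for memory, and run_turn and its max_steps cap are names and choices made up for this illustration.

```python
# A toy version of one agent turn. Every piece is a stand-in, not a LangChain object.

def add(a: int, b: int) -> int:
    """Add two numbers together."""
    return a + b

TOOLS = {"add": add}
HISTORY = []  # (role, text) pairs kept across turns

def fake_model(history, user_input, scratchpad):
    """Scripted stand-in for the LLM: request a tool first, then answer from its result."""
    if not scratchpad:
        return {"tool_call": {"name": "add", "args": {"a": 21, "b": 21}}}
    _, last_result = scratchpad[-1]
    return {"final": f"The answer is {last_result}."}

def run_turn(user_input: str, max_steps: int = 5) -> str:
    past = list(HISTORY)  # load past messages
    scratchpad = []       # in-progress tool calls and their results
    for _ in range(max_steps):
        reply = fake_model(past, user_input, scratchpad)
        if "final" in reply:
            # Final answer: write the input and the answer back to memory.
            HISTORY.append(("human", user_input))
            HISTORY.append(("ai", reply["final"]))
            return reply["final"]
        # Tool call: run the matching function and record the result,
        # so the model sees it on the next pass through the loop.
        call = reply["tool_call"]
        result = TOOLS[call["name"]](**call["args"])
        scratchpad.append((call, result))
    raise RuntimeError("no final answer within max_steps")

answer = run_turn("Now add 21 and 21.")
print(answer)
```

Note that the scratchpad is thrown away at the end of the turn while HISTORY persists; that is the difference between in-progress working state and memory. The step cap is a safety choice in this sketch so a model that keeps requesting tools cannot loop forever.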
What to watch for
Two traps. First, vague tool docstrings — the model decides which tool to call from the docstring, so “does stuff with numbers” will get you wrong calls; be specific. Second, unbounded memory — naive history grows every turn until you blow the context window and your bill. For long conversations, trim or summarize old turns instead of replaying all of them.
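For the second trap, here is a rough sketch of the simplest trimming policy: keep the system message and only the most recent messages. It works on plain (role, text) tuples rather than LangChain message objects, trim_history is a hypothetical helper, and counting messages instead of tokens is a simplifying assumption; a real budget would be measured in tokens.

```python
# Sketch of "trim old turns": keep system messages plus the most recent messages.
# Messages are plain (role, text) tuples here, not LangChain message objects.

def trim_history(messages, max_messages=4):
    """Return system messages plus the last `max_messages` non-system messages."""
    system = [m for m in messages if m[0] == "system"]
    rest = [m for m in messages if m[0] != "system"]
    recent = rest[-max_messages:] if max_messages > 0 else []
    return system + recent

# Build a five-turn conversation to trim.
history = [("system", "You are a helpful assistant.")]
for i in range(1, 6):
    history.append(("human", f"question {i}"))
    history.append(("ai", f"answer {i}"))

trimmed = trim_history(history, max_messages=4)
print(trimmed)
```

Using an even max_messages keeps human and AI messages paired, so the model never sees an answer without its question. The cost is that anything older than the window is gone, which is why summarizing old turns is the usual next step.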