Build a Working AI Agent With LangChain in 5 Steps
Learn how to build an AI agent with LangChain and LangGraph that uses tools, reasons through problems, and remembers conversations — complete with working Python code you can run in 30 minutes.

What if you could build an AI agent — one that reasons, picks up tools, and remembers past conversations — in under 100 lines of Python?
That's exactly what we're doing today. LangChain has become the go-to framework for building AI agents, and for good reason. As of April 2, 2026, its ecosystem (including LangGraph for agent orchestration) gives you everything you need to go from zero to a working agent in about 30 minutes. No PhD required.
You build an AI agent with LangChain by combining an LLM with custom tools using LangGraph's create_react_agent function. The agent reasons through problems using the ReAct pattern — it thinks about the question, selects the right tool, observes the result, and repeats until it has an answer. That's the short version. Here's the full walkthrough.
By the end of this guide, you'll have a Python-based AI agent that:
- Searches the web for current information
- Does math with a custom calculator tool
- Remembers earlier messages in the conversation
The agent follows the ReAct (Reasoning + Acting) pattern. Think of it like handing an LLM a toolbox and a decision-making framework — it examines the problem, picks the right tool, uses it, checks the result, and repeats until it has an answer.
Before we start, make sure you have:
- Python 3.10 or later installed
- An OpenAI API key
- A Tavily API key (free tier is fine)
That's it. You don't need a GPU, a cloud server, or prior experience with AI frameworks.
Create a new project directory and virtual environment:
```shell
mkdir langchain-agent && cd langchain-agent
python -m venv venv
source venv/bin/activate   # Windows: venv\Scripts\activate
```
Install the required packages:
```shell
pip install langchain langchain-openai langgraph langchain-community tavily-python
```
Here's what each package does:
- langchain — the core framework
- langchain-openai — the OpenAI chat model integration
- langgraph — agent orchestration (the reasoning loop)
- langchain-community — community integrations, including the Tavily search tool
- tavily-python — the client library the Tavily tool uses under the hood
Set your API keys as environment variables:
```shell
export OPENAI_API_KEY="your-openai-key-here"
export TAVILY_API_KEY="your-tavily-key-here"
```
You can grab a free Tavily API key at tavily.com — the free tier gives you 1,000 searches per month, which is plenty for development.
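Since a missing key only surfaces as a confusing API error deep inside the agent, it can help to fail fast at startup. Here's a small sketch (the `require_env` helper is my own, not part of LangChain; the demo values are placeholders so the snippet runs standalone):

```python
import os

def require_env(*names: str) -> None:
    """Raise immediately if any named environment variable is missing or empty."""
    missing = [n for n in names if not os.environ.get(n)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")

# Demo values so the check passes here; in real use, export your actual keys.
os.environ.setdefault("OPENAI_API_KEY", "sk-demo")
os.environ.setdefault("TAVILY_API_KEY", "tvly-demo")

require_env("OPENAI_API_KEY", "TAVILY_API_KEY")
```

Drop the `setdefault` lines in your own project — they exist only so this snippet runs without real keys.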
Tools are what separate an agent from a chatbot. A chatbot can only talk. An agent can do things.

Create a file called agent.py and start with your tool definitions:
```python
from langchain_core.tools import tool
from langchain_community.tools.tavily_search import TavilySearchResults

# Built-in web search tool
search = TavilySearchResults(max_results=3)

@tool
def calculate(expression: str) -> str:
    """Evaluate a mathematical expression and return the result.

    Args:
        expression: A valid Python math expression like '2 + 2' or '(15 * 3) / 4.5'
    """
    try:
        result = eval(expression, {"__builtins__": {}}, {})
        return f"Result: {result}"
    except Exception as e:
        return f"Error: {str(e)}"

@tool
def get_current_date() -> str:
    """Return today's date."""
    from datetime import date
    return str(date.today())

tools = [search, calculate, get_current_date]
```
A few things to notice. The @tool decorator turns any Python function into something an LLM can call. The docstring matters a lot — it's what the agent reads to decide when to use each tool. Vague docstrings lead to bad tool selection. Be specific.
And yes, we're using eval() for the calculator with builtins disabled for safety. For a production agent, you'd want something more locked down — but this works for learning.
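One way to lock it down — a sketch, not the only option — is to parse the expression with Python's ast module and evaluate only arithmetic nodes, so there's no code execution path at all:

```python
import ast
import operator

# Whitelist of arithmetic operators; anything else is rejected.
_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.Pow: operator.pow, ast.Mod: operator.mod,
    ast.USub: operator.neg, ast.UAdd: operator.pos,
}

def safe_calculate(expression: str) -> float:
    """Evaluate an arithmetic expression without eval()."""
    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.operand))
        raise ValueError("Unsupported expression")
    return _eval(ast.parse(expression, mode="eval"))

print(safe_calculate("(15 * 3) / 4.5"))  # → 10.0
```

Function calls, attribute access, and names all fall through to the `ValueError`, so `__import__('os')` and friends are rejected outright.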
Tools are just Python functions with good docstrings. If you can write a function, you can give your agent a new ability.
Now for the exciting part. LangGraph's create_react_agent handles all the orchestration — the reasoning loop, tool calling, and response formatting — in a single function call:
```python
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

# Initialize the LLM
llm = ChatOpenAI(model="gpt-4o", temperature=0)

# Create the agent
agent = create_react_agent(llm, tools)

# Run it
response = agent.invoke(
    {"messages": [("user", "What's the population of Tokyo multiplied by 2?")]}
)

# Print the final response
for message in response["messages"]:
    print(f"{message.type}: {message.content}")
```
That's your entire agent. Under the hood, create_react_agent builds a graph where the LLM node can either respond directly or call a tool. If it calls a tool, the result feeds back into the LLM, which decides whether to call another tool or give a final answer.
The ReAct loop is beautifully simple: Think → Act → Observe → Repeat. LangGraph just makes the plumbing disappear.
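Stripped of the framework, that loop looks roughly like this toy sketch — the "LLM" here is a scripted stub that requests one tool call and then answers, which is nothing like LangGraph's real implementation but shows the Think → Act → Observe shape:

```python
# Toy ReAct loop. The fake LLM asks for a tool until it has seen a
# tool result, then produces a final answer from that observation.
def fake_llm(messages):
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "calculate", "args": "2 + 2"}              # decide to Act
    return {"answer": f"The result is {messages[-1]['content']}"}  # final answer

tools = {"calculate": lambda expr: str(eval(expr, {"__builtins__": {}}, {}))}

def react_loop(question: str, max_steps: int = 5) -> str:
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        decision = fake_llm(messages)                        # Think
        if "answer" in decision:
            return decision["answer"]
        result = tools[decision["tool"]](decision["args"])   # Act
        messages.append({"role": "tool", "content": result}) # Observe
    return "Gave up after too many steps."

print(react_loop("What is 2 + 2?"))  # → The result is 4
```

The `max_steps` cap matters even in the toy version: without it, a model that never produces a final answer loops forever, which is why real frameworks enforce a recursion limit too.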

Want to use Claude instead of GPT-4o? Just swap the model:
```python
# pip install langchain-anthropic
from langchain_anthropic import ChatAnthropic

llm = ChatAnthropic(model="claude-sonnet-4-6", temperature=0)
agent = create_react_agent(llm, tools)
```
As of April 2, 2026, Claude Opus 4.6 scores 71.1% on SWE-bench Verified (self-reported by Anthropic), making Anthropic's lineup a strong choice for agents that reason about code. But Sonnet 4.6 hits a sweet spot for most agent use cases: strong performance at $3/$15 per million tokens (input/output), compared to GPT-4o's $2.50/$10. The price difference is modest, and both work well for tool-calling tasks.
You can also use Google's Gemini models via langchain-google-genai or run local models through Ollama. The LangChain docs list every supported provider.
Without memory, your agent forgets everything between calls. That's fine for one-shot questions, but useless for anything resembling a real conversation.
LangGraph handles memory through checkpointers — objects that save and restore conversation state:
```python
from langgraph.checkpoint.memory import MemorySaver

# Create a memory-backed checkpointer
memory = MemorySaver()

# Rebuild the agent with memory
agent = create_react_agent(llm, tools, checkpointer=memory)

# Use thread_id to maintain separate conversations
config = {"configurable": {"thread_id": "session-1"}}

# First message
response1 = agent.invoke(
    {"messages": [("user", "My name is Alex and I'm building a weather app.")]},
    config
)

# Second message — the agent remembers the first
response2 = agent.invoke(
    {"messages": [("user", "What did I say I was building?")]},
    config
)
# Agent responds: "You said you're building a weather app."
```
MemorySaver stores everything in RAM — perfect for development. For production, switch to a persistent backend:
```python
# pip install langgraph-checkpoint-sqlite
import sqlite3
from langgraph.checkpoint.sqlite import SqliteSaver

# Note: SqliteSaver.from_conn_string() is a context manager in recent
# versions, so for a long-lived agent, construct it from a connection.
conn = sqlite3.connect("agent_memory.db", check_same_thread=False)
memory = SqliteSaver(conn)
agent = create_react_agent(llm, tools, checkpointer=memory)
```
So now your agent picks up conversations where they left off, even after a restart. Different thread_id values keep separate users' conversations isolated from each other.
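The thread_id mechanism is easy to picture as a dictionary of per-thread message histories. Here's a toy model of the idea — my own sketch, not LangGraph's actual data structures:

```python
from collections import defaultdict

# Toy checkpointer: one message history per thread_id.
store = defaultdict(list)

def invoke(thread_id: str, user_message: str) -> str:
    history = store[thread_id]              # restore saved state for this thread
    history.append(("user", user_message))
    reply = f"(seen {len(history)} messages in this thread)"
    history.append(("assistant", reply))    # persist the new state
    return reply

invoke("session-1", "My name is Alex.")
print(invoke("session-1", "What's my name?"))  # → (seen 3 messages in this thread)
print(invoke("session-2", "Hello"))            # → (seen 1 messages in this thread)
```

Same function, different thread_id, completely separate history — that's all the isolation guarantee amounts to.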
Here's the complete working agent in one file:
```python
from langchain_openai import ChatOpenAI
from langchain_core.tools import tool
from langchain_community.tools.tavily_search import TavilySearchResults
from langgraph.prebuilt import create_react_agent
from langgraph.checkpoint.memory import MemorySaver

# --- Tools ---
search = TavilySearchResults(max_results=3)

@tool
def calculate(expression: str) -> str:
    """Evaluate a mathematical expression and return the result."""
    try:
        result = eval(expression, {"__builtins__": {}}, {})
        return f"Result: {result}"
    except Exception as e:
        return f"Error: {str(e)}"

# --- Agent ---
llm = ChatOpenAI(model="gpt-4o", temperature=0)
memory = MemorySaver()
agent = create_react_agent(llm, [search, calculate], checkpointer=memory)

# --- Run ---
def chat(message: str, thread_id: str = "default"):
    config = {"configurable": {"thread_id": thread_id}}
    response = agent.invoke({"messages": [("user", message)]}, config)
    return response["messages"][-1].content

if __name__ == "__main__":
    print("Agent ready. Type 'quit' to exit.\n")
    while True:
        user_input = input("You: ")
        if user_input.lower() == "quit":
            break
        print(f"Agent: {chat(user_input)}\n")
```
Run it:
```shell
python agent.py
```
Try asking things like:
- "What's (15 * 3) / 4.5?" — exercises the calculator tool
- "What's the population of Tokyo multiplied by 2?" — chains search and math
- "My name is Alex", then in a follow-up, "What's my name?" — exercises memory

After building agents for a while, certain mistakes come up over and over. Here's what to watch for:
1. Vague tool descriptions. If your tool's docstring says "does stuff," the agent won't know when to call it. Be explicit about inputs, outputs, and purpose.
2. Too many tools. LLMs get confused when given 20+ tools at once. Start with 3–5 and add more only when needed.
3. No error handling in tools. If a tool throws an unhandled exception, the entire agent crashes. Always wrap tool logic in try/except and return a meaningful error message — the agent can often recover if you tell it what went wrong.
4. Ignoring token costs. Every tool call generates extra tokens. A search tool that returns 10 full web pages will burn through your API budget fast. Limit result sizes.
5. Skipping temperature=0. For agents, you almost always want temperature=0. Higher temperatures make tool-calling unreliable — the LLM might hallucinate function names or pass malformed arguments.
The biggest mistake beginners make? Building too much, too fast. Start with one tool. Get it working perfectly. Then add another.
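Point 3 deserves code. One pattern — a sketch of my own, not a LangChain API — is a decorator that converts any exception into an error string the model can read and react to:

```python
import functools

def safe_tool(fn):
    """Wrap a tool so exceptions become readable error strings
    instead of crashing the agent loop."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        try:
            return fn(*args, **kwargs)
        except Exception as e:
            return f"Error in {fn.__name__}: {e}"
    return wrapper

@safe_tool
def divide(a: float, b: float) -> str:
    return f"Result: {a / b}"

print(divide(10, 0))  # → Error in divide: division by zero
```

Because `functools.wraps` preserves the function's name and docstring, the wrapper plays nicely with decorators like `@tool` that read that metadata.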
Don't just eyeball the output. Run these basic checks before calling your agent "done":
- Ask a question that requires each tool, and confirm the agent actually calls it
- Send two related messages on the same thread_id, and confirm the second answer uses the first
- Feed a tool deliberately bad input, and confirm the agent recovers instead of crashing
And if a tool keeps getting ignored, the fix is almost always the docstring. Rewrite it to be more specific about when the tool should be used.
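To make the "was the tool actually called" check concrete, here's a small helper sketched over plain message dicts (LangChain's real message objects expose tool calls differently, so treat this as the shape of the check, not a drop-in):

```python
def tool_was_called(messages, tool_name: str) -> bool:
    """Return True if any message in the transcript records a call to the
    named tool. Assumes dicts with an optional 'tool_calls' list."""
    return any(
        call.get("name") == tool_name
        for m in messages
        for call in m.get("tool_calls", [])
    )

transcript = [
    {"role": "user", "content": "What is 2 + 2?"},
    {"role": "assistant", "tool_calls": [{"name": "calculate",
                                          "args": {"expression": "2 + 2"}}]},
    {"role": "tool", "content": "Result: 4"},
]

assert tool_was_called(transcript, "calculate")
assert not tool_was_called(transcript, "get_current_date")
```

Run a check like this over a handful of representative queries per tool; if one tool never shows up, rewrite its docstring first.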
You've got a working agent. Here's where to go from here:
- Add more tools — anything you can write as a Python function qualifies
- Swap MemorySaver for a persistent checkpointer like SqliteSaver
- Try another provider, or run a local model through Ollama
- Wrap the agent in a FastAPI endpoint and deploy it
As of April 2, 2026, the LangChain ecosystem is moving fast. LangGraph gets new features monthly, and the community-maintained tool integrations keep growing. So bookmark the docs and check back often — what's possible today will look quaint in six months.
Frequently Asked Questions
Can I run LangChain agents with local models?
Yes. You can run LangChain agents with local models through Ollama, which supports Llama 4 Maverick, Mistral Large 2, and other open-weight models. Install langchain-ollama, then use ChatOllama(model='llama4') as your LLM. Performance depends on your hardware — you'll want at least 16GB of RAM for 7B-parameter models and a decent GPU for anything larger.
How much does it cost to run a LangChain agent?
A typical agent query that involves one search and one LLM call costs roughly $0.01–$0.05 with GPT-4o ($2.50/$10 per million tokens). Costs scale with the number of tool calls per query — an agent that chains 5 tools will cost 3–5x more than a single-tool call. Claude Sonnet 4.6 at $3/$15 per million tokens is comparable. For high-volume use cases, consider caching repeated searches and limiting max_results on search tools.
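That per-query range is easy to reproduce with a back-of-envelope calculation. The token counts below are illustrative assumptions; the prices are the per-million figures quoted above:

```python
def query_cost(input_tokens: int, output_tokens: int,
               price_in_per_m: float, price_out_per_m: float) -> float:
    """Dollar cost of one LLM call at per-million-token prices."""
    return (input_tokens / 1e6 * price_in_per_m
            + output_tokens / 1e6 * price_out_per_m)

# One search + one LLM call: say ~3,000 input tokens (prompt plus
# search results) and ~500 output tokens, at GPT-4o's $2.50/$10.
cost = query_cost(3_000, 500, 2.50, 10.00)
print(f"${cost:.4f}")  # → $0.0125
```

Notice the search results dominate the input side — which is exactly why limiting max_results is the cheapest optimization available.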
How do LangChain agents compare to AutoGPT?
LangChain agents (via LangGraph) give you full control over the reasoning loop, tool definitions, and state management — you define exactly what the agent can do. AutoGPT is a fully autonomous agent that generates its own goals and subtasks with minimal human oversight. For production applications, LangChain agents are generally more reliable because you control the scope. AutoGPT is better for open-ended experimentation.
Can the agent call multiple tools in parallel?
Yes. LangGraph supports parallel tool execution when the LLM requests multiple tool calls in a single step. Models like GPT-4o and Claude Sonnet 4.6 support parallel tool calling natively — the agent sends multiple tool requests, LangGraph executes them concurrently, and the results all feed back to the LLM at once. This can significantly speed up agents that need to gather information from several sources.
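Conceptually, executing a batch of tool requests concurrently looks like this stdlib sketch (not LangGraph's actual executor — just the shape of the speedup):

```python
from concurrent.futures import ThreadPoolExecutor
import time

def slow_search(query: str) -> str:
    time.sleep(0.1)  # simulate network latency of one tool call
    return f"results for {query!r}"

requests = ["weather in Tokyo", "weather in Paris", "weather in NYC"]

start = time.perf_counter()
with ThreadPoolExecutor() as pool:
    results = list(pool.map(slow_search, requests))  # run all three at once
elapsed = time.perf_counter() - start

print(results)
print(f"3 calls finished in ~{elapsed:.2f}s instead of ~0.30s sequentially")
```

Threads are the right fit here because tool calls are I/O-bound: the interpreter releases the GIL while each call waits on the network.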
How do I deploy a LangChain agent?
The fastest path is wrapping your agent in a FastAPI endpoint. Create a POST route that accepts a message and thread_id, calls agent.invoke(), and returns the response. For production, LangGraph Platform offers managed deployment with streaming, input validation, and built-in monitoring. You can deploy to any platform that runs Python — AWS Lambda, Google Cloud Run, Railway, or a simple VPS.