Build a Working AI Agent With LangChain in 5 Steps
Learn how to build an AI agent with LangChain and LangGraph that uses tools, reasons through problems, and remembers conversations — complete with working Python code you can run in 30 minutes.

What if you could build an AI agent — one that reasons, picks up tools, and remembers past conversations — in under 100 lines of Python?
That's exactly what we're doing today. LangChain has become the go-to framework for building AI agents, and for good reason. As of April 2, 2026, its ecosystem (including LangGraph for agent orchestration) gives you everything you need to go from zero to a working agent in about 30 minutes. No PhD required.
You build an AI agent with LangChain by combining an LLM with custom tools using LangGraph's create_react_agent function. The agent reasons through problems using the ReAct pattern — it thinks about the question, selects the right tool, observes the result, and repeats until it has an answer. That's the short version. Here's the full walkthrough.
By the end of this guide, you'll have a Python-based AI agent that:
- Searches the web for current information
- Does math with a custom calculator tool
- Remembers earlier messages in the conversation
The agent follows the ReAct (Reasoning + Acting) pattern. Think of it like handing an LLM a toolbox and a decision-making framework — it examines the problem, picks the right tool, uses it, checks the result, and repeats until it has an answer.
Before we start, make sure you have:
- Python 3.10 or later installed
- An OpenAI API key
- A Tavily API key (free tier is fine)
That's it. You don't need a GPU, a cloud server, or prior experience with AI frameworks.
Create a new project directory and virtual environment:
```shell
mkdir langchain-agent && cd langchain-agent
python -m venv venv
source venv/bin/activate   # Windows: venv\Scripts\activate
```
Install the required packages:
```shell
pip install langchain langchain-openai langgraph langchain-community tavily-python
```
Here's what each package does:
- langchain — the core framework
- langchain-openai — the OpenAI chat model integration
- langgraph — agent orchestration (the reasoning loop)
- langchain-community — community integrations, including the Tavily search tool
- tavily-python — the client library the Tavily tool uses under the hood
Set your API keys as environment variables:
```shell
export OPENAI_API_KEY="your-openai-key-here"
export TAVILY_API_KEY="your-tavily-key-here"
```
You can grab a free Tavily API key at tavily.com — the free tier gives you 1,000 searches per month, which is plenty for development.
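Since a missing key only surfaces as a confusing API error deep inside the agent, it can help to fail fast at startup. Here's a small sketch (the `require_env` helper is my own, not part of LangChain; the demo values are placeholders so the snippet runs standalone):

```python
import os

def require_env(*names: str) -> None:
    """Raise immediately if any named environment variable is missing or empty."""
    missing = [n for n in names if not os.environ.get(n)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")

# Demo values so the check passes here; in real use, export your actual keys.
os.environ.setdefault("OPENAI_API_KEY", "sk-demo")
os.environ.setdefault("TAVILY_API_KEY", "tvly-demo")

require_env("OPENAI_API_KEY", "TAVILY_API_KEY")
```

Drop the `setdefault` lines in your own project — they exist only so this snippet runs without real keys.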
Tools are what separate an agent from a chatbot. A chatbot can only talk. An agent can do things.

Create a file called agent.py and start with your tool definitions:
```python
from langchain_core.tools import tool
from langchain_community.tools.tavily_search import TavilySearchResults

# Built-in web search tool
search = TavilySearchResults(max_results=3)

@tool
def calculate(expression: str) -> str:
    """Evaluate a mathematical expression and return the result.

    Args:
        expression: A valid Python math expression like '2 + 2' or '(15 * 3) / 4.5'
    """
    try:
        result = eval(expression, {"__builtins__": {}}, {})
        return f"Result: {result}"
    except Exception as e:
        return f"Error: {str(e)}"

@tool
def get_current_date() -> str:
    """Return today's date."""
    from datetime import date
    return str(date.today())

tools = [search, calculate, get_current_date]
```
A few things to notice. The @tool decorator turns any Python function into something an LLM can call. The docstring matters a lot — it's what the agent reads to decide when to use each tool. Vague docstrings lead to bad tool selection. Be specific.
And yes, we're using eval() for the calculator with builtins disabled for safety. For a production agent, you'd want something more locked down — but this works for learning.
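One way to lock it down — a sketch, not the only option — is to parse the expression with Python's ast module and evaluate only arithmetic nodes, so there's no code execution path at all:

```python
import ast
import operator

# Whitelist of arithmetic operators; anything else is rejected.
_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.Pow: operator.pow, ast.Mod: operator.mod,
    ast.USub: operator.neg, ast.UAdd: operator.pos,
}

def safe_calculate(expression: str) -> float:
    """Evaluate an arithmetic expression without eval()."""
    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.operand))
        raise ValueError("Unsupported expression")
    return _eval(ast.parse(expression, mode="eval"))

print(safe_calculate("(15 * 3) / 4.5"))  # → 10.0
```

Function calls, attribute access, and names all fall through to the `ValueError`, so `__import__('os')` and friends are rejected outright.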
Tools are just Python functions with good docstrings. If you can write a function, you can give your agent a new ability.
Now for the exciting part. LangGraph's create_react_agent handles all the orchestration — the reasoning loop, tool calling, and response formatting — in a single function call:
```python
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

# Initialize the LLM
llm = ChatOpenAI(model="gpt-4o", temperature=0)

# Create the agent
agent = create_react_agent(llm, tools)

# Run it
response = agent.invoke(
    {"messages": [("user", "What's the population of Tokyo multiplied by 2?")]}
)

# Print the final response
for message in response["messages"]:
    print(f"{message.type}: {message.content}")
```
That's your entire agent. Under the hood, create_react_agent builds a graph where the LLM node can either respond directly or call a tool. If it calls a tool, the result feeds back into the LLM, which decides whether to call another tool or give a final answer.
The ReAct loop is beautifully simple: Think → Act → Observe → Repeat. LangGraph just makes the plumbing disappear.
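Stripped of the framework, that loop looks roughly like this toy sketch — the "LLM" here is a scripted stub that requests one tool call and then answers, which is nothing like LangGraph's real implementation but shows the Think → Act → Observe shape:

```python
# Toy ReAct loop. The fake LLM asks for a tool until it has seen a
# tool result, then produces a final answer from that observation.
def fake_llm(messages):
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "calculate", "args": "2 + 2"}              # decide to Act
    return {"answer": f"The result is {messages[-1]['content']}"}  # final answer

tools = {"calculate": lambda expr: str(eval(expr, {"__builtins__": {}}, {}))}

def react_loop(question: str, max_steps: int = 5) -> str:
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        decision = fake_llm(messages)                        # Think
        if "answer" in decision:
            return decision["answer"]
        result = tools[decision["tool"]](decision["args"])   # Act
        messages.append({"role": "tool", "content": result}) # Observe
    return "Gave up after too many steps."

print(react_loop("What is 2 + 2?"))  # → The result is 4
```

The `max_steps` cap matters even in the toy version: without it, a model that never produces a final answer loops forever, which is why real frameworks enforce a recursion limit too.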

Want to use Claude instead of GPT-4o? Just swap the model:
```python
# pip install langchain-anthropic
from langchain_anthropic import ChatAnthropic

llm = ChatAnthropic(model="claude-sonnet-4-6", temperature=0)
agent = create_react_agent(llm, tools)
```
As of April 2, 2026, Claude Opus 4.6 scores 71.1% on SWE-bench Verified (self-reported by Anthropic), making Anthropic's lineup a strong choice for agents that reason about code. But Sonnet 4.6 hits a sweet spot for most agent use cases: strong performance at $3/$15 per million tokens (input/output), compared to GPT-4o's $2.50/$10. The price difference is modest, and both work well for tool-calling tasks.
You can also use Google's Gemini models via langchain-google-genai or run local models through Ollama. The LangChain docs list every supported provider.
Without memory, your agent forgets everything between calls. That's fine for one-shot questions, but useless for anything resembling a real conversation.
LangGraph handles memory through checkpointers — objects that save and restore conversation state:
```python
from langgraph.checkpoint.memory import MemorySaver

# Create a memory-backed checkpointer
memory = MemorySaver()

# Rebuild the agent with memory
agent = create_react_agent(llm, tools, checkpointer=memory)

# Use thread_id to maintain separate conversations
config = {"configurable": {"thread_id": "session-1"}}

# First message
response1 = agent.invoke(
    {"messages": [("user", "My name is Alex and I'm building a weather app.")]},
    config
)

# Second message — the agent remembers the first
response2 = agent.invoke(
    {"messages": [("user", "What did I say I was building?")]},
    config
)
# Agent responds: "You said you're building a weather app."
```
MemorySaver stores everything in RAM — perfect for development. For production, switch to a persistent backend:
```python
# pip install langgraph-checkpoint-sqlite
import sqlite3
from langgraph.checkpoint.sqlite import SqliteSaver

# Note: SqliteSaver.from_conn_string() is a context manager in recent
# versions, so for a long-lived agent, construct it from a connection.
conn = sqlite3.connect("agent_memory.db", check_same_thread=False)
memory = SqliteSaver(conn)
agent = create_react_agent(llm, tools, checkpointer=memory)
```
So now your agent picks up conversations where they left off, even after a restart. Different thread_id values keep separate users' conversations isolated from each other.
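The thread_id mechanism is easy to picture as a dictionary of per-thread message histories. Here's a toy model of the idea — my own sketch, not LangGraph's actual data structures:

```python
from collections import defaultdict

# Toy checkpointer: one message history per thread_id.
store = defaultdict(list)

def invoke(thread_id: str, user_message: str) -> str:
    history = store[thread_id]              # restore saved state for this thread
    history.append(("user", user_message))
    reply = f"(seen {len(history)} messages in this thread)"
    history.append(("assistant", reply))    # persist the new state
    return reply

invoke("session-1", "My name is Alex.")
print(invoke("session-1", "What's my name?"))  # → (seen 3 messages in this thread)
print(invoke("session-2", "Hello"))            # → (seen 1 messages in this thread)
```

Same function, different thread_id, completely separate history — that's all the isolation guarantee amounts to.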
Here's the complete working agent in one file:
```python
from langchain_openai import ChatOpenAI
from langchain_core.tools import tool
from langchain_community.tools.tavily_search import TavilySearchResults
from langgraph.prebuilt import create_react_agent
from langgraph.checkpoint.memory import MemorySaver

# --- Tools ---
search = TavilySearchResults(max_results=3)

@tool
def calculate(expression: str) -> str:
    """Evaluate a mathematical expression and return the result."""
    try:
        result = eval(expression, {"__builtins__": {}}, {})
        return f"Result: {result}"
    except Exception as e:
        return f"Error: {str(e)}"

# --- Agent ---
llm = ChatOpenAI(model="gpt-4o", temperature=0)
memory = MemorySaver()
agent = create_react_agent(llm, [search, calculate], checkpointer=memory)

# --- Run ---
def chat(message: str, thread_id: str = "default"):
    config = {"configurable": {"thread_id": thread_id}}
    response = agent.invoke({"messages": [("user", message)]}, config)
    return response["messages"][-1].content

if __name__ == "__main__":
    print("Agent ready. Type 'quit' to exit.\n")
    while True:
        user_input = input("You: ")
        if user_input.lower() == "quit":
            break
        print(f"Agent: {chat(user_input)}\n")
```
Run it:
```shell
python agent.py
```
Try asking things like:
- "What's (15 * 3) / 4.5?" — exercises the calculator tool
- "What's the population of Tokyo multiplied by 2?" — chains search and math
- "My name is Alex", then in a follow-up, "What's my name?" — exercises memory

After building agents for a while, certain mistakes come up over and over. Here's what to watch for:
1. Vague tool descriptions. If your tool's docstring says "does stuff," the agent won't know when to call it. Be explicit about inputs, outputs, and purpose.
2. Too many tools. LLMs get confused when given 20+ tools at once. Start with 3–5 and add more only when needed.
3. No error handling in tools. If a tool throws an unhandled exception, the entire agent crashes. Always wrap tool logic in try/except and return a meaningful error message — the agent can often recover if you tell it what went wrong.
4. Ignoring token costs. Every tool call generates extra tokens. A search tool that returns 10 full web pages will burn through your API budget fast. Limit result sizes.
5. Skipping temperature=0. For agents, you almost always want temperature=0. Higher temperatures make tool-calling unreliable — the LLM might hallucinate function names or pass malformed arguments.
The biggest mistake beginners make? Building too much, too fast. Start with one tool. Get it working perfectly. Then add another.
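Point 3 deserves code. One pattern — a sketch of my own, not a LangChain API — is a decorator that converts any exception into an error string the model can read and react to:

```python
import functools

def safe_tool(fn):
    """Wrap a tool so exceptions become readable error strings
    instead of crashing the agent loop."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        try:
            return fn(*args, **kwargs)
        except Exception as e:
            return f"Error in {fn.__name__}: {e}"
    return wrapper

@safe_tool
def divide(a: float, b: float) -> str:
    return f"Result: {a / b}"

print(divide(10, 0))  # → Error in divide: division by zero
```

Because `functools.wraps` preserves the function's name and docstring, the wrapper plays nicely with decorators like `@tool` that read that metadata.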
Don't just eyeball the output. Run these basic checks before calling your agent "done":
- Ask a question that requires each tool, and confirm the agent actually calls it
- Send two related messages on the same thread_id, and confirm the second answer uses the first
- Feed a tool deliberately bad input, and confirm the agent recovers instead of crashing
And if a tool keeps getting ignored, the fix is almost always the docstring. Rewrite it to be more specific about when the tool should be used.
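To make the "was the tool actually called" check concrete, here's a small helper sketched over plain message dicts (LangChain's real message objects expose tool calls differently, so treat this as the shape of the check, not a drop-in):

```python
def tool_was_called(messages, tool_name: str) -> bool:
    """Return True if any message in the transcript records a call to the
    named tool. Assumes dicts with an optional 'tool_calls' list."""
    return any(
        call.get("name") == tool_name
        for m in messages
        for call in m.get("tool_calls", [])
    )

transcript = [
    {"role": "user", "content": "What is 2 + 2?"},
    {"role": "assistant", "tool_calls": [{"name": "calculate",
                                          "args": {"expression": "2 + 2"}}]},
    {"role": "tool", "content": "Result: 4"},
]

assert tool_was_called(transcript, "calculate")
assert not tool_was_called(transcript, "get_current_date")
```

Run a check like this over a handful of representative queries per tool; if one tool never shows up, rewrite its docstring first.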
You've got a working agent. Here's where to go from here:
- Add more tools — anything you can write as a Python function qualifies
- Swap MemorySaver for a persistent checkpointer like SqliteSaver
- Try another provider, or run a local model through Ollama
- Wrap the agent in a FastAPI endpoint and deploy it
As of April 2, 2026, the LangChain ecosystem is moving fast. LangGraph gets new features monthly, and the community-maintained tool integrations keep growing. So bookmark the docs and check back often — what's possible today will look quaint in six months.
Frequently Asked Questions
Can I run LangChain agents with local models?
Yes. You can run LangChain agents with local models through Ollama, which supports Llama 4 Maverick, Mistral Large 2, and other open-weight models. Install langchain-ollama, then use ChatOllama(model='llama4') as your LLM. Performance depends on your hardware — you'll want at least 16GB of RAM for 7B-parameter models and a decent GPU for anything larger.
How much does it cost to run a LangChain agent?
A typical agent query that involves one search and one LLM call costs roughly $0.01–$0.05 with GPT-4o ($2.50/$10 per million tokens). Costs scale with the number of tool calls per query — an agent that chains 5 tools will cost 3–5x more than a single-tool call. Claude Sonnet 4.6 at $3/$15 per million tokens is comparable. For high-volume use cases, consider caching repeated searches and limiting max_results on search tools.
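That per-query range is easy to reproduce with a back-of-envelope calculation. The token counts below are illustrative assumptions; the prices are the per-million figures quoted above:

```python
def query_cost(input_tokens: int, output_tokens: int,
               price_in_per_m: float, price_out_per_m: float) -> float:
    """Dollar cost of one LLM call at per-million-token prices."""
    return (input_tokens / 1e6 * price_in_per_m
            + output_tokens / 1e6 * price_out_per_m)

# One search + one LLM call: say ~3,000 input tokens (prompt plus
# search results) and ~500 output tokens, at GPT-4o's $2.50/$10.
cost = query_cost(3_000, 500, 2.50, 10.00)
print(f"${cost:.4f}")  # → $0.0125
```

Notice the search results dominate the input side — which is exactly why limiting max_results is the cheapest optimization available.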
How do LangChain agents compare to AutoGPT?
LangChain agents (via LangGraph) give you full control over the reasoning loop, tool definitions, and state management — you define exactly what the agent can do. AutoGPT is a fully autonomous agent that generates its own goals and subtasks with minimal human oversight. For production applications, LangChain agents are generally more reliable because you control the scope. AutoGPT is better for open-ended experimentation.
Can the agent call multiple tools in parallel?
Yes. LangGraph supports parallel tool execution when the LLM requests multiple tool calls in a single step. Models like GPT-4o and Claude Sonnet 4.6 support parallel tool calling natively — the agent sends multiple tool requests, LangGraph executes them concurrently, and the results all feed back to the LLM at once. This can significantly speed up agents that need to gather information from several sources.
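Conceptually, executing a batch of tool requests concurrently looks like this stdlib sketch (not LangGraph's actual executor — just the shape of the speedup):

```python
from concurrent.futures import ThreadPoolExecutor
import time

def slow_search(query: str) -> str:
    time.sleep(0.1)  # simulate network latency of one tool call
    return f"results for {query!r}"

requests = ["weather in Tokyo", "weather in Paris", "weather in NYC"]

start = time.perf_counter()
with ThreadPoolExecutor() as pool:
    results = list(pool.map(slow_search, requests))  # run all three at once
elapsed = time.perf_counter() - start

print(results)
print(f"3 calls finished in ~{elapsed:.2f}s instead of ~0.30s sequentially")
```

Threads are the right fit here because tool calls are I/O-bound: the interpreter releases the GIL while each call waits on the network.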
How do I deploy a LangChain agent?
The fastest path is wrapping your agent in a FastAPI endpoint. Create a POST route that accepts a message and thread_id, calls agent.invoke(), and returns the response. For production, LangGraph Platform offers managed deployment with streaming, input validation, and built-in monitoring. You can deploy to any platform that runs Python — AWS Lambda, Google Cloud Run, Railway, or a simple VPS.