AI Tools · 12 min read · March 23, 2026

LangChain: The Developer's Framework for Building Real LLM Applications

A practical guide to LangChain — the Python framework powering thousands of production AI apps. Setup, core concepts, 5 must-know features, and when to use it vs when not to.

#LangChain · #LLM · #Python · #AI Framework · #RAG · #Agents
Neel Shah Tech Lead · Senior Data Engineer · Ottawa

You have an idea for an LLM application. A chatbot that knows your company’s docs. An agent that can search the web and write reports. A pipeline that reads emails, extracts action items, and drafts replies.

You could build all of this from scratch — manage API calls, handle retries, wire together a vector store, build a memory system. Or you could use LangChain, where most of that infrastructure already exists and the community has already solved the problems you are about to hit.

LangChain is an open-source Python framework for building LLM-powered applications. It gives you standardised building blocks — models, chains, agents, memory, tools — that snap together. 131,000+ GitHub stars. MIT license. One of the most widely used LLM application frameworks in production today.

This guide gets you from zero to a working application and then covers the five features that actually matter.


Before We Start: What You Need

LangChain is a Python library. You need Python 3.9 or later.

Requirements
─────────────────────────────────────────
  Python     3.9 or later
  pip        any recent version
  API key    for your chosen LLM provider
             (OpenAI, Anthropic, etc.)
             OR Ollama running locally

That is it. No Docker, no server, no infrastructure to set up before you write your first line.

Check your Python version:

python --version
# Python 3.12.x  ← fine
# Python 3.8.x   ← upgrade needed

Installation

pip install langchain

For specific providers, install the integration package:

# OpenAI
pip install langchain-openai

# Anthropic
pip install langchain-anthropic

# Google
pip install langchain-google-genai

# Ollama (local models)
pip install langchain-ollama

You do not need all of these. Install only the provider you plan to use.


Your First LangChain Application: Step by Step

Let us build something real immediately — a simple Q&A assistant — before getting into theory.

Step 1 — Set your API key

export OPENAI_API_KEY="sk-..."

Or for Ollama (no key needed, just make sure Ollama is running):

ollama serve  # in one terminal

Step 2 — Create a chat model

from langchain.chat_models import init_chat_model

# OpenAI
model = init_chat_model("openai:gpt-4o")

# Anthropic
model = init_chat_model("anthropic:claude-3-5-sonnet-latest")

# Ollama (local)
model = init_chat_model("ollama:llama3.1:8b")

Notice: the only thing that changed between providers is the string argument. Your application code stays the same.

Step 3 — Send a message

from langchain_core.messages import HumanMessage, SystemMessage

messages = [
    SystemMessage("You are a helpful assistant. Answer concisely."),
    HumanMessage("What is the capital of France?"),
]

response = model.invoke(messages)
print(response.content)
# Paris.
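
The return value is an AIMessage, not a bare string. Besides .content it carries provider metadata; a quick way to inspect it (the exact fields vary by provider):

print(type(response).__name__)      # AIMessage
print(response.response_metadata)   # model name, token usage, etc. (provider-dependent)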

Step 4 — Add a prompt template

Hard-coding messages is fine for testing. For real applications, use templates:

from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are an expert in {domain}. Answer concisely."),
    ("human", "{question}"),
])

chain = prompt | model

response = chain.invoke({
    "domain": "machine learning",
    "question": "What is gradient descent?",
})

print(response.content)

The | operator is how LangChain chains components. prompt | model means: run the prompt template first, feed its output into the model.
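Because prompt | model is itself a runnable, the chain exposes the same interface as the model: .invoke() for one input, .batch() for many, .stream() for tokens. A minimal sketch of .batch() on the chain above:

answers = chain.batch([
    {"domain": "databases", "question": "What is an index?"},
    {"domain": "networking", "question": "What does TCP guarantee?"},
])
for a in answers:
    print(a.content)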


How LangChain Is Structured

Before going further, it helps to understand the pieces. LangChain is modular — you only use what you need.

LangChain Ecosystem
─────────────────────────────────────────────────────────
  langchain-core       Base interfaces and types
                       (installed automatically with langchain)

  langchain            High-level chains and agents
                       (pip install langchain)

  langchain-community  Community integrations
                       (vector stores, loaders, tools)

  langchain-openai     OpenAI integration
  langchain-anthropic  Anthropic integration
  langchain-ollama     Ollama integration
  (etc.)               Install only what you need

  LangGraph            Agent orchestration (separate package)
  LangSmith            Monitoring and evaluation (cloud service)
─────────────────────────────────────────────────────────

You will not use all of this at once. Start with langchain and the provider package for your LLM. Add others as you need them.


The 5 Must-Know Features

Feature 1: Model Interoperability — Swap LLMs Without Rewriting Your App

This is the most practically useful thing LangChain does. Every LLM provider — OpenAI, Anthropic, Google, Ollama, Cohere, and dozens more — implements the same interface.

from langchain.chat_models import init_chat_model

# These all work identically in your application
model = init_chat_model("openai:gpt-4o")
model = init_chat_model("anthropic:claude-3-5-sonnet-latest")
model = init_chat_model("google_genai:gemini-2.0-flash")
model = init_chat_model("ollama:llama3.1:8b")

# The rest of your code never changes
response = model.invoke(messages)

In practice, this means:

Development  → use Ollama (free, local, no API costs)
Testing      → use the same code, swap to OpenAI
Production   → swap to whatever provider has best cost/performance
Fallback     → if one provider goes down, change one string

You are not locked in. You can test all four providers with the same 20 lines of code and pick the best one for your use case.

Why this matters: API costs, latency, rate limits, and model quality change constantly. Being able to swap providers without rewriting your application is worth more than it sounds.
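
A hedged sketch of what this enables: selecting the provider from configuration rather than code. APP_MODEL is a hypothetical environment variable, not a LangChain convention.

import os
from langchain.chat_models import init_chat_model

# Hypothetical pattern: the model string comes from config, so swapping
# providers is a deployment change, not a code change
model_id = os.environ.get("APP_MODEL", "ollama:llama3.1:8b")  # free local default
model = init_chat_model(model_id)

response = model.invoke("Say hello in one word.")
print(response.content)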


Feature 2: Composable Chains With the Pipe Operator

LangChain’s core abstraction is the chain — a sequence of components where the output of one feeds the input of the next. The | operator makes this readable.

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain.chat_models import init_chat_model

model = init_chat_model("openai:gpt-4o")
parser = StrOutputParser()

# Simple chain: prompt → model → parse output as string
prompt = ChatPromptTemplate.from_template("Explain {topic} in one paragraph.")
chain = prompt | model | parser

# Multi-step chain: summarise then translate
summarise_prompt = ChatPromptTemplate.from_template(
    "Summarise this in 3 bullet points: {text}"
)
translate_prompt = ChatPromptTemplate.from_template(
    "Translate this to French: {summary}"
)

full_chain = (
    summarise_prompt
    | model
    | parser
    | (lambda summary: {"summary": summary})
    | translate_prompt
    | model
    | parser
)

result = full_chain.invoke({"text": "Long article text here..."})

These chains are lazy — nothing runs until you call .invoke(). They are also streamable — call .stream() instead of .invoke() to get tokens as they arrive.
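
For example, streaming the multi-step chain above is a one-line change. Each chunk arrives as a string because the chain ends in StrOutputParser:

for chunk in full_chain.stream({"text": "Long article text here..."}):
    print(chunk, end="", flush=True)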

Why this matters: Complex LLM workflows (summarise → rewrite → classify → extract) become readable and maintainable code rather than nested function calls.


Feature 3: RAG Pipelines With Document Loaders and Vector Stores

LangChain has first-class support for building Retrieval-Augmented Generation systems. The pieces fit together in a predictable pattern.

RAG pipeline in LangChain
─────────────────────────────────────────────────────────

  Indexing (run once):
    Documents → Loader → Splitter → Embedder → Vector Store

  Retrieval (at query time):
    Question → Embedder → Vector Store search → Chunks → LLM → Answer
─────────────────────────────────────────────────────────

Here is a working RAG pipeline in about 30 lines:

# Extra dependencies: pip install langchain-community pypdf chromadb
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate
from langchain.chat_models import init_chat_model

# Load and split
loader = PyPDFLoader("your-document.pdf")
docs = loader.load()
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_documents(docs)

# Embed and store
embeddings = OpenAIEmbeddings()
vectorstore = Chroma.from_documents(chunks, embeddings)
retriever = vectorstore.as_retriever()

# Build RAG chain
model = init_chat_model("openai:gpt-4o")
prompt = ChatPromptTemplate.from_template("""
Answer the question using only the context below.
Context: {context}
Question: {input}
""")

question_answer_chain = create_stuff_documents_chain(model, prompt)
rag_chain = create_retrieval_chain(retriever, question_answer_chain)

# Query
result = rag_chain.invoke({"input": "What is the refund policy?"})
print(result["answer"])

Swap Chroma for Pinecone, Weaviate, Qdrant, or any of 50+ supported vector stores. Swap PyPDFLoader for WebBaseLoader, NotionDBLoader, ConfluenceLoader, and so on.

Why this matters: The hardest part of RAG is not the LLM call — it is wiring together loaders, splitters, embedders, and stores. LangChain has pre-built connectors for almost every piece you will need.
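
A useful debugging habit: create_retrieval_chain returns the retrieved chunks alongside the answer under the "context" key, so you can check what the model was actually shown:

# Inspect which chunks the retriever fed to the model
for doc in result["context"]:
    print(doc.metadata.get("page"), "-", doc.page_content[:80])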


Feature 4: Agents With Tool Calling

An agent is an LLM that decides what to do next. It has a set of tools — functions it can call — and it figures out which tools to use and in what order.

# Extra dependency: pip install duckduckgo-search
from langchain.agents import create_tool_calling_agent, AgentExecutor
from langchain_community.tools import DuckDuckGoSearchRun
from langchain_core.prompts import ChatPromptTemplate
from langchain.chat_models import init_chat_model

model = init_chat_model("openai:gpt-4o")
tools = [DuckDuckGoSearchRun()]

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful research assistant."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

agent = create_tool_calling_agent(model, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

result = executor.invoke({
    "input": "What happened in AI news this week? Give me 3 highlights."
})
print(result["output"])

The agent decides whether to search (and what to search for), how many times, and when it has enough information to answer. You do not hardcode the decision logic.

Agent execution trace (verbose=True)
──────────────────────────────────────────────
  > Thought: I need to search for recent AI news
  > Action: DuckDuckGoSearch("AI news March 2026")
  > Observation: [search results]
  > Thought: I have enough for 3 highlights
  > Final Answer: "1. ... 2. ... 3. ..."
──────────────────────────────────────────────

Why this matters: The step from “LLM that answers questions” to “LLM that takes actions” is enormous. Agents make it possible to build tools that actually do things — search, calculate, write files, call APIs — not just answer.
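
Agents are not limited to pre-built tools. The @tool decorator from langchain_core.tools turns a typed Python function with a docstring into a tool the agent can call. A minimal sketch, where word_count is an illustrative function, not a built-in:

from langchain_core.tools import tool

@tool
def word_count(text: str) -> int:
    """Count the number of words in a piece of text."""
    return len(text.split())

# Add it alongside the search tool; the docstring tells the agent when to use it
tools = [DuckDuckGoSearchRun(), word_count]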


Feature 5: Memory and Conversation History

By default, LLMs are stateless — every call is independent. LangChain’s memory abstractions fix this.

from langchain_core.chat_history import InMemoryChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.chat_models import init_chat_model

model = init_chat_model("openai:gpt-4o")

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    MessagesPlaceholder(variable_name="history"),
    ("human", "{input}"),
])

chain = prompt | model

# Session store
store = {}
def get_history(session_id: str) -> InMemoryChatMessageHistory:
    if session_id not in store:
        store[session_id] = InMemoryChatMessageHistory()
    return store[session_id]

chain_with_history = RunnableWithMessageHistory(
    chain,
    get_history,
    input_messages_key="input",
    history_messages_key="history",
)

# Conversation
config = {"configurable": {"session_id": "user-123"}}

r1 = chain_with_history.invoke({"input": "My name is Neel."}, config=config)
r2 = chain_with_history.invoke({"input": "What is my name?"}, config=config)
print(r2.content)
# Your name is Neel.

For persistent memory across restarts, swap InMemoryChatMessageHistory for RedisChatMessageHistory or SQLChatMessageHistory.
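
For example, a sketch of the SQLite-backed variant. Only get_history changes; the chain stays the same. Note the constructor argument has varied across versions (older releases use connection_string):

from langchain_community.chat_message_histories import SQLChatMessageHistory

def get_history(session_id: str) -> SQLChatMessageHistory:
    # Persists messages to a local SQLite file instead of process memory
    return SQLChatMessageHistory(
        session_id=session_id,
        connection_string="sqlite:///chat_history.db",
    )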

Why this matters: Almost every real chatbot needs to remember context. Without memory, the conversation restarts with every message.


The LangChain Ecosystem: What Else Is Available

LangChain is not just a library — it is a platform with two major companion tools:

LangChain ecosystem
─────────────────────────────────────────────────────────────
  LangChain (this guide)
    Python framework for building LLM applications

  LangGraph
    Agent orchestration — build multi-agent systems,
    stateful workflows, long-running agents
    pip install langgraph

  LangSmith
    Monitoring, tracing, and evaluation for LLM apps
    Cloud service with a free tier
    Gives you full traces of every LLM call
    Add with: LANGCHAIN_TRACING_V2=true LANGCHAIN_API_KEY=...

  LangChain Academy
    Free structured courses at academy.langchain.com
    Covers basic chains through advanced agentic patterns
─────────────────────────────────────────────────────────────

For production applications, LangSmith is worth setting up early. Debugging LLM pipelines without traces is painful — you cannot see what the model was given or what it returned at each step.
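
Enabling it requires no code changes, only environment variables, which you can also set in-process before building any chains. A sketch, with a placeholder key:

import os

# Must be set before chains run; every subsequent LLM call is traced
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "..."  # placeholder: your LangSmith key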


When to Use LangChain vs When Not To

LangChain is excellent in many situations and wrong for others. Be honest about which one you are in.

Use LangChain when:
  ✓ You need to connect multiple LLM providers
  ✓ You are building RAG over multiple document sources
  ✓ You need conversational memory across sessions
  ✓ You want agent behaviour (the LLM decides what to do)
  ✓ You want to use the ecosystem (LangGraph, LangSmith)
  ✓ You are in Python and want production-ready infrastructure

Do not use LangChain when:
  ✗ You are making a single direct API call
  ✗ You need extremely low latency (the abstractions add overhead)
  ✗ Your use case is fully covered by one provider's SDK
  ✗ You want minimal dependencies and full control
  ✗ You are not in Python (the JS version is separate and less complete)

What Is Great, What Is Good, What Still Needs Work

What is great

The ecosystem is unmatched. 131,000+ GitHub stars is not just a vanity metric — it means thousands of integrations, examples, tutorials, and Stack Overflow answers. Whatever connector you need (S3, Notion, Salesforce, Pinecone, Weaviate, dozens of LLMs), someone has already built it.

Model swapping is genuinely seamless. The init_chat_model interface is the cleanest implementation of provider-agnostic LLM calls available. You can develop on Ollama for free and deploy on OpenAI without changing application code.

LangSmith makes debugging possible. Without tracing, LLM pipeline bugs are nearly impossible to diagnose. LangSmith gives you full call traces, latency breakdowns, and evaluation tools.

What is good

Composability. The pipe operator for chains is elegant. Complex pipelines read like data flow diagrams, not callback hell.

Production tooling. LangSmith, LangGraph, and the Academy represent a serious investment in making LangChain viable for production applications, not just demos.

What still needs work

Local and offline support. The documentation and examples are heavily weighted toward cloud APIs. Ollama works well but feels like a second-class citizen compared to OpenAI in the docs.

Complexity for beginners. There are at least three different ways to do most things in LangChain, reflecting years of API evolution. For a beginner, finding the current recommended approach requires cross-referencing multiple docs pages.

Abstraction overhead. In high-throughput applications, LangChain’s abstractions add latency and memory overhead compared to direct API calls. For simple use cases, the complexity is not worth it.


Summary

LangChain is the practical choice for building LLM applications that need to be real — not just a demo, but something that works across providers, scales, and can be debugged when it breaks.

The five features worth remembering:

Feature                   Why it matters
─────────────────────────────────────────────────────────────
Model interoperability    Same code, any LLM provider
Composable chains         Complex pipelines that stay readable
RAG pipelines             Document loaders + vector stores pre-built
Tool-calling agents       LLM decides what to do, not just what to say
Conversation memory       Stateless model, stateful conversation

To get started:

pip install langchain langchain-openai

Then read the LangChain docs and check out LangChain Academy for free structured courses.

The GitHub repo is at github.com/langchain-ai/langchain. With 131,000+ stars and a thriving ecosystem, it is the foundation most production LLM apps are built on.
