Every AI agent needs tools. Without them, an LLM is just a text generator — it can reason but can't act. The challenge is connecting agents to external systems (databases, APIs, file systems, SaaS platforms) in a way that's standardized, secure, and reusable.
That's what MCP solves. After integrating MCP servers into production agents for procurement matching (BandiFinder), inventory management (Pellemoda), and autonomous RevOps (RevAgent), here's a deep guide to how MCP works and why it matters.
What MCP Is
Model Context Protocol (MCP) is an open standard for connecting AI applications to external systems. Think of it like USB-C for AI — a universal interface that lets any AI application connect to any data source or tool through a single protocol.
Before MCP, every integration was custom. Want your agent to query a database? Write a custom tool. Want it to read files? Another custom tool. Want it to call Slack? Another one. Each integration had its own authentication, error handling, and schema.
MCP standardizes all of this. It's supported by Claude, ChatGPT, VS Code, Cursor, and hundreds of other clients — build once, connect everywhere.
Architecture: Hosts, Clients, and Servers
MCP follows a client-server model with three participants:
┌──────────────────────────────────────────┐
│ MCP Host (AI Application) │
│ │
│ ┌─────────────┐ ┌─────────────┐ │
│ │ MCP Client 1│ │ MCP Client 2│ │
│ └──────┬──────┘ └──────┬──────┘ │
└──────────┼─────────────────┼────────────┘
│ │
┌──────▼──────┐ ┌──────▼──────┐
│ MCP Server A│ │ MCP Server B│
│ (Local DB) │ │ (Remote API)│
└─────────────┘ └─────────────┘
- Host: The AI application (Claude Code, VS Code, your custom agent). It manages one or more MCP clients.
- Client: Maintains a connection to a single MCP server. The host creates one client per server.
- Server: Exposes tools, resources, and/or prompts. Can run locally (stdio) or remotely (HTTP).
Two-Layer Design
MCP consists of two layers:
- Data layer: JSON-RPC 2.0 protocol for lifecycle management, capability negotiation, and the three primitives (tools, resources, prompts)
- Transport layer: Communication mechanism — stdio for local, Streamable HTTP for remote
This separation means the same JSON-RPC messages work across all transports. Swap from local to remote without changing your tool logic.
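Concretely, every data-layer exchange is a JSON-RPC 2.0 message. As an illustrative sketch (the tool name, id, and arguments are hypothetical), a client invoking a tool sends a `tools/call` request:

```json
{
  "jsonrpc": "2.0",
  "id": 3,
  "method": "tools/call",
  "params": {
    "name": "check_stock",
    "arguments": { "product_id": "SKU-1234" }
  }
}
```

The same message works unchanged whether it travels over stdio or Streamable HTTP.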
The Three Primitives
Tools — Model-Controlled Actions
Tools are executable functions that the LLM decides when to invoke. They're the most commonly used primitive — querying databases, calling APIs, sending emails, creating tickets.
Each tool has a JSON Schema defining its inputs:
```python
from fastmcp import FastMCP

mcp = FastMCP("Inventory")

@mcp.tool()
async def check_stock(product_id: str) -> dict:
    """Check current stock level for a product."""
    # `db` is an async connection pool (e.g. asyncpg), initialized elsewhere
    stock = await db.query(
        "SELECT quantity, warehouse FROM inventory WHERE product_id = $1",
        product_id,
    )
    return {"product_id": product_id, "stock": stock}

@mcp.tool()
async def reorder_stock(product_id: str, quantity: int) -> dict:
    """Place a reorder for a product. Requires approval for orders over 1000 units."""
    order = await db.execute(
        "INSERT INTO reorders (product_id, quantity) VALUES ($1, $2) RETURNING id",
        product_id, quantity,
    )
    return {"order_id": order["id"], "status": "placed"}
```

The protocol handles discovery (`tools/list`), execution (`tools/call`), and dynamic updates (`notifications/tools/list_changed`). When your server adds or removes tools at runtime, connected clients are notified automatically.
Tool Annotations
Tools can include annotations that hint at their behavior — crucial for safety:
```json
{
  "name": "reorder_stock",
  "description": "Place a reorder for a product",
  "inputSchema": { ... },
  "annotations": {
    "readOnlyHint": false,
    "destructiveHint": false,
    "idempotentHint": true,
    "openWorldHint": true
  }
}
```

| Annotation | Purpose |
|---|---|
| `readOnlyHint` | Tool only reads data, no side effects |
| `destructiveHint` | Tool may delete or irreversibly modify data |
| `idempotentHint` | Calling multiple times with same args has same effect |
| `openWorldHint` | Tool interacts with external entities beyond the server |
Clients use these to decide approval behavior — a `readOnlyHint: true` tool might auto-execute, while a `destructiveHint: true` tool requires explicit user confirmation.
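On the client side, that approval decision can be a small lookup over the hints. A minimal sketch — the `needs_confirmation` helper is my own, not part of any SDK; since the hints are advisory, missing values default to the cautious path:

```python
def needs_confirmation(tool: dict) -> bool:
    """Decide whether a tool call should require explicit user approval,
    based on MCP tool annotations. Unknown/missing hints are treated
    conservatively: assume destructive, assume not read-only."""
    ann = tool.get("annotations", {})
    if ann.get("destructiveHint", True):
        return True
    return not ann.get("readOnlyHint", False)

# A read-only tool can auto-execute; anything else gets a prompt.
needs_confirmation({"annotations": {"readOnlyHint": True, "destructiveHint": False}})  # False
needs_confirmation({})  # True
```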
Structured Output
Tools can return typed, validated output via `outputSchema` + `structuredContent`:

```json
{
  "name": "get_weather_data",
  "inputSchema": { ... },
  "outputSchema": {
    "type": "object",
    "properties": {
      "temperature": { "type": "number" },
      "conditions": { "type": "string" },
      "humidity": { "type": "number" }
    },
    "required": ["temperature", "conditions", "humidity"]
  }
}
```

The response includes both human-readable `content` and machine-parseable `structuredContent`:
```json
{
  "content": [{ "type": "text", "text": "{\"temperature\": 22.5, ...}" }],
  "structuredContent": { "temperature": 22.5, "conditions": "Partly cloudy", "humidity": 65 }
}
```

This is essential for production agents — structured output lets downstream systems consume tool results without parsing text.
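Consuming such a result downstream then becomes a dictionary access instead of text parsing. A small illustrative helper (not part of any SDK) that prefers `structuredContent` and falls back to the text block:

```python
import json

def read_temperature(result: dict) -> float:
    """Pull the temperature from an MCP tool result, preferring the
    machine-parseable structuredContent over the human-readable text."""
    if "structuredContent" in result:
        return result["structuredContent"]["temperature"]
    # Fallback: parse the JSON embedded in the first text block.
    return json.loads(result["content"][0]["text"])["temperature"]
```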
Error Handling
MCP distinguishes two error types:
- Protocol errors: JSON-RPC errors for unknown tools, malformed requests — the model can't fix these
- Tool execution errors: Returned with `isError: true` — these contain actionable feedback the model can use to self-correct and retry

```json
{
  "content": [{ "type": "text", "text": "Invalid date: must be in the future. Current date is 2026-04-05." }],
  "isError": true
}
```

Clients should feed execution errors back to the LLM so it can adjust parameters and retry. Protocol errors are less recoverable.
Resources — Application-Controlled Data
Resources expose read-only data — file contents, database schemas, API documentation. Unlike tools, the application (not the model) controls when resources are accessed.
Each resource has a unique URI and MIME type:
```python
@mcp.resource("schema://inventory")
async def get_schema() -> str:
    """Return the inventory database schema."""
    return await db.query(
        "SELECT column_name, data_type FROM information_schema.columns WHERE table_name = 'inventory'"
    )
```

Resources support two discovery patterns:

- Direct resources: Fixed URIs (`calendar://events/2024`)
- Resource templates: Parameterized URIs (`weather://forecast/{city}/{date}`) with auto-completion
For Pellemoda's inventory agent, the MCP server exposes the database schema as a resource. The agent reads the schema before writing queries — dramatically reducing SQL errors.
Prompts — User-Controlled Templates
Prompts are reusable templates that structure LLM interactions. They're user-controlled — explicitly invoked, not auto-triggered.
```python
@mcp.prompt()
def gdpr_analysis(document_type: str) -> str:
    """Structured prompt for GDPR compliance analysis."""
    return f"""Analyze this {document_type} for GDPR compliance. Check for:
1. Personal data processing without legal basis
2. Missing data subject rights provisions
3. Inadequate data retention policies
4. Cross-border transfer violations
Return findings as structured JSON with severity levels."""
```

For compliance monitoring (Holding Morelli), the MCP server ships domain-specific prompts — ensuring consistent, thorough analysis regardless of which LLM is used.
Elicitation: Interactive Workflows
Elicitation is one of MCP's most powerful features: it lets servers request input from users during tool execution — not just at the start.
Form Mode — Structured Data Collection
When a tool needs additional info mid-execution, it can pause and ask:
```python
from fastmcp import FastMCP, Context
from pydantic import BaseModel

server = FastMCP("Orders")

class OrderConfirmation(BaseModel):
    confirm: bool
    shipping_method: str

@server.tool()
async def place_order(product_id: str, quantity: int, ctx: Context) -> str:
    """Place an order, requesting confirmation via elicitation."""
    total = await calculate_total(product_id, quantity)
    result = await ctx.elicit(
        message=f"Confirm order: {quantity}x {product_id} for ${total}?",
        schema=OrderConfirmation,
    )
    if result.action == "accept" and result.data:
        order = await create_order(product_id, quantity, result.data.shipping_method)
        return f"Order {order.id} placed with {result.data.shipping_method} shipping."
    if result.action == "decline":
        return "Order cancelled by user."
    return "Order dismissed."
```

The client renders a form based on the schema, the user fills it in, and the tool resumes with the response.
URL Mode — Sensitive Operations
For passwords, API keys, OAuth flows, and payments, form mode isn't secure enough (data passes through the client). URL mode redirects the user to a secure external page:
```json
{
  "mode": "url",
  "elicitationId": "550e8400-e29b-41d4-a716-446655440000",
  "url": "https://mcp.example.com/ui/connect-hubspot",
  "message": "Please authorize HubSpot access to continue."
}
```

The user completes the flow in their browser. Sensitive data never touches the MCP client or LLM context. The server sends a `notifications/elicitation/complete` notification when done.
This is how RevAgent handles CRM authorization — the MCP server acts as an OAuth client to HubSpot, with credentials stored server-side, never passing through the agent.
Transport: stdio vs Streamable HTTP
stdio (Local)
Client launches server as a subprocess via stdin/stdout. Best for local tools.
```json
{
  "inventory": {
    "transport": "stdio",
    "command": "python",
    "args": ["/path/to/inventory_server.py"]
  }
}
```

Streamable HTTP (Remote)
HTTP POST for messages, optional SSE for streaming. Supports auth headers, OAuth, and multiple concurrent clients.
```json
{
  "crm": {
    "transport": "http",
    "url": "https://crm-mcp.revagent.it/mcp",
    "headers": { "Authorization": "Bearer sk-..." }
  }
}
```

Connecting MCP to LangChain Agents
The `langchain-mcp-adapters` library converts MCP tools into LangChain tools:

```python
from langchain_mcp_adapters.client import MultiServerMCPClient
from langchain.agents import create_agent

client = MultiServerMCPClient({
    "inventory": {
        "transport": "stdio",
        "command": "python",
        "args": ["/path/to/inventory_server.py"],
    },
    "crm": {
        "transport": "http",
        "url": "https://crm-mcp.revagent.it/mcp",
        "headers": {"Authorization": "Bearer sk-..."},
    },
})

tools = await client.get_tools()  # MCP tools → LangChain tools
agent = create_agent("claude-sonnet-4-6", tools)
result = await agent.ainvoke({
    "messages": [{"role": "user", "content": "What's the stock level for SKU-1234?"}]
})
```

Interceptors for Production
MCP servers run as separate processes — they can't access your agent's runtime context. Interceptors bridge this gap:
```python
import asyncio

from langchain_core.messages import ToolMessage
from langchain_mcp_adapters.interceptors import MCPToolCallRequest

async def inject_user_context(request: MCPToolCallRequest, handler):
    """Inject user credentials into every MCP tool call."""
    user_id = request.runtime.context.user_id
    modified = request.override(args={**request.args, "user_id": user_id})
    return await handler(modified)

async def require_auth(request: MCPToolCallRequest, handler):
    """Block sensitive tools if user isn't authenticated."""
    state = request.runtime.state
    if request.name in ["delete_record", "export_data"] and not state.get("authenticated"):
        return ToolMessage(content="Authentication required.", tool_call_id=request.runtime.tool_call_id)
    return await handler(request)

async def retry_with_backoff(request: MCPToolCallRequest, handler):
    """Retry failed MCP tool calls with exponential backoff."""
    for attempt in range(3):
        try:
            return await handler(request)
        except Exception:
            if attempt == 2:
                raise
            await asyncio.sleep(2 ** attempt)

client = MultiServerMCPClient(
    {...},
    tool_interceptors=[inject_user_context, require_auth, retry_with_backoff],
)
```

Interceptors compose in "onion" order — first in list = outermost wrapper. In production I stack: `[logging, auth, rate_limit, retry]`.
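The onion ordering is easy to verify with plain functions: the first interceptor in the list sees the request first and the response last. A toy, framework-free sketch of that composition (`compose` and the interceptor names are illustrative, not library APIs):

```python
import asyncio
from functools import partial

def compose(interceptors, final_handler):
    """Wrap final_handler so the first interceptor in the list is outermost."""
    handler = final_handler
    for interceptor in reversed(interceptors):
        handler = partial(interceptor, handler=handler)
    return handler

calls = []

async def outer(request, handler):
    calls.append("outer:before")
    result = await handler(request)
    calls.append("outer:after")
    return result

async def inner(request, handler):
    calls.append("inner:before")
    result = await handler(request)
    calls.append("inner:after")
    return result

async def call_tool(request):
    calls.append("tool")
    return "ok"

asyncio.run(compose([outer, inner], call_tool)({"name": "check_stock"}))
print(calls)  # ['outer:before', 'inner:before', 'tool', 'inner:after', 'outer:after']
```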
Security: What Can Go Wrong
The MCP security spec identifies several attack vectors you need to handle in production:
Token Passthrough (Don't Do This)
Never accept tokens from the MCP client and pass them directly to downstream APIs. This is explicitly forbidden — it breaks audit trails, bypasses security controls, and enables privilege chaining if a token is stolen.
Correct: MCP server manages its own tokens for downstream services. Use URL mode elicitation for OAuth flows with third-party APIs.
Confused Deputy
If your MCP server proxies to third-party APIs with a static client ID, attackers can exploit consent cookies to obtain authorization codes without user consent. Mitigation: implement per-client consent before forwarding to any third-party auth flow.
SSRF (Server-Side Request Forgery)
During OAuth metadata discovery, a malicious MCP server can provide URLs pointing to internal resources (http://169.254.169.254/ for cloud metadata). MCP clients should:
- Enforce HTTPS for all OAuth URLs
- Block private IP ranges
- Validate redirect targets
- Use egress proxies for server-side deployments
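The first two checks take only a few lines of standard library. A sketch (hostname resolution is skipped here; production code should also validate the IPs a hostname resolves to):

```python
import ipaddress
from urllib.parse import urlparse

def is_safe_oauth_url(url: str) -> bool:
    """Reject non-HTTPS URLs and URLs whose host is a literal
    private, loopback, link-local, or reserved IP address."""
    parsed = urlparse(url)
    if parsed.scheme != "https":
        return False
    try:
        ip = ipaddress.ip_address(parsed.hostname or "")
    except ValueError:
        # Hostname, not a literal IP; resolve and re-check in production.
        return True
    return not (ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved)

is_safe_oauth_url("http://169.254.169.254/")  # False: plain HTTP, link-local IP
```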
Session Hijacking
If session IDs are guessable, attackers can inject malicious payloads or impersonate users. Mitigation: use cryptographically random session IDs, bind them to user identity (not just session), and verify auth on every request.
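Both mitigations fit in a few standard-library lines. A sketch — the HMAC binding scheme here is illustrative, not mandated by the spec:

```python
import hmac
import secrets

def new_session_id() -> str:
    """256 bits from the OS CSPRNG — infeasible to guess."""
    return secrets.token_urlsafe(32)

def bind_session(session_id: str, user_id: str, server_key: bytes) -> str:
    """Tie the session to a user so a stolen ID can't be replayed as someone else."""
    mac = hmac.new(server_key, f"{session_id}:{user_id}".encode(), "sha256").hexdigest()
    return f"{session_id}.{mac}"

def verify_session(token: str, user_id: str, server_key: bytes) -> bool:
    """Check the token on every request, in constant time."""
    session_id, _, mac = token.rpartition(".")
    expected = hmac.new(server_key, f"{session_id}:{user_id}".encode(), "sha256").hexdigest()
    return hmac.compare_digest(mac, expected)
```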
Local Server Compromise
Local MCP servers execute with your privileges. A malicious startup command can exfiltrate SSH keys or delete files. Clients must show the exact command before execution and require explicit consent.
Real-World MCP Patterns
Pattern 1: Multi-Server Agent
RevAgent connects to 3 MCP servers simultaneously — CRM (HubSpot), Email (Gmail), and Analytics (custom). The LangChain agent reasons across all three without custom routing.
Pattern 2: Resources as Agent Context
For Pellemoda, the inventory MCP server exposes the database schema as a resource. The agent reads the schema before writing queries — a "schema → query → forecast" pipeline that's fully tool-driven.
Pattern 3: Elicitation for Human-in-the-Loop
RevAgent's CRM write operations use form mode elicitation for confirmation ("Update close date for Deal X from March to April?") and URL mode elicitation for initial HubSpot OAuth authorization.
Pattern 4: Dynamic Tool Discovery
BandiFinder's MCP server adds/removes tools based on which procurement portals are currently accessible. When a portal goes down, the server removes its tool and sends notifications/tools/list_changed — the agent adapts without errors.
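Stripped of framework details, the pattern is a registry that diffs tool availability and emits the list-changed notification on every change. A framework-free sketch (the portal tool names and `notify` callback are illustrative):

```python
class ToolRegistry:
    """Tracks which tools are exposed; notifies clients when the set changes."""

    def __init__(self, notify):
        self.tools: set[str] = set()
        self.notify = notify  # callback that sends the notification to clients

    def sync(self, available: set[str]) -> None:
        """Reconcile exposed tools with what's currently reachable."""
        if available != self.tools:
            self.tools = set(available)
            self.notify("notifications/tools/list_changed")

sent = []
registry = ToolRegistry(sent.append)
registry.sync({"search_portal_a", "search_portal_b"})  # both portals up
registry.sync({"search_portal_a"})                     # portal B goes down
print(sent)  # two list_changed notifications
```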
Production Checklist
- Auth on every remote server: Bearer tokens or OAuth. Never expose unauthenticated.
- Tool annotations: Mark `readOnlyHint`, `destructiveHint` — clients use these for approval UX.
- Interceptors for rate limiting: One hallucinated loop can fire hundreds of tool calls.
- Structured output (`outputSchema`): Don't rely on text parsing for downstream consumption.
- Elicitation for sensitive ops: URL mode for OAuth/payments, form mode for confirmations.
- Error types: Return `isError: true` for recoverable errors so the LLM can self-correct.
- Monitor with LangSmith: Trace every MCP tool call — what was called, with what args, what it returned.
- Scope minimization: Request minimal permissions upfront, escalate via `WWW-Authenticate` challenges.
- HTTPS in production: Reject `http://` URLs except for localhost in development.
Related Posts
- Building AI Agents with LangGraph: From Prototype to Production — the orchestration framework that powers the agents consuming MCP tools
- Building RAG Pipelines That Actually Work: GraphRAG vs Vector RAG — retrieval pipelines that feed context into agentic workflows
Building AI agents that need to connect to real-world systems? MCP is the standard I use across every agent I ship. Let's talk.