A hands-on journey through MCP and AI tool calling — from zero to building your own integrations.
Why LLMs need tools — the isolation problem and the breakthrough of tool calling
The universal 5-step pattern, JSON Schema definitions, and the agentic loop
Tool calling across OpenAI, Anthropic Claude, and Google Gemini — key differences
The protocol that changes everything — origin, governance, and N+M architecture
Resources, Tools, Prompts, Sampling, Roots, and Elicitation — the six building blocks
Stdio vs. Streamable HTTP, OAuth 2.1 with PKCE, and the Nov 2025 spec updates
500+ servers, enterprise adoption, and how MCP compares to function calling and REST
From learner to builder — write a Python MCP server and connect it to Claude Desktop
Before we can understand tool calling and MCP, we need to understand the fundamental limitation that made them necessary — and why solving it changed everything.
Large Language Models are trained on enormous datasets of text — books, code, articles, conversations. Through this training they develop remarkable capabilities: reasoning, writing, analysis, translation, and more. But there's a catch that defines everything: training ends.
When training concludes, a model's knowledge is frozen. It knows about the world as it existed up to its knowledge cutoff date. Ask it about last week's stock prices? It can't help. Ask it to check if a website is currently down? It has no way to know. Ask it to send an email on your behalf? It cannot reach outside its own context window.
This is the isolation problem. A base LLM is like a brilliant analyst locked in a room with only historical documents. They can reason with extraordinary sophistication about everything they've read — but they can't pick up a phone, check today's news, or modify a spreadsheet in real time.
Early workarounds — prompt engineering, retrieval-augmented generation — helped at the margins, but they were clever patches over a deeper architectural limitation.
Think of a base LLM like a skilled analyst who can reason brilliantly but can't pick up a phone. They have deep knowledge from their training, but they're isolated from live systems. Tool calling gives them the phone — and eventually, an entire office of connected services.
Developers tried various approaches to work around the isolation problem. Prompt engineering involved carefully crafting instructions to coax better outputs from static knowledge. Users would paste in current data, ask for analysis, and work around the boundaries manually.
Retrieval-Augmented Generation (RAG) was a bigger step forward. It built a retrieval pipeline that could search a vector database or document store and inject relevant chunks into the model's context before asking a question. This solved the "fresh knowledge" problem partially — but only for reading, never for writing or acting. A RAG system could tell you current documentation but couldn't file a ticket, update a database, or query a live API.
These solutions had a fundamental ceiling: they were still just feeding more text to a model that could only output text.
The insight that changed everything was subtle but profound. Instead of trying to give models all the information they might need, what if models could request the actions themselves?
Rather than knowing current weather, a model could say: "I need to call the weather API with location 'Seoul' to answer this." Rather than guessing at a database value, it could say: "I need to query the customers table for ID 42." The model doesn't execute these actions — it can't. But it can articulate them precisely enough that your application code can do the executing.
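Concretely, that articulation is structured data rather than prose. A hedged sketch of the handoff (field names vary by provider, and `get_weather` here is a stand-in for a real API client):

```python
import json

# Illustrative shape of a model-emitted tool call (exact field names vary by provider).
tool_call = {
    "name": "get_weather",
    "arguments": {"location": "Seoul"},
}

# Your application code — not the model — performs the actual execution:
def get_weather(location):
    # Stand-in for a real weather API call.
    return {"location": location, "temp_c": 21}

result = get_weather(**tool_call["arguments"])
print(json.dumps(result))  # {"location": "Seoul", "temp_c": 21}
```

The model emits the structured request; the application owns the side effects.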
This is tool calling. And it parallels a deep principle in software engineering: do one thing well, then compose. Just as Unix philosophy builds powerful systems from small, focused utilities chained together, tool calling builds capable AI systems from focused functions that the model can compose on demand.
The Unix philosophy says: do one thing well, compose with pipes. Tool calling applies this to AI: each tool does one thing (get weather, send email, query database), and the model composes them into complex behaviors. The result is far more powerful and maintainable than trying to encode all knowledge directly.
**Old world:** the model can only reason over its frozen training data. **New world:** the model composes live tools on demand, requesting actions that your code executes.
The universal pattern behind tool calling is consistent across providers. Master these five steps and you understand the fundamentals of every AI tool integration.
Every tool calling implementation — regardless of provider — follows the same logical sequence. Understanding this flow deeply is the foundation of everything that follows.
A tool definition is a contract between you and the model. It tells the model: here's a function, here's what it does, here's what arguments it takes, and here's what's required. The quality of your descriptions directly impacts the model's ability to use tools correctly.
Every tool definition has three critical parts: the name (how the model refers to the tool), the description (the most important field — it tells the model when and why to use this tool), and the parameters (a JSON Schema object specifying inputs, their types, and which are required).
```python
# Tool Definition (JSON Schema)
tools = [{
    "type": "function",
    "name": "get_weather",
    "description": "Get current weather for a location",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "City name, e.g. 'Seoul'",
            },
        },
        "required": ["location"],
    },
}]
```
Models don't always call tools — you can configure when they're allowed to. The `tool_choice` parameter controls this behavior: `auto` (the default — the model decides), `required` (the model must call at least one tool), `none` (tool calling is disabled), or a forced call to one specific named function.
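A hedged sketch of how these values appear in an OpenAI-style request body (`make_request` is a hypothetical helper, not part of any SDK):

```python
# Build request payloads illustrating the different tool_choice settings.
def make_request(user_text, tools, tool_choice="auto"):
    return {
        "model": "gpt-4.1",
        "input": [{"role": "user", "content": user_text}],
        "tools": tools,
        "tool_choice": tool_choice,
    }

weather_tool = {
    "type": "function",
    "name": "get_weather",
    "description": "Get current weather for a location",
    "parameters": {
        "type": "object",
        "properties": {"location": {"type": "string"}},
        "required": ["location"],
    },
}

auto = make_request("Weather in Seoul?", [weather_tool])                  # model decides
required = make_request("Weather in Seoul?", [weather_tool], "required")  # must call a tool
disabled = make_request("Summarize this text.", [weather_tool], "none")   # tools disabled
forced = make_request("Weather in Seoul?", [weather_tool],
                      {"type": "function", "name": "get_weather"})        # force one function
```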
When a model calls multiple tools in sequence, you get the agentic loop. While the model's stop reason indicates another tool call is needed, your application keeps executing tools and feeding results back. This loop is how complex multi-step AI agents are built.
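The loop itself is only a few lines of control flow. A minimal runnable sketch with a mocked model (a real loop would send `messages` to a provider API and read its stop reason; the field names here are illustrative):

```python
import json

# Minimal agentic-loop sketch. `call_model` mocks a chat API: it requests one
# tool call, then finishes once a tool result appears in the conversation.
def call_model(messages):
    if not any(m["role"] == "tool" for m in messages):
        return {"stop_reason": "tool_call",
                "tool_call": {"name": "get_weather",
                              "arguments": {"location": "Seoul"}}}
    return {"stop_reason": "end_turn", "content": "It's 21°C in Seoul."}

TOOLS = {"get_weather": lambda location: {"location": location, "temp_c": 21}}

def run_agent(user_text):
    messages = [{"role": "user", "content": user_text}]
    while True:
        response = call_model(messages)
        if response["stop_reason"] != "tool_call":
            return response["content"]  # loop ends when no more tools are needed
        call = response["tool_call"]
        result = TOOLS[call["name"]](**call["arguments"])  # your code executes the tool
        messages.append({"role": "tool", "name": call["name"],
                         "content": json.dumps(result)})   # feed the result back

print(run_agent("Weather in Seoul?"))  # It's 21°C in Seoul.
```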
OpenAI, Anthropic, and Google all implement tool calling — but with meaningful differences in architecture, naming conventions, and unique capabilities. Here's what you need to know about each.
OpenAI's function calling (now called tool calling) is the most mature and widely referenced implementation. It introduced the pattern that others would follow and refine.
Key features: parallel function calls (the model can request multiple tools simultaneously), strict mode (strict: true enforces exact schema conformance), and flexible tool_choice options (auto, required, none, or a forced specific function). The latest models including GPT-5.4 also support tool_search for dynamic tool discovery at runtime.
Claude's tool implementation is architecturally richer because it distinguishes between who executes the tool. Claude has three distinct categories:
- **Client tools (custom)** — tools you define yourself with an `input_schema` and execute in your own code, following the universal pattern.
- **Client tools (Anthropic-defined)** — `bash` and `text_editor` have schemas that are trained into the model itself for higher reliability. You still execute them locally — but you use the pre-defined schemas.
- **Server tools** — `web_search`, `code_execution`, `web_fetch`, and `tool_search` run on Anthropic's infrastructure. You don't execute them; Anthropic does. They can even run their own internal loops (e.g., multiple web searches before returning a result).

Claude signals tool calls via `stop_reason: "tool_use"` and completion via `stop_reason: "end_turn"`. The `strict: true` option is also supported for schema conformance.
Gemini uses function_declarations inside a tools array — the flow mirrors the universal pattern with provider-specific naming. Multi-tool use allows combining built-in capabilities (like Google Search grounding) with custom function calling in the same request. Gemini 3 Flash Preview is among the latest available models.
| Feature | OpenAI | Claude | Gemini |
|---|---|---|---|
| Tool Format | functions in `tools[]` | `tools[]` with `input_schema` | `functionDeclarations` |
| Parallel Calls | Yes | Yes | Yes |
| Strict Schema | `strict: true` | `strict: true` | — |
| Server-side Tools | — | web_search, code_execution, web_fetch, tool_search | Google Search grounding |
| Dynamic Discovery | `tool_search` (GPT-5.4+) | `tool_search` (server) | — |
| Stop Signal | `finish_reason: "tool_calls"` | `stop_reason: "tool_use"` | `finishReason: "STOP"` + `functionCall` |
```python
# OpenAI tool calling
from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-4.1",
    tools=[{
        "type": "function",
        "name": "get_weather",
        "description": "Get current weather",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    }],
    input=[{"role": "user", "content": "Weather in Seoul?"}],
)
# Check response.output for tool_call items
```
```python
# Claude tool calling (client tool)
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=1024,
    tools=[{
        "name": "get_weather",
        "description": "Get current weather",
        "input_schema": {
            "type": "object",
            "properties": {
                "location": {"type": "string"},
            },
            "required": ["location"],
        },
    }],
    messages=[{"role": "user", "content": "Weather in Seoul?"}],
)
# response.stop_reason == "tool_use" signals a tool call
# response.stop_reason == "end_turn" signals completion
```
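For comparison, a Gemini-style declaration uses the same JSON Schema shape under different field names. This sketch only builds the request structure; the client call is shown in comments and hedged, since SDK details change between versions:

```python
# Gemini-style tool declaration: same JSON Schema idea, different field names.
get_weather_declaration = {
    "name": "get_weather",
    "description": "Get current weather for a location",
    "parameters": {
        "type": "object",
        "properties": {"location": {"type": "string"}},
        "required": ["location"],
    },
}

# Declarations are wrapped in a tools array under "function_declarations".
gemini_tools = [{"function_declarations": [get_weather_declaration]}]

# Illustrative client usage (exact SDK calls may differ between versions):
# import google.generativeai as genai
# model = genai.GenerativeModel("gemini-1.5-flash", tools=gemini_tools)
# response = model.generate_content("Weather in Seoul?")
# # Look for a function_call part in the response candidates.
```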
Tool calling solved the isolation problem for a single AI app. But what about building a world of interconnected AI applications and tools? That requires a protocol — the Model Context Protocol.
Imagine you're building an AI-powered development environment. You want it to access GitHub, query Postgres, search documentation, run terminal commands, and check Slack. That's 5 tools. Now imagine 20 different AI applications all needing those same 5 tools — plus each needing 5 other unique ones. You're looking at potentially hundreds of custom integrations, each built differently, each maintained separately, each breaking in its own unique way.
This is the N×M problem: N AI applications times M tools equals N×M custom integrations. At any real scale, this becomes unmanageable. Teams spend more time writing glue code than building actual features.
- **Without MCP:** 5 AI apps × 10 tools = **50** custom integrations
- **With MCP:** 5 AI apps + 10 tools = **15** integrations (N+M)
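The arithmetic behind those numbers is simple but compounds quickly:

```python
# Integration counts with and without a shared protocol.
def integrations_without_mcp(n_apps, m_tools):
    return n_apps * m_tools   # every app wires up every tool itself

def integrations_with_mcp(n_apps, m_tools):
    return n_apps + m_tools   # each app builds one client; each tool, one server

print(integrations_without_mcp(5, 10), integrations_with_mcp(5, 10))    # 50 15
print(integrations_without_mcp(20, 25), integrations_with_mcp(20, 25))  # 500 45
```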
MCP was created by Anthropic and open-sourced in November 2024. From the beginning it was designed as an open standard — not a proprietary Anthropic technology. The spec was developed to be provider-agnostic and community-driven.
In December 2025, at its one-year anniversary, MCP was donated to the Linux Foundation under the Agentic AI Foundation (AAIF), co-founded by Anthropic, Block, and OpenAI. This move cemented MCP's position as a true industry standard, not controlled by any single vendor.
By early 2026, over 500 public MCP servers exist in the wild, with support from all three major AI providers: Anthropic, OpenAI, and Google DeepMind. Every major AI IDE has embedded MCP client support.
MCP is to AI tools what USB is to peripherals — a universal connector. Before USB, every peripheral needed its own port, driver, and protocol. USB created a standard that worked everywhere. MCP does the same for AI integrations: one protocol, every tool, every application.
MCP defines three roles in its architecture. Understanding these roles is essential to understanding how MCP works:
All communication uses JSON-RPC 2.0 as the wire format, which is simple, well-understood, and language-agnostic.
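For example, a `tools/call` request and its response are plain JSON-RPC 2.0 messages correlated by `id` (the method and parameter names follow the MCP spec; the weather result is illustrative):

```python
import json

# What a tools/call exchange looks like on the wire.
request = {
    "jsonrpc": "2.0",
    "id": 7,
    "method": "tools/call",
    "params": {"name": "get_weather", "arguments": {"location": "Seoul"}},
}
wire_bytes = json.dumps(request)

# A success response reuses the same id:
response = {
    "jsonrpc": "2.0",
    "id": 7,
    "result": {"content": [{"type": "text", "text": "21°C, clear"}]},
}
assert json.loads(wire_bytes)["id"] == response["id"]  # responses correlate by id
```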
MCP was directly inspired by the Language Server Protocol (LSP) — the standard that transformed how IDEs provide language intelligence. What LSP did for programming languages and text editors, MCP does for AI applications and tools.
MCP defines six core primitives — three offered by servers and three offered by clients. Understanding each one and when to use it is the key to designing effective MCP integrations.
Resources are READ. Tools are DO. Prompts are GUIDE. And on the client side: Sampling lets servers think, Roots tell servers where to operate, and Elicitation lets servers ask for input.
These three primitives are offered by MCP servers and consumed by hosts/clients.
Resources expose data to the AI model, similar to how an HTTP GET endpoint works. They're identified by URIs and can be either static (a fixed file) or dynamic (a database query that runs each time). Resources can be consumed by either the user or the AI model, depending on the application design.
Think of resources as the "read" side of your data: files on disk, database records, API responses cached as documents, configuration data, or documentation. A resources request never has side effects — it's always safe to call.
Examples: file:///project/README.md, postgres://db/customers/42, github://repo/anthropics/mcp/issues
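Reading a resource is a URI-keyed JSON-RPC call. A sketch of a `resources/read` exchange (the method name and `uri`/`contents` fields follow the MCP spec; the file contents here are illustrative):

```python
# Shape of a resources/read request and its response.
read_request = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "resources/read",
    "params": {"uri": "file:///project/README.md"},
}

read_response = {
    "jsonrpc": "2.0",
    "id": 2,
    "result": {
        "contents": [{
            "uri": "file:///project/README.md",
            "mimeType": "text/markdown",
            "text": "# My Project\n",
        }],
    },
}

assert read_response["id"] == read_request["id"]  # no side effects, safe to repeat
```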
Tools are functions the AI model can call to take action in the world. Unlike resources, tools can modify state — they can send emails, create files, commit code, trigger deployments, or query live APIs that consume rate limit budget. The model decides when to call tools (model-controlled).
Tools are the action layer. Every tool has a name, a description (critical for model selection), and an input schema. The server executes the tool and returns a result.
Examples: create_github_issue, run_sql_query, send_slack_message, deploy_to_kubernetes
Prompts are reusable workflow templates that users can invoke. Unlike tools (which the model triggers autonomously), prompts are user-controlled — a user selects a prompt from a menu. They accept dynamic arguments, can include resource context (pulling in relevant data), and can chain multiple actions into a workflow.
Think of prompts as "smart shortcuts" that encode expert workflows. A debug_error prompt might automatically gather relevant logs, context, and file snippets before presenting them to the model for analysis.
| Dimension | Resources | Tools |
|---|---|---|
| Access Pattern | Read-only (like HTTP GET) | Read/Write (side effects allowed) |
| Control | User or model controlled | Model controlled |
| Identification | URI-based | Function name + schema |
| Safety | Always safe to call | May have irreversible effects |
| Use Case | Files, records, documents, configs | Send email, deploy code, modify DB |
These three primitives are offered by MCP clients and allow servers to reach back into the host application's capabilities.
Sampling is one of MCP's most powerful and unique features. It allows an MCP server to request that the host perform an LLM inference on its behalf. This enables recursive, agentic behaviors entirely on the server side.
Imagine a code analysis server that, upon receiving a tool call, wants to reason about several possible approaches before choosing one. With sampling, it can ask the host's LLM: "Given this code, what refactoring approach is most appropriate?" The host runs the inference and returns the result to the server — which then continues executing. Users must approve sampling requests for security.
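On the wire, that server-to-host request is itself JSON-RPC. A hedged sketch of a `sampling/createMessage` request (field names follow the spec's general shape; consult the spec for exact details):

```python
# Server-initiated sampling: the server asks the host to run an LLM inference.
sampling_request = {
    "jsonrpc": "2.0",
    "id": 11,
    "method": "sampling/createMessage",
    "params": {
        "messages": [{
            "role": "user",
            "content": {
                "type": "text",
                "text": "Given this code, which refactoring approach fits best?",
            },
        }],
        "maxTokens": 256,
    },
}
# The host runs the inference (after user approval) and returns the completion
# to the server in the JSON-RPC response.
```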
Roots allow servers to ask clients: "What directories or URIs should I be operating within?" The client reports the root directories it has access to or that are relevant to the current session. This lets servers properly scope their file system operations and URI access, preventing them from wandering outside their intended domain.
Elicitation allows a server to request additional information from the user mid-workflow. If a server reaches a decision point that requires human input — a confirmation, a clarification, a preference — it can ask through the client without aborting the entire operation. This enables interactive, human-in-the-loop workflows within MCP.
How do hosts and servers actually communicate? What security guarantees does MCP provide? And what changed in the massive November 2025 spec update that made MCP enterprise-ready?
MCP defines two official transport mechanisms. Both use JSON-RPC 2.0 as the wire format — what changes is how bytes flow between host and server.
The simplest transport: the host launches the server as a subprocess and communicates via stdin/stdout. The host writes JSON-RPC messages to the server's stdin; the server writes responses to its stdout.
Stdio is ideal for: local development, CLI tools, shell scripts, and any tool that runs as a local process. It requires no network, no ports, no authentication setup — the process isolation provides the security boundary. Most MCP servers for IDEs use Stdio.
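The framing is simple enough to demonstrate end to end: one JSON-RPC message per line over stdin/stdout. The tiny inline "server" below just echoes a canned result — a real MCP server implements the full protocol — but the plumbing is the same:

```python
# Illustrative stdio framing: the host writes newline-delimited JSON-RPC to the
# server's stdin and reads responses from its stdout.
import json
import subprocess
import sys

SERVER_CODE = """
import json, sys
for line in sys.stdin:
    req = json.loads(line)
    resp = {"jsonrpc": "2.0", "id": req["id"], "result": {"echo": req["method"]}}
    print(json.dumps(resp), flush=True)
"""

proc = subprocess.Popen(
    [sys.executable, "-c", SERVER_CODE],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
)
request = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}
proc.stdin.write(json.dumps(request) + "\n")
proc.stdin.flush()
response = json.loads(proc.stdout.readline())
print(response["result"])  # {'echo': 'tools/list'}
proc.terminate()
```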
For remote servers, MCP uses HTTP POST for client-to-server messages, with optional Server-Sent Events (SSE) for streaming responses back to the client. This enables real-time streaming of long operations while keeping the standard HTTP infrastructure.
Streamable HTTP is ideal for: remote services, cloud deployments, shared team servers, and enterprise infrastructure. It requires proper authentication (see below) but works through standard firewalls and proxies.
MCP's security model is built on four core principles that govern how servers, clients, and hosts interact:
For remote connections, MCP uses OAuth 2.1 with PKCE (Proof Key for Code Exchange), added in June 2025. MCP servers are classified as OAuth Resource Servers and implement Resource Indicators (RFC 8707) to prevent token theft and ensure tokens are only usable with their intended server.
The November 2025 specification update — released on MCP's one-year anniversary — transforms MCP from a developer tool into an enterprise-ready platform. It adds async execution, machine-to-machine auth, enterprise SSO integration, and more. This is the update that unlocked production deployments at scale.
Six major extensions landed in the one-year anniversary update, each addressing a real-world production need:
Previously, all MCP requests were synchronous — the client had to wait for a response. The Async Tasks extension allows any request to immediately return a task handle with states: working, input_required, completed, failed, and cancelled. The client can poll or be notified when the task completes. "Call now, fetch later" — essential for long-running operations like building a codebase or running a test suite.
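The client-side pattern is a poll (or notification wait) until a terminal state. A hypothetical sketch — `get_task` here is a fake in-memory task store, not the actual SDK API:

```python
import itertools

# Fake task store: reports "working" twice, then "completed" forever.
_states = itertools.chain(["working", "working"], itertools.repeat("completed"))

def get_task(task_id):
    return {"taskId": task_id, "status": next(_states)}

def wait_for_task(task_id):
    # A real client would poll with backoff or subscribe to notifications.
    while True:
        task = get_task(task_id)
        if task["status"] in ("completed", "failed", "cancelled"):
            return task

final = wait_for_task("task-123")
print(final["status"])  # completed
```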
OAuth traditionally requires clients to register with each resource server before use — a major friction point for MCP at scale. CIMD replaces this with URL-based client identity. A client publishes a metadata document at a well-known URL. Servers can fetch this document to learn about the client without requiring manual per-server registration. This makes deploying new MCP clients dramatically simpler.
A formal system for adding optional capabilities to the protocol. The framework defines: a lightweight registry/namespace for extensions, a capability negotiation mechanism (clients and servers agree on supported extensions), and extension settings. This is how future MCP features will be added without breaking backward compatibility.
Two new authorization schemes address enterprise use cases. The first is the client_credentials grant for machine-to-machine auth: designed for cron jobs, headless automation, CI/CD pipelines, and any agent running without a human user present — no interactive login required.

For sensitive flows — collecting API credentials, third-party OAuth, payment processing — directing users through an MCP client is inappropriate. URL-mode Elicitation allows servers to redirect users to a browser URL for the sensitive step. The browser handles it securely (with HTTPS, proper redirect flows, etc.) and the result is returned to the MCP flow. Credentials never pass through the MCP client.
Extends the Sampling primitive so that server-initiated LLM requests can include tool definitions. This means servers can run their own complete agent loops — sampling to reason, calling tools to act, sampling again to analyze results — without needing to surface each step to the host. Enables sophisticated server-side autonomous behaviors.
MCP has grown from a protocol spec into a thriving ecosystem. Understanding the landscape — what exists, who's building it, and when to use it — is essential for any production implementation.
As of early 2026, over 500 public MCP servers exist across the ecosystem. The GitHub repository best-of-mcp-servers tracks over 370 ranked servers with a combined 380,000+ GitHub stars.
Enterprise adoption has moved beyond early adopters. Microsoft, AWS, and HashiCorp are actively developing and maintaining MCP servers for their platforms. The pattern has become standard in every major AI IDE: Cursor, VS Code (with GitHub Copilot), Claude Code, Zed, and Windsurf all embed MCP clients natively — meaning any MCP server you build works automatically in all of them.
A common source of confusion is how MCP relates to function calling and REST APIs. The key insight: they operate at different layers and are designed to be complementary, not competitive.
| Aspect | REST APIs | Function Calling | MCP |
|---|---|---|---|
| Layer | Transport (HTTP) | Model capability | Application protocol (JSON-RPC) |
| Discovery | Manual (read docs) | Static (per-request schemas) | Dynamic (runtime) |
| State | Stateless | Stateless | Stateful (sessions) |
| Lock-in | None | Provider-specific | None (open standard) |
| Scale | N custom integrations | Context tax (all tools every request) | On-demand, server-side |
| Auth | Per-API | Your code handles | Credential isolation at server |
MCP servers often call REST APIs internally. Function calling can invoke MCP tools. Most production systems use all three: REST for direct API access, function calling for immediate tool integration, and MCP for scalable, persistent tool ecosystems. Choosing one doesn't mean abandoning the others.
You've learned the theory. Now it's time to build. In this module, you'll write a complete MCP server in Python, configure it for Claude Desktop, and learn the practices that separate production-quality servers from toy examples.
Anthropic maintains the official Python SDK for MCP at modelcontextprotocol/python-sdk. It handles all the protocol plumbing — JSON-RPC encoding, capability negotiation, transport management — so you can focus on your tools' business logic.
Install it with: pip install mcp
The SDK's decorator-based API makes defining tools feel natural. The @app.tool() decorator automatically reads your function's docstring as the tool description and infers the schema from type annotations.
```python
from mcp.server.fastmcp import FastMCP

app = FastMCP("my-first-server")

@app.tool()
def get_stock_price(symbol: str) -> str:
    """Get the current stock price for a given ticker symbol."""
    # In production, call a real API
    prices = {"AAPL": 198.50, "GOOGL": 178.25, "MSFT": 425.00}
    price = prices.get(symbol.upper())
    if price is not None:
        return f"{symbol.upper()}: ${price:.2f}"
    return f"Unknown symbol: {symbol}"

if __name__ == "__main__":
    app.run()  # defaults to the stdio transport
```
Claude Desktop reads a JSON configuration file to discover MCP servers. On macOS it lives at ~/Library/Application Support/Claude/claude_desktop_config.json. On Windows it's at %APPDATA%\Claude\claude_desktop_config.json.
Add your server to the mcpServers object with a name, the command to run it, and any arguments. After saving and restarting Claude Desktop, your server's tools appear automatically in every conversation.
```json
{
  "mcpServers": {
    "stock-prices": {
      "command": "python",
      "args": ["path/to/server.py"]
    }
  }
}
```
Start simple. Build one server with one tool. Get it working in Claude Desktop. Test it thoroughly. Then expand. The biggest mistake new MCP developers make is building 10 tools before validating that the first one works correctly end-to-end.
The MCP ecosystem is evolving rapidly. Discovery registries will make it easy to find and install public servers. Enterprise governance tools will provide audit trails and access controls for organizational deployments. The expanding ecosystem means that the tools you connect today will work across every MCP-compatible host as the ecosystem grows.
You're not just learning a protocol — you're positioning yourself at the frontier of how AI systems will interact with the world. Every MCP server you build is a piece of that infrastructure.