MCP Deep Dive // Under the Hood


Beyond the Schema

The cheatsheet covers the surface: methods, schemas, primitives, what to call and what comes back. This covers what happens underneath. What the bytes look like on the wire. What the server process is actually doing while your tool runs. Why your cancel worked or didn't. How @mcp.tool() becomes a JSON Schema. How the schema becomes part of the model's prompt. The stuff that trips people up because nobody talks about it.

Scope: stdio transport (most common for local dev), FastMCP (Python) as the reference implementation, 2024-11-05 and 2025-03-26 spec versions. Where the two specs differ, called out explicitly.

How to read this. Each section is independent. If you only have 10 minutes, read Wire, Message Pump, and Cancellation. If you're debugging schema inference, jump to Schema Gen. If you're writing a converter, jump to Text Conversion. The diagrams are the load-bearing part, not the prose.

Wire & Framing

MCP runs JSON-RPC 2.0 over a transport. The transport decides how messages are framed: how the reader knows where one message ends and the next begins. Getting the framing right is the difference between a working server and one that silently corrupts every other message.

stdio: line-delimited JSON

Each JSON-RPC message is a single line. One message per newline. No length prefix, no envelope, no header. The reader reads until \n, parses the accumulated bytes as JSON, dispatches.

stdio wire (bytes on the pipe)
// client → server
{"jsonrpc":"2.0","id":1,"method":"initialize","params":{...}}\n
{"jsonrpc":"2.0","id":2,"method":"tools/list"}\n
{"jsonrpc":"2.0","id":3,"method":"tools/call","params":{...}}\n

// server → client
{"jsonrpc":"2.0","id":1,"result":{...}}\n
{"jsonrpc":"2.0","id":2,"result":{"tools":[...]}}\n
{"jsonrpc":"2.0","id":3,"result":{"content":[...],"isError":false}}\n

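Why newline framing is safe: JSON never permits a raw newline inside a string; a string containing a newline must encode it as \n. Raw newlines are legal only as whitespace between tokens (pretty-printing), so the sender must serialize each message compactly on one line. Under that rule, splitting on raw \n gives clean message boundaries.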
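A minimal sketch of that framing rule, using only the standard library. The function names encode_message and split_messages are illustrative, not part of any MCP SDK.

```python
import json


def encode_message(message: dict) -> bytes:
    """Serialize one JSON-RPC message as a single newline-terminated line."""
    # json.dumps without `indent` emits no raw newlines: a newline inside a
    # string value is escaped as the two characters backslash + n.
    return json.dumps(message, separators=(",", ":")).encode("utf-8") + b"\n"


def split_messages(buffer: bytes) -> tuple[list[dict], bytes]:
    """Parse every complete line in `buffer`; return (messages, unread tail)."""
    *lines, tail = buffer.split(b"\n")
    return [json.loads(line) for line in lines if line.strip()], tail
```

The tail that comes back is whatever partial message has arrived so far; a reader would prepend it to the next chunk read from the pipe.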

The number one stdio bug. Stdout is the protocol channel. Any debug print to stdout, any library that logs to stdout, any traceback to stdout corrupts the stream. The client will try to parse your stack trace as JSON-RPC, fail, and disconnect. All logging must go to stderr. FastMCP's ctx.info() flows back as notifications/message, which is the correct path.
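One way to keep ordinary Python logging off the protocol channel, sketched with the standard logging module. The logger name example_server is a placeholder.

```python
import logging
import sys


def make_stderr_logger(name: str = "example_server") -> logging.Logger:
    """Return a logger whose only handler writes to stderr, never stdout."""
    logger = logging.getLogger(name)
    logger.handlers.clear()
    logger.addHandler(logging.StreamHandler(sys.stderr))
    logger.setLevel(logging.INFO)
    # Do not propagate to the root logger, which some libraries point at stdout.
    logger.propagate = False
    return logger
```

This only covers the logging module. Stray print() calls in your own code or in dependencies still need print(..., file=sys.stderr) or removal.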

SSE: event stream framing

The 2024-11-05 SSE transport uses two HTTP endpoints. The client opens a long-lived GET, which the server holds open for events flowing server-to-client; the first event on it (endpoint) tells the client where to POST. The client sends requests as POSTs to that separate endpoint. The POST response is a bare acknowledgement; the actual JSON-RPC response comes back over the open SSE stream, matched by id.

SSE event framing
// GET /sse  (server → client stream)
event: message
data: {"jsonrpc":"2.0","id":1,"result":{...}}
// blank line terminates the event

event: message
data: {"jsonrpc":"2.0","method":"notifications/progress","params":{...}}

// comments are allowed, prefixed with colon, often used as keepalive
: keepalive

// POST /messages  (client → server)
// standard HTTP POST with JSON body, 202 Accepted response
POST /messages HTTP/1.1
Content-Type: application/json

{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{...}}

The asymmetric design surprises people. You POST a request, get a 202, then wait for the SSE stream to deliver the response. Your HTTP client is NOT waiting for the response on the POST socket. This matters for client implementations: you need a correlation table keyed by request id.
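A correlation table of this kind can be sketched with asyncio futures. The class name PendingRequests and its methods are hypothetical; a real client would also need timeouts and cleanup on disconnect.

```python
import asyncio


class PendingRequests:
    """Maps JSON-RPC request ids to futures awaiting the SSE-delivered response."""

    def __init__(self) -> None:
        self._pending: dict = {}

    def register(self, request_id) -> asyncio.Future:
        # Register before sending the POST so a fast response cannot race it.
        future = asyncio.get_running_loop().create_future()
        self._pending[request_id] = future
        return future

    def resolve(self, message: dict) -> bool:
        """Feed one message from the SSE stream; True if it matched a request."""
        future = self._pending.pop(message.get("id"), None)
        if future is None or future.done():
            return False  # notification, unknown id, or late duplicate
        if "error" in message:
            future.set_exception(RuntimeError(message["error"].get("message", "error")))
        else:
            future.set_result(message.get("result"))
        return True
```

The caller awaits the future returned by register() after its POST comes back with 202; the SSE reader task calls resolve() for every event it parses.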

Streamable HTTP (2025-03-26+)

The newer spec folds everything onto a single endpoint. The client POSTs to it (/mcp here). The server can answer with one of two content types: application/json for a single immediate response, or text/event-stream if it wants to stream events (progress, responses, notifications). The same endpoint also accepts a GET that opens a server-to-client stream; an Mcp-Session-Id header ties requests to a session, and Last-Event-ID lets a client resume a dropped stream.

The gzip flush trap. If you gzip the SSE stream, you must flush the compressor with sync-flush semantics (the empty deflate block marker) after every event, then flush the HTTP writer. In Go, gz.Flush() on a gzip.Writer does exactly this. Without it, your event sits in the gzip buffer and the client waits forever. Full-flush (which also resets the compression dictionary) works but costs compression ratio. Sync-flush is the right primitive.
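The same primitive in Python, as a sketch using the standard zlib module (the callout's gz.Flush() is the Go spelling). make_gzip_sse_encoder is an illustrative name; a real server would also flush its HTTP writer after each chunk.

```python
import zlib


def make_gzip_sse_encoder():
    """Return a function that compresses one SSE event and sync-flushes it."""
    compressor = zlib.compressobj(wbits=31)  # wbits=31 selects the gzip container

    def encode_event(data: str) -> bytes:
        raw = f"event: message\ndata: {data}\n\n".encode("utf-8")
        # Z_SYNC_FLUSH emits all pending output on a byte boundary without
        # resetting the compression dictionary (Z_FULL_FLUSH would reset it).
        return compressor.compress(raw) + compressor.flush(zlib.Z_SYNC_FLUSH)

    return encode_event
```

Each chunk returned by encode_event can be decompressed by the receiver as soon as it arrives, which is the property the SSE client depends on.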

The Message Pump

Inside a working stdio server there's a specific architecture that everyone arrives at. A single reader, a dispatcher, parallel handlers, and a serialized writer. Miss any piece and you get a subtle bug.

┌──────────────────────────────────────────────────────────────────────┐
│                         MCP SERVER PROCESS                           │
├──────────────────────────────────────────────────────────────────────┤
│                                                                      │
│    stdin                                                    stdout   │
│      │                                                         ▲      │
│      ▼                                                         │      │
│   ┌────────┐    ┌──────────┐   ┌────────────┐        ┌───────────┐  │
│   │ READ   │───▶│ PARSE    │──▶│ DISPATCH   │───┐    │ ENCODE    │  │
│   │ loop   │    │ json-rpc │   │  router   │   │    │ +mutex    │  │
│   └────────┘    └──────────┘   └─────┬──────┘   │    └─────▲─────┘  │
│                                       │          │          │        │
│                                       │ notifications     │        │
│                                       │ (cancel, progress)  │        │
│                                       ▼          │          │        │
│                              ┌─────────────────┐ │          │        │
│                              │ cancelRegistry  │ │          │        │
│                              └─────────────────┘ │          │        │
│                                                  ▼          │        │
│                                       ┌─────────────────────┐        │
│                                       │ HANDLER POOL        │        │
│                                       │  ┌──────────────┐   │        │
│                                       │  │ tools/call   │──┼────────┘
│                                       │  ├──────────────┤   │
│                                       │  │ tools/call   │──┼────────┐
│                                       │  ├──────────────┤   │        │
│                                       │  │ resources/   │──┼────────▶┘
│                                       │  │  read        │   │
│                                       │  └──────────────┘   │
│                                       └─────────────────────┘
│
└──────────────────────────────────────────────────────────────────────┘

Why each piece exists

Component	Why	Bug if missing
Single reader	stdin is a byte stream. Only one goroutine / task should read it. Multiple readers = interleaved bytes, corrupted JSON.	Random parse errors
Dispatch	Tool calls can block for seconds. If the handler runs in the read loop, no other message arrives until it returns.	Cancellation deadlock, no concurrent tool calls
Handler pool	Multiple tool calls can be in flight simultaneously. Each runs in its own goroutine / task with its own context.	Serialized throughput, one slow call blocks all
cancelRegistry	Maps request id to a cancel function. The read loop uses it to cancel in-flight handlers when a notification arrives.	Cancellation silently no-ops
Write mutex	Multiple handlers may finish concurrently. Writes to stdout must be serialized so JSON objects don't interleave mid-byte.	Corrupted output stream
Notification fast path	Notifications (cancel, progress) don't go through dispatch. Read loop handles them inline or via a short-lived task.	Slow cancel response

FastMCP's version of this. FastMCP runs on Python's async runtime (asyncio by default). Conceptually, the "single reader" is one coroutine awaiting lines from stdin. "Dispatch" starts a task per request, the equivalent of asyncio.create_task() on the handler coroutine. The write path is serialized the way an asyncio.Lock() would do it. The cancelRegistry maps onto task cancellation: when a notifications/cancelled arrives, FastMCP cancels the matching handler task, which raises CancelledError at the next await point inside your tool.
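The architecture can be sketched in plain asyncio. This is not FastMCP's code: MessagePump, its attributes, and the handler signature are assumptions made for illustration, and error responses for unknown methods are left out.

```python
import asyncio
import json


class MessagePump:
    """Single reader, task-per-request dispatch, cancel registry, locked writer."""

    def __init__(self, reader: asyncio.StreamReader, write, handlers: dict) -> None:
        self.reader = reader            # the only consumer of the input stream
        self.write = write              # callable taking bytes, e.g. a pipe write
        self.handlers = handlers        # method name -> async handler(params)
        self.write_lock = asyncio.Lock()
        self.in_flight: dict = {}       # cancel registry: request id -> task

    async def send(self, message: dict) -> None:
        # The lock matters once `write` involves an await (e.g. drain()).
        async with self.write_lock:
            self.write(json.dumps(message).encode("utf-8") + b"\n")

    async def run_handler(self, msg: dict) -> None:
        result = await self.handlers[msg["method"]](msg.get("params") or {})
        await self.send({"jsonrpc": "2.0", "id": msg["id"], "result": result})

    async def serve(self) -> None:
        while line := await self.reader.readline():
            msg = json.loads(line)
            if msg.get("method") == "notifications/cancelled":
                # Fast path: handled inline, never queued behind a handler.
                task = self.in_flight.get(msg["params"]["requestId"])
                if task is not None:
                    task.cancel()
            elif "id" in msg:
                task = asyncio.create_task(self.run_handler(msg))
                self.in_flight[msg["id"]] = task
                # Deregister on success, failure, or cancellation alike.
                task.add_done_callback(
                    lambda _t, rid=msg["id"]: self.in_flight.pop(rid, None))
        # Input closed: let the remaining handlers finish or unwind.
        await asyncio.gather(*list(self.in_flight.values()), return_exceptions=True)
```

Deregistration uses a done callback rather than a finally block inside the handler so that a task cancelled before it ever starts is still removed from the registry.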

Request Lifecycle Trace

One tool call. From the moment the model emits it to the moment the result lands back in its context. Every step and every place it can fail.

Model emits tool_use. Inside the model's response, a structured tool_use block appears: function name, arguments. The host (Claude Desktop, your IDE plugin, your custom app) intercepts the block before rendering to the user.
Host routes to matching client. The host has N MCP clients (one per configured server). It looks up which client owns the tool by name. If two servers expose tools with the same name, the host has to disambiguate; a common approach is to prefix tool names with the server name.
Client builds JSON-RPC envelope. Wraps as {"jsonrpc":"2.0","id":N,"method":"tools/call","params":{"name":"...","arguments":{...}}}. The id is typically a monotonically increasing integer per client (JSON-RPC also allows strings).
Client writes to transport. For stdio: writes bytes + newline to the server's stdin pipe, flushes. For SSE: POSTs to /messages. For Streamable HTTP: POSTs to /mcp and starts reading the response.
Server read loop wakes. The stdin read blocks until data arrives. When the client writes, the read loop wakes, accumulates bytes until the newline, parses the JSON-RPC envelope.
Dispatch decision. Read loop checks the method. Is it a notification? Handle inline (e.g., notifications/cancelled). Is it a request? Build a handler task, register its cancel function in cancelRegistry[id], start the task, continue reading.
Handler parses params. The handler validates the arguments against the tool's inputSchema. In FastMCP, this is the Pydantic model generated from your function signature. Invalid args = InvalidParams response (-32602).
Tool function runs. Your code executes. Makes HTTP requests, queries databases, reads files. Uses the context parameter for cancellation, logging, progress reporting. This is the slow part -- anywhere from milliseconds to seconds.
Tool returns content blocks. Your function returns. The handler wraps the return value in the MCP envelope: {content: [{type:"text", text:"..."}], isError: false}. Exceptions raised by the tool become isError:true with the exception message. Protocol-level failures (unknown tool, malformed request) become JSON-RPC errors.
Encode + write mutex. Handler acquires the write mutex, serializes the full JSON-RPC response, writes to stdout with a newline, flushes, releases the mutex, deregisters from cancelRegistry.
Client read loop matches response. The client's read loop parses the response, looks up id N in its pending-request table, resolves the awaiting caller with the result. The caller is a promise / future in the host.
Host appends to conversation. The host takes the result's content blocks and appends them to the model's conversation as a tool_result message. This becomes part of the context for the next turn.
Model continues reasoning. Next turn: the model sees the tool_result in its context. It can now reason about the result, call more tools, respond to the user, whatever the agent loop decides.
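The envelope-building and result-wrapping steps of this trace, reduced to a synchronous sketch. build_call and run_tool are hypothetical helpers, and the tools registry is a plain dict of callables rather than anything FastMCP provides.

```python
import json


def build_call(request_id: int, name: str, arguments: dict) -> dict:
    """Client side: wrap a tool invocation in a JSON-RPC request envelope."""
    return {"jsonrpc": "2.0", "id": request_id, "method": "tools/call",
            "params": {"name": name, "arguments": arguments}}


def run_tool(request: dict, tools: dict) -> dict:
    """Server side: run the tool and wrap the outcome in a response envelope."""
    params = request["params"]
    tool = tools.get(params["name"])
    if tool is None:
        # Protocol-level failure: a JSON-RPC error, not a tool result.
        return {"jsonrpc": "2.0", "id": request["id"],
                "error": {"code": -32602, "message": f"Unknown tool: {params['name']}"}}
    try:
        value = tool(**params.get("arguments", {}))
        text = value if isinstance(value, str) else json.dumps(value)
        result = {"content": [{"type": "text", "text": text}], "isError": False}
    except Exception as exc:
        # Tool-level failure: still a successful JSON-RPC response.
        result = {"content": [{"type": "text", "text": str(exc)}], "isError": True}
    return {"jsonrpc": "2.0", "id": request["id"], "result": result}
```

The split matters to the model: an isError result lands in its context and it can react to the message, while a JSON-RPC error is usually handled by the host before the model sees anything.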

Init State Machine

An MCP connection is a state machine with four states. Most servers don't enforce it explicitly, which means they quietly accept malformed init sequences and then break in weird ways later. Understanding the state machine lets you debug "my server responds to initialize but not to tools/list" kind of bugs.

   ┌───────────────┐
   │   CONNECTED   │     socket open, no messages exchanged yet
   └───────┬───────┘     ONLY legal inbound method: initialize
           │
           │  client sends initialize
           ▼
   ┌───────────────┐
   │ INITIALIZING  │     server processes initialize, returns capabilities
   └───────┬───────┘     legal inbound: notifications/initialized, ping
           │
           │  server responds
           │  client sends notifications/initialized
           ▼
   ┌───────────────┐
   │     READY     │     all capability methods legal (tools/*, resources/*, prompts/*)
   └───────┬───────┘     notifications/*: cancel, progress, list_changed, updated
           │
           │  transport closes OR fatal error
           ▼
   ┌───────────────┐
   │   SHUTDOWN    │     connection closed, no further messages
   └───────────────┘

Legal operations by state

State	Legal inbound	Error on illegal
CONNECTED	`initialize`	-32600 invalid request
INITIALIZING	(waiting for server response)	client should not send more requests
READY	all declared capabilities + ping	-32601 method not found for methods outside the declared capabilities; -32602 invalid params for an unknown tool or prompt name
SHUTDOWN	(none)	transport closed
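A server that wants to enforce these states explicitly could do so with a small guard like the following sketch. Session and on_inbound are illustrative names. It treats notifications/initialized as the message that moves INITIALIZING to READY (taken from the transition arrows), tolerates ping during initialization, and rejects a second initialize once READY; the last choice is an assumption of this sketch.

```python
from enum import Enum, auto

INVALID_REQUEST = -32600


class State(Enum):
    CONNECTED = auto()
    INITIALIZING = auto()
    READY = auto()
    SHUTDOWN = auto()


class Session:
    def __init__(self) -> None:
        self.state = State.CONNECTED

    def on_inbound(self, method: str):
        """Return None if the message is legal now, else a JSON-RPC error code."""
        if self.state is State.CONNECTED:
            if method != "initialize":
                return INVALID_REQUEST
            self.state = State.INITIALIZING
            return None
        if self.state is State.INITIALIZING:
            if method == "notifications/initialized":
                self.state = State.READY
                return None
            return None if method == "ping" else INVALID_REQUEST
        if self.state is State.READY:
            return INVALID_REQUEST if method == "initialize" else None
        return INVALID_REQUEST  # SHUTDOWN: nothing is legal

    def close(self) -> None:
        self.state = State.SHUTDOWN
```

With this guard in the dispatch path, a client that sends tools/list before finishing the handshake gets an immediate error instead of undefined behavior later.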

Capability negotiation inside initialize

The initialize exchange is where client and server agree on what each side supports. The client advertises what it can consume (roots, sampling), the server advertises what it provides (tools, resources, prompts). Each capability can have sub-flags:

server initialize response
{
  "capabilities": {
    "tools":     { "listChanged": true },
    "resources": {
      "subscribe":    true,
      "listChanged":  true
    },
    "prompts":   { "listChanged": false },
    "logging":   {},
    "experimental": {
      "wrapster.concurrent_dispatch": true,
      "wrapster.output_handles": true
    }
  }
}

The presence of a key means "this server supports the feature." The sub-flags control optional behaviors. listChanged: true means the server will send notifications/tools/list_changed when its tool set changes. subscribe: true means resources support per-URI subscription. The experimental block is the escape hatch for non-standard extensions (which is where wrapster advertises its custom transport features).
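On the client side, reading these flags is a matter of checking key presence first and the sub-flag second. A small sketch, with supports as a hypothetical helper name:

```python
from typing import Optional


def supports(capabilities: dict, feature: str, flag: Optional[str] = None) -> bool:
    """True if `feature` is advertised and, when given, its sub-flag is true."""
    if feature not in capabilities:
        return False          # key absent: feature not supported at all
    if flag is None:
        return True           # key present, even with an empty object
    return capabilities[feature].get(flag) is True
```

A client would call supports(caps, "resources", "subscribe") before sending a subscription request, and supports(caps, "logging") before relying on log notifications.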

The asymmetry that confuses people. Both sides can advertise capabilities but they mean different things. Client capabilities describe what the server can ask the client to do (sampling: the server can ask the client to make an LLM call on its behalf; roots: the server can query the client's filesystem boundaries). Server capabilities describe what tools/resources/prompts the server offers. They're not symmetric.

Cancellation Mechanics

The most common question after "how does tools/call work" is "how does cancel work" because every time someone tries to cancel a long-running tool call, it does nothing and they don't know why. Here is the actual mechanism.

The full flow

CLIENT                                           SERVER
  │                                                │
  │  tools/call {id: 42}                           │
  │ ──────────────────────────────────────────────▶│
  │                                                │
  │                                                │  read loop parses, dispatches
  │                                                │  cancelRegistry[42] = cancel_fn
  │                                                │  handler task starts
  │                                                │  handler awaits HTTP call to Jira
  │                                                │
  │  user hits Escape                              │
  │                                                │
  │  notifications/cancelled {requestId: 42}       │
  │ ──────────────────────────────────────────────▶│
  │                                                │
  │                                                │  read loop sees notification
  │                                                │  looks up cancelRegistry[42]
  │                                                │  calls cancel_fn()
  │                                                │  handler's context becomes cancelled
  │                                                │  HTTP client aborts the in-flight request
  │                                                │  handler raises CancelledError
  │                                                │  handler cleans up, re-raises
  │                                                │  del cancelRegistry[42]
  │                                                │
  │  response {id: 42, error: "cancelled"}         │
  │  (optional; the spec lets servers skip it)     │
  │ ◀──────────────────────────────────────────────│

Why it fails

Bug	Cause	Fix
Cancel does nothing	Handler runs in read loop, so the notification is stuck behind it in the input queue	Dispatch handlers to separate tasks/goroutines so read loop stays free
Cancel races, handler already done	Notification arrives after handler deregistered. Registry lookup fails silently.	Not actually a bug. The response was already on its way.
Tool doesn't stop	Tool code doesn't use the context for cancellation. It's doing blocking work that can't be interrupted.	Pass the context through to HTTP calls, subprocesses, and database queries (in Python, use their async equivalents; subprocess.run blocks)
Registry grows forever	Handler never deregisters (exception path, forgot finally block)	Use try/finally around registration. Memory leak if you don't.

FastMCP cancellation specifically

In FastMCP, cancellation uses asyncio task cancellation. When notifications/cancelled arrives, FastMCP calls .cancel() on the matching handler task. This raises asyncio.CancelledError at the next await point inside your handler. For this to actually terminate your tool, your tool must:

Use async libraries (aiohttp not requests, asyncpg not psycopg2)
Actually hit an await point periodically. A tight CPU loop never yields and never sees the cancel.
Not swallow CancelledError with a bare except: or except BaseException:. That's a bug that defeats cancellation entirely.

The swallowed cancel bug. The pattern to watch is try: ... except Exception: return "failed". On Python 3.8+, CancelledError inherits from BaseException, so except Exception does NOT catch it. Good. But on Python 3.7 and earlier, or if you use a bare except: or catch BaseException, you swallow the cancel and the tool runs to completion while the client thinks it was cancelled. Always re-raise CancelledError explicitly if you catch broadly.
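The difference is easy to see in a few lines of plain asyncio. Both tool functions and the cancel_after_start driver are made up for this sketch; asyncio.sleep stands in for a slow network call.

```python
import asyncio


async def swallowing_tool() -> str:
    try:
        await asyncio.sleep(10)
    except BaseException:           # BUG: this also catches CancelledError
        pass
    return "finished anyway"


async def cooperative_tool() -> str:
    try:
        await asyncio.sleep(10)
    except asyncio.CancelledError:
        # Release resources here, then let the cancellation continue.
        raise
    except Exception:
        return "failed"
    return "done"


async def cancel_after_start(tool) -> asyncio.Task:
    """Start `tool`, cancel it at its first await, and return the finished task."""
    task = asyncio.create_task(tool())
    await asyncio.sleep(0)          # let the tool run up to its first await
    task.cancel()
    await asyncio.gather(task, return_exceptions=True)
    return task
```

After cancel_after_start, the swallowing tool's task reports a normal result, while the cooperative tool's task reports cancelled() as true, which is what the server needs in order to unwind the request.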

Schema Generation

What happens between @mcp.tool() and the JSON Schema that shows up in tools/list. The pipeline has four stages and each one can surprise you.

┌──────────────────┐     ┌──────────────────┐     ┌──────────────────┐     ┌──────────────────┐
│  Python func     │────▶│  Pydantic model  │────▶│  JSON Schema     │────▶│  Tool envelope   │
│                  │     │                  │     │                  │     │                  │
│ type hints       │     │ field defs       │     │ properties       │     │ name             │
│ defaults         │     │ validators       │     │ required         │     │ description      │
│ docstring        │     │ descriptions     │     │ types            │     │ inputSchema      │
└──────────────────┘     └──────────────────┘     └──────────────────┘     └──────────────────┘
    inspect                  create_model()            model_json_schema()         wrap as MCP tool
    docstring_parser         Field(..., description)    $defs resolution

Stage 1: Python function inspection

source
@mcp.tool()
def search_jira(
    project: str,
    jql: str,
    max_results: int = 50,
    include_subtasks: bool = False,
) -> list[dict]:
    """Search Jira issues using JQL.

    Args:
        project: Jira project key, e.g. "PROJ"
        jql: JQL query string
        max_results: Maximum number of results
        include_subtasks: Whether to include subtask issues

    Returns:
        List of issue dicts with key, summary, status
    """
    ...

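FastMCP calls inspect.signature() on your function and extracts parameter names, type annotations, and default values. It parses the docstring via docstring-parser (supports Google, NumPy, Sphinx styles) to pull per-parameter descriptions and the top-level summary. If your FastMCP version does not pick up docstring descriptions (check the tools/list output), set them explicitly with Annotated[str, Field(description="...")].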

Stage 2: Pydantic model synthesis

FastMCP dynamically builds a Pydantic model from the inspected signature. Roughly equivalent to:

generated model
class SearchJiraInput(BaseModel):
    project: str = Field(..., description="Jira project key, e.g. PROJ")
    jql: str = Field(..., description="JQL query string")
    max_results: int = Field(50, description="Maximum number of results")
    include_subtasks: bool = Field(False, description="Whether to include subtask issues")

Stage 3: JSON Schema generation

Pydantic's model_json_schema() walks the model and emits JSON Schema (Draft 2020-12 in Pydantic v2, which is where $defs comes from). Primitive types map directly. Optional[X] becomes anyOf: [{type: X}, {type: "null"}]; a field leaves required only when it has a default. Literal["a", "b"] becomes {enum: ["a", "b"]}. Nested Pydantic models become $ref to a $defs entry.

resulting schema
{
  "type": "object",
  "properties": {
    "project": {
      "type":        "string",
      "description": "Jira project key, e.g. PROJ",
      "title":       "Project"
    },
    "jql": {
      "type":        "string",
      "description": "JQL query string",
      "title":       "Jql"
    },
    "max_results": {
      "type":        "integer",
      "description": "Maximum number of results",
      "default":     50,
      "title":       "Max Results"
    },
    "include_subtasks": {
      "type":        "boolean",
      "description": "Whether to include subtask issues",
      "default":     false,
      "title":       "Include Subtasks"
    }
  },
  "required": ["project", "jql"],
  "title": "SearchJiraInput"
}
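To make the inspection-to-schema step concrete, here is a stdlib-only approximation that skips the Pydantic stage entirely. It is not how FastMCP does it; infer_input_schema and the JSON_TYPES table are assumptions for illustration, and anything beyond the four primitive types falls back to a crude "object".

```python
import inspect
from typing import get_type_hints

# Only the primitive mappings; real inference handles far more.
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def infer_input_schema(func) -> dict:
    """Build a minimal JSON Schema for `func`'s parameters."""
    hints = get_type_hints(func)
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        prop = {"type": JSON_TYPES.get(hints.get(name), "object")}
        if param.default is inspect.Parameter.empty:
            required.append(name)        # no default: the caller must supply it
        else:
            prop["default"] = param.default
        properties[name] = prop
    return {"type": "object", "properties": properties, "required": required}


def search_jira(project: str, jql: str, max_results: int = 50,
                include_subtasks: bool = False) -> list:
    """Search Jira issues using JQL."""
    return []
```

Calling infer_input_schema(search_jira) yields the same required list and defaults as the schema above, minus descriptions and titles, which is a quick way to see which parts come from the signature and which from the docstring.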

The title noise problem. Pydantic adds title fields to every property and to the root. The model doesn't use these and they waste tokens in the tool definition that gets rendered into the system prompt. At 30 tools with 5 parameters each, that's 150 property titles plus 30 root titles of noise in every tools block. Stripping them before the schema leaves the server is worth doing in production; check whether your FastMCP version has an option for it.
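If no built-in option is available, the titles can be removed by post-processing the schema dict. A sketch, with strip_titles as a hypothetical helper:

```python
def strip_titles(schema):
    """Return a copy of a JSON Schema with string-valued "title" keys removed."""
    if isinstance(schema, dict):
        return {
            key: strip_titles(value)
            for key, value in schema.items()
            # Only drop title annotations (strings). A parameter that happens
            # to be *named* "title" maps to a dict and is kept.
            if not (key == "title" and isinstance(value, str))
        }
    if isinstance(schema, list):
        return [strip_titles(item) for item in schema]
    return schema
```

One caveat: a default or example value that is itself an object with a string "title" key would also lose that key, so apply this to schemas you have looked at rather than blindly.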

Stage 4: Tool envelope

Finally FastMCP wraps the schema with name and description to form the tool definition returned by tools/list. The tool name comes from the function name (or an explicit override). The description comes from the first line of the docstring (the summary), not the full docstring.

Things the inference does poorly

Python construct	What you get	Why it's bad
`Union[A, B]`	`anyOf: [{$ref: A}, {$ref: B}]`	Models struggle to decide which branch to construct
Nested models	`$ref` into `$defs`	Some clients don't resolve refs, some strip them
Recursive types	self-referential `$ref`	Infinite expansion in naive renderers
Custom validators	not reflected in schema	Validation happens at runtime, model can't see the constraint
Free-form docstrings	parameter descriptions silently missing	docstring-parser only extracts parameters from a recognized Args / Parameters / :param layout

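For tools called frequently or where schema clarity matters, bypass inference and write the inputSchema by hand. Whether the decorator accepts an explicit schema depends on the FastMCP version; the low-level server API lets you return a hand-written tool definition from the tools/list handler.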

Tools as System Prompt

Here is the thing nobody tells you. When a client calls tools/list and gets back your tool definitions, the model doesn't receive them as some out-of-band list of "available functions." Somewhere between the host and the model (in the host itself, or in the model API it calls) they are rendered into text that sits in the model's prompt alongside the system prompt. Your tool definitions ARE the prompt. Every tool description, every property description, every enum value is load-bearing prompt engineering.

What the model actually sees

The exact format depends on the host and the model, but the structure is approximately:

approximate system prompt injection
You have access to the following tools. Use them when appropriate
to help the user. Each tool has a name, a description, and
parameters you must provide.

<tool>
<name>search_jira</name>
<description>Search Jira issues using JQL.</description>
<parameters>
  project (string, required): Jira project key, e.g. PROJ
  jql (string, required): JQL query string
  max_results (integer, optional, default 50): Maximum number of results
  include_subtasks (boolean, optional, default false): Whether to include subtask issues
</parameters>
</tool>

<tool>
<name>get_issue</name>
<description>Retrieve a single Jira issue by key.</description>
<parameters>
  ...
</parameters>
</tool>

... (repeat for all N tools) ...

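This block is sent with EVERY model request in the conversation. Not once at session start. Every request carries the full prompt, so you pay for the tools block as input tokens on every single turn (prompt caching, where the host uses it, lowers the price but not the context footprint). The cost compounds.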
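A renderer for the approximate format shown above might look like this sketch. render_tool is a hypothetical function; real hosts and model APIs use their own formats, so treat the output as an aid for eyeballing what the model will read, not as a reproduction of any vendor's prompt.

```python
import json


def render_tool(tool: dict) -> str:
    """Render one tools/list entry in the approximate text layout shown above."""
    schema = tool.get("inputSchema", {})
    required = set(schema.get("required", []))
    lines = [
        "<tool>",
        f"<name>{tool['name']}</name>",
        f"<description>{tool.get('description', '')}</description>",
        "<parameters>",
    ]
    for name, prop in schema.get("properties", {}).items():
        flags = [prop.get("type", "any"),
                 "required" if name in required else "optional"]
        if "default" in prop:
            flags.append(f"default {json.dumps(prop['default'])}")
        lines.append(f"  {name} ({', '.join(flags)}): {prop.get('description', '')}")
    lines += ["</parameters>", "</tool>"]
    return "\n".join(lines)
```

Rendering your own tools this way and reading the result top to bottom is a cheap review step: vague descriptions and redundant parameters stand out immediately.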

Token cost estimation

Rough math. A tool definition with a 2-sentence description and 5 parameters with moderate descriptions is about 150-250 tokens. For a moderate MCP server:

Tool count	Avg tokens/tool	Tools block size	10-turn conversation cost
10	200	2,000 tok	20,000 tok
20	200	4,000 tok	40,000 tok
30	250	7,500 tok	75,000 tok
50	250	12,500 tok	125,000 tok

For an Atlassian server with a rich API surface (Jira + Confluence + user management + attachments), 30+ tools is realistic. That's a nontrivial chunk of the model's context window spent on tool definitions before any actual work happens.
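The arithmetic behind the table is simple enough to keep as a helper while you audit a server. The function names are illustrative, and the per-tool averages are the rough estimates from above, not measurements.

```python
def tools_block_tokens(tool_count: int, avg_tokens_per_tool: int) -> int:
    """Approximate size of the rendered tools block, in tokens."""
    return tool_count * avg_tokens_per_tool


def conversation_cost(tool_count: int, avg_tokens_per_tool: int, turns: int) -> int:
    """Input tokens spent on tool definitions across a whole conversation."""
    # The block is re-sent on every turn, so the cost scales linearly with turns.
    return tools_block_tokens(tool_count, avg_tokens_per_tool) * turns
```

For example, conversation_cost(30, 250, 10) reproduces the 75,000-token row of the table.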

Implications for writing tools

Descriptions are prompts. Write "Use this when the user wants to..." not "This function does..."
Tool names matter. The model uses them as function names. search_jira vs query_atlassian_issues changes how the model thinks about the tool.
Property descriptions explain semantics, not types. "Jira project key, e.g. PROJ" not "A string value."
Enums are self-documenting. enum: ["open", "in_progress", "closed"] tells the model the legal values without English explanation.
Required fields force the model's hand. Required = "must construct." Optional = "include if relevant."
Audit for bloat. Every word in a description costs tokens every turn. Long Pydantic Field descriptions that nobody needed add up fast.

The path forward: dynamic tool filtering. Some hosts (and MCP spec work) are moving toward per-turn tool filtering. The model picks a category first, and only the relevant tools are injected for that turn. For an Atlassian server, this could mean "Jira mode" vs "Confluence mode" with different tool subsets. Not universally supported yet but worth watching -- it's the only real fix for the tool-count-vs-context-cost tradeoff.

Content Block Anatomy

The return value of a tool call is {content: [ContentBlock], isError: bool}. That content array is more interesting than people realize. You can return multiple blocks, mix types, and use the resource block type to avoid inlining content at all.

The four types

text block
{
  "type": "text",
  "text": "Hello world. This is plain text that lands in the context."
}

image block
{
  "type":     "image",
  "data":     "iVBORw0KGgoAAAANSUhEUgAA...",  // base64-encoded bytes
  "mimeType": "image/png"
}

resource block (embedded)
{
  "type": "resource",
  "resource": {
    "uri":      "confluence://TEAM/architecture-overview",
    "mimeType": "text/markdown",
    "text":     "# Architecture Overview\n\n..."
  }
}

audio block (2025-03-26+)
{
  "type":     "audio",
  "data":     "UklGRiQAAABXQVZF...",
  "mimeType": "audio/wav"
}

Size tradeoffs

Block type	Wire overhead	Context cost	Notes
text	minimal (json escape)	~1 token per 4 chars	Best for small-to-medium content. 40KB = ~10K tokens.
image	+33% (base64)	model-specific vision tokens	Claude vision: ~1.5k tokens per 1024x1024 image regardless of file size
resource (inline)	minimal	same as text	Same cost as text but semantically addressable by uri
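resource (reference)	near zero	near zero until fetched	Return the uri with a one-line text stub; the client fetches the full body on demand via resources/read. (A dedicated resource_link block arrives in the 2025-06-18 spec.)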

The multi-block pattern

You can return multiple blocks in one tool result. This is underused. Example: a Jira search tool that returns a summary text block plus individual resource references for each matching issue.

multi-block jira search result
{
  "content": [
    {
      "type": "text",
      "text": "Found 3 matching issues:"
    },
    {
      "type": "resource",
      "resource": {
        "uri":  "jira://PROJ-123",
        "text": "PROJ-123: Login button broken - Status: Open - Assignee: Alice"
      }
    },
    {
      "type": "resource",
      "resource": {
        "uri":  "jira://PROJ-124",
        "text": "PROJ-124: Session expires - Status: In Progress - Assignee: Bob"
      }
    },
    {
      "type": "resource",
      "resource": {
        "uri":  "jira://PROJ-125",
        "text": "PROJ-125: 2FA errors - Status: Open - Assignee: Alice"
      }
    }
  ],
  "isError": false
}

Advantages of this shape: the model sees a compact summary plus individually addressable items. If it wants full details for one issue, the host can fetch that uri via resources/read (or the model can call a tool such as get_issue), instead of the server having pre-inlined 50KB of issue data for every match.
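Building that result shape on the server side is a few lines. In this sketch jira_search_result is a hypothetical helper and the issue dicts (key, summary, status, assignee) are assumed to come from your own Jira client code.

```python
def jira_search_result(issues: list) -> dict:
    """Build a multi-block tool result: one summary block, one resource per issue."""
    content = [{"type": "text", "text": f"Found {len(issues)} matching issues:"}]
    for issue in issues:
        content.append({
            "type": "resource",
            "resource": {
                "uri": f"jira://{issue['key']}",
                # One-line stub only; the full issue stays behind the uri.
                "text": (f"{issue['key']}: {issue['summary']} - "
                         f"Status: {issue['status']} - Assignee: {issue['assignee']}"),
            },
        })
    return {"content": content, "isError": False}
```

The stub line is the part the model pays for on every match, so keep it to the fields it needs for deciding which issue to open.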

Text Conversion (ADF · HTML · MD)

Atlassian is a useful case study here because the formats in play are a microcosm of every rich-text integration problem: a native tree structure, a legacy markup variant, and Markdown as the model's lingua franca. Round-tripping between them is harder than it looks because each format has features the others don't.

The formats

Format	Used by	Shape	Complexity
Jira REST v2	Jira Server / Data Center	Plain strings, wiki markup in description/comment bodies	Flat: bolded fields, minimal structure
Storage Format	Confluence Server / Data Center (and Cloud) page bodies	XHTML with ac: and ri: namespaces	Custom macros, layout cells, structured data
Wiki Markup	Legacy Confluence, Jira v2 rich text fields	Confluence wiki syntax (`h1.`, `{code}`, etc)	Still the common case on Server/DC
ADF	Jira Cloud (REST v3), Confluence Cloud (newer)	JSON tree of typed nodes	Rich: marks, tables, layouts, extensions
Markdown	Model input/output	CommonMark + GFM extensions	Lowest common denominator

Server/DC vs Cloud note. Jira Server and Data Center use REST API v2 and never emit ADF. Description and comment bodies come back either as plain text or as Confluence-style wiki markup, depending on the renderer query. ADF is Cloud-only and only on the v3 endpoint. The tree-walk pattern below still applies though: storage format is also a tree (XHTML), wiki markup parses into one, and all three formats hit the same class of round-trip hazards.

ADF as a tree

adf document
{
  "type": "doc",
  "version": 1,
  "content": [
    {
      "type": "paragraph",
      "content": [
        { "type": "text", "text": "Hello " },
        {
          "type": "text",
          "text": "world",
          "marks": [{ "type": "strong" }]
        }
      ]
    },
    {
      "type": "bulletList",
      "content": [
        {
          "type": "listItem",
          "content": [
            { "type": "paragraph", "content": [{ "type": "text", "text": "first" }] }
          ]
        }
      ]
    },
    {
      "type":  "codeBlock",
      "attrs": { "language": "python" },
      "content": [{ "type": "text", "text": "print('hi')" }]
    }
  ]
}

Conversion is a tree walk. For each node type, emit the Markdown equivalent. Text nodes with marks become inline formatting. Lists and code blocks become block-level constructs. The algorithm is straightforward until you hit the edge cases.
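A tree walk covering just the node types in the sample document could look like this. It is a deliberately small sketch: adf_to_markdown handles doc, paragraph, text (with strong, em, and code marks), bulletList, listItem, and codeBlock, performs no escaping, and passes unknown nodes through by concatenating their children.

```python
FENCE = "`" * 3                      # code fence, built to keep this listing tidy
MARKS = {"strong": "**", "em": "_", "code": "`"}


def adf_to_markdown(node: dict) -> str:
    """Convert a small subset of ADF to Markdown by recursive descent."""
    kind = node.get("type")
    children = node.get("content", [])
    if kind == "text":
        text = node["text"]
        for mark in node.get("marks", []):
            wrap = MARKS.get(mark["type"], "")   # unknown marks are dropped
            text = f"{wrap}{text}{wrap}"
        return text
    if kind == "doc":
        return "\n\n".join(adf_to_markdown(child) for child in children)
    if kind == "bulletList":
        return "\n".join("- " + adf_to_markdown(item) for item in children)
    if kind == "listItem":
        return "\n  ".join(adf_to_markdown(child) for child in children)
    if kind == "codeBlock":
        language = node.get("attrs", {}).get("language", "")
        body = "".join(child.get("text", "") for child in children)
        return f"{FENCE}{language}\n{body}\n{FENCE}"
    # paragraph and any unknown node: concatenate the children's output
    return "".join(adf_to_markdown(child) for child in children)
```

Run on the document above, this produces a paragraph with bold text, a one-item list, and a fenced python block, each separated by a blank line.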

Hard cases that break naive converters

Case	Why it breaks	Mitigation
Nested marks (bold + italic + link)	Markdown has limited nesting rules. `**_text_**` works in CommonMark but some parsers fail.	Emit strict order: link outermost, then strong, then em
Tables with block content	GFM tables don't support paragraphs or lists inside cells	Flatten cell content to single-line, or emit as HTML table
Code containing backticks	Single-backtick inline code can't contain backticks	Use variable-length fencing: wrap in enough backticks to avoid collision
Confluence panels (info, warning)	No Markdown equivalent	Emit blockquote with prefix: `> [INFO] ...`
Mentions (@user)	ADF has structured mention nodes with accountId	Resolve via user cache, emit `@displayName`
Attachments / embedded media	ri:attachment references are context-sensitive	Resolve to absolute URL or keep as MCP resource URI
Layout cells (multi-column)	Markdown is single-column	Drop layout, concatenate content vertically
Status pills, dates, emojis	Structured nodes with attrs, no Markdown form	Render as text with emoji or bracketed label
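The variable-length fencing mitigation from the table, sketched for inline code. inline_code is an illustrative name; the rule it follows is the CommonMark one, where the delimiter must be a backtick run longer than any run inside the content.

```python
import re


def inline_code(text: str) -> str:
    """Wrap `text` as a Markdown code span that survives embedded backticks."""
    runs = re.findall(r"`+", text)
    # One backtick more than the longest run found inside the content.
    fence = "`" * (max(map(len, runs), default=0) + 1)
    # Content that starts or ends with a backtick needs a space of padding,
    # otherwise it would merge with the delimiter.
    pad = " " if text.startswith("`") or text.endswith("`") else ""
    return f"{fence}{pad}{text}{pad}{fence}"
```

The same idea applies to fenced code blocks: count the longest backtick run in the body and open the fence with at least one more, with a minimum of three.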

The round-trip problem

You have two conversion directions:

read path (server → model)
Atlassian API  →  ADF / Storage  →  Markdown  →  model context

// This direction is lossy on purpose. Drop features the model can't use.
// Information loss is fine: the model never writes them back.

write path (model → server)
model output  →  Markdown  →  ADF / Storage  →  Atlassian API

// This direction synthesizes structure the model never saw.
// You're upgrading MD to ADF. Only the features in MD are preserved.

The problem is not MD → ADF → MD (stable, lossless for the MD subset). The problem is ADF → MD → ADF, which is lossy. If the model reads a page with panels and layouts, edits one paragraph, and writes it back, the panels and layouts are GONE.

Strategies for the round-trip loss
(1) Read-only for rich features. Convert ADF → MD on read, but for writes require the model to use a patch tool that updates specific sections rather than rewriting the whole page. The rich features stay because you never touched them.
(2) Extended MD syntax. Invent custom syntax for panels (::: info) and round-trip it. Works if you control both sides.
(3) Preserve-and-restore. On read, strip non-MD features and store a sidecar diff. On write, re-apply the sidecar. Complex but lossless.
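A minimal sketch of the patch-tool idea from strategy (1), assuming the model addresses a top-level paragraph by index. `replace_paragraph` is a hypothetical helper, not part of any SDK; a real tool would likely address sections by heading or node id and would fetch and store the page around this call.

```python
import copy
from typing import Any


def replace_paragraph(doc: dict[str, Any], index: int, new_text: str) -> dict[str, Any]:
    """Return a copy of doc with the top-level paragraph at index replaced.

    Every other node (panels, layouts, tables) is copied through untouched,
    so nothing the model never saw can be lost.
    """
    target = doc["content"][index]
    if target["type"] != "paragraph":
        raise ValueError(f"node {index} is a {target['type']}, not a paragraph")
    patched = copy.deepcopy(doc)
    patched["content"][index] = {
        "type": "paragraph",
        "content": [{"type": "text", "text": new_text}],
    }
    return patched


page = {
    "type": "doc",
    "version": 1,
    "content": [
        {"type": "panel", "attrs": {"panelType": "info"}, "content": []},
        {"type": "paragraph", "content": [{"type": "text", "text": "old"}]},
    ],
}
updated = replace_paragraph(page, 1, "new")
```

Refusing to patch anything that is not a paragraph keeps the write path inside the Markdown-expressible subset, which is the point of the strategy.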

Debugging approach for conversion bugs

Capture input and output verbatim. Log the exact ADF JSON you received and the exact Markdown you produced. Conversion bugs hide when you're comparing mental models instead of actual bytes.
Build a minimal reproduction. Strip the ADF tree down until you find the smallest subtree that breaks. Almost always a single node type with specific attrs.
Check for encoding issues. Special chars in text nodes must be escaped for Markdown: *, _, [, ], backticks. Many converters forget.
Check whitespace normalization. ADF preserves whitespace inside text nodes literally. Markdown collapses spaces. Newlines in ADF become paragraph breaks in MD only if nested in paragraph nodes.
Test round-trip on the MD subset. Build an ADF tree that contains only MD-expressible features. Convert to MD, convert back, diff. If this round-trip fails, your converter has a bug in the MD-native path, not in the edge cases.
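For the encoding check above, a minimal escaping sketch that covers only the characters listed there, plus the backslash itself. `escape_markdown` is a hypothetical name. It deliberately over-escapes (for example, underscores inside words) and does not handle line-start characters such as `#` or `-`.

```python
import re

# Backslash, backtick, asterisk, underscore, and square brackets.
_SPECIAL = re.compile(r"([\\`*_\[\]])")


def escape_markdown(text: str) -> str:
    """Backslash-escape characters that Markdown would treat as syntax."""
    return _SPECIAL.sub(r"\\\1", text)
```

Apply it to the raw text of a text node before wrapping the result in mark delimiters. Escaping after the marks are applied would also escape the delimiters you just emitted.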

Further Rabbit Holes

Things to read and experiment with to keep deepening. Each one leads to better questions.

Topic	Why it matters	Where to start
FastMCP source	Understand the middleware pipeline, context object, schema generation internals	`mcp/server/fastmcp/server.py` in the python-sdk repo
JSON-RPC batching	JSON-RPC 2.0 allows batched requests (an array of messages in one payload). Most MCP servers don't handle them. Worth knowing what breaks.	JSON-RPC 2.0 spec, batch section
Resource subscriptions	The underused primitive. Server pushes updates when a resource changes. Critical for live data.	MCP spec resources section, implement one in wrapster
Sampling (server→client LLM calls)	Reverse direction: server asks client to do an LLM inference. Enables nested agent patterns.	MCP spec sampling section
Roots	Client advertises filesystem/workspace boundaries to the server. Constrains what a filesystem server can touch.	MCP spec roots section
Streamable HTTP auth (OAuth 2.1)	The 2025-03-26 spec revision adds the Streamable HTTP transport along with an OAuth 2.1-based authorization framework. Essential for hosted MCP servers.	MCP spec auth section
Claude Desktop config internals	How tools are rendered, how prompts surface as slash commands, per-server tool filtering	claude_desktop_config.json + Desktop logs
The _meta field	Every request can carry a _meta object. progressToken lives here. Custom metadata lives here. Underutilized.	Grep the spec for "_meta"
Confluence storage format reference	The XHTML-with-namespaces dialect Confluence actually stores for page bodies (Server, DC, and Cloud). Macros, layouts, attachments. Tree walk target for most Confluence MCP tools.	confluence.atlassian.com/doc/confluence-storage-format

The meta-question to hold Every time MCP feels magical, ask: what bytes went over the wire, and what state changed where. MCP is a thin protocol. There's no hidden layer. Every feature decomposes into JSON-RPC messages over a pipe. If you can't trace it to bytes, you don't understand it yet.
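One way to start tracing bytes on stdio is to classify each captured line by its JSON-RPC 2.0 shape. A minimal sketch, assuming one complete message per line; `classify` is a hypothetical helper and it reports batch arrays as unknown.

```python
import json


def classify(line: str) -> str:
    """Label one line of captured stdio traffic by its JSON-RPC 2.0 shape."""
    message = json.loads(line)
    if "method" in message:
        # A request carries an id; a notification does not.
        return "request" if "id" in message else "notification"
    if "result" in message or "error" in message:
        return "response"
    return "unknown"


print(classify('{"jsonrpc":"2.0","id":2,"method":"tools/list"}'))
```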

// wire · pump · lifecycle · state · cancel · schema · prompt · content · convert //