Beyond the Schema
The cheatsheet covers the surface: methods, schemas, primitives, what to call and what comes back. This covers what happens underneath. What the bytes look like on the wire. What the server process is actually doing while your tool runs. Why your cancel worked or didn't. How @mcp.tool() becomes a JSON Schema. How the schema becomes part of the model's prompt. The stuff that trips people up because nobody talks about it.
Scope: stdio transport (most common for local dev), FastMCP (Python) as the reference implementation, 2024-11-05 and 2025-03-26 spec versions. Where the two spec versions differ, the difference is called out explicitly.
How to read this
Each section is independent. If you only have 10 minutes, read Wire, Message Pump, and Cancellation. If you're debugging schema inference, jump to Schema Gen. If you're writing a converter, jump to Text Conversion. The diagrams are the load-bearing part, not the prose.
Wire & Framing
MCP runs JSON-RPC 2.0 over a transport. The transport decides how messages are framed: how the reader knows where one message ends and the next begins. Getting the framing right is the difference between a working server and one that silently corrupts every other message.
stdio: line-delimited JSON
Each JSON-RPC message is a single line. One message per newline. No length prefix, no envelope, no header. The reader reads until \n, parses the accumulated bytes as JSON, dispatches.
stdio wire (bytes on the pipe)
// client → server
{"jsonrpc":"2.0","id":1,"method":"initialize","params":{...}}\n
{"jsonrpc":"2.0","id":2,"method":"tools/list"}\n
{"jsonrpc":"2.0","id":3,"method":"tools/call","params":{...}}\n
// server → client
{"jsonrpc":"2.0","id":1,"result":{...}}\n
{"jsonrpc":"2.0","id":2,"result":{"tools":[...]}}\n
{"jsonrpc":"2.0","id":3,"result":{"content":[...],"isError":false}}\n
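Why newline framing is safe: a JSON string cannot contain a raw newline; a newline inside a string must be encoded as the two-character escape \n. JSON does allow newlines as whitespace between tokens, so the sender must serialize each message compactly, without pretty-printing. Under that rule, splitting on raw \n gives clean message boundaries.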
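A minimal sketch of this framing in Python, assuming compact serialization with the standard json module. The names encode_message and decode_stream are illustrative and not part of any MCP SDK.

```python
import io
import json


def encode_message(message: dict) -> bytes:
    """Serialize one JSON-RPC message as a single newline-terminated line."""
    # json.dumps emits no raw newlines unless asked to indent, and it
    # escapes newlines inside strings, so the line stays intact.
    return json.dumps(message, separators=(",", ":")).encode("utf-8") + b"\n"


def decode_stream(stream):
    """Yield one parsed message per line read from a binary stream."""
    for line in stream:
        if line.strip():  # tolerate blank lines between messages
            yield json.loads(line)


# In-memory stand-in for the stdio pipe.
wire = io.BytesIO()
wire.write(encode_message({"jsonrpc": "2.0", "id": 1, "method": "ping"}))
wire.write(encode_message({"jsonrpc": "2.0", "id": 2, "result": {"text": "a\nb"}}))
wire.seek(0)
messages = list(decode_stream(wire))
```

The second message carries a newline inside a string value and still occupies exactly one line on the wire, which is the property the framing relies on.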
The number one stdio bug
Stdout is the protocol channel. Any debug print to stdout, any library that logs to stdout, any traceback to stdout, corrupts the stream. The client will try to parse your stack trace as JSON-RPC, fail, and disconnect. All logging must go to stderr. FastMCP's ctx.info() flows back as notifications/message which is the correct path.
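One way to keep ordinary Python logging off the protocol channel is to bind the logger to stderr explicitly. This is a sketch using only the standard logging module; the logger name my_mcp_server is a placeholder.

```python
import logging
import sys

# Send all log records to stderr so stdout carries only JSON-RPC lines.
handler = logging.StreamHandler(sys.stderr)
handler.setFormatter(
    logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
)

log = logging.getLogger("my_mcp_server")  # hypothetical logger name
log.addHandler(handler)
log.setLevel(logging.INFO)
log.propagate = False  # keep records away from any root handler bound to stdout

log.info("server starting")  # lands on stderr, not on the protocol channel
```

Third-party libraries that call print() directly are not covered by this; those still need to be audited or redirected separately.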
SSE: event stream framing
The 2024-11-05 SSE transport uses two HTTP endpoints. The server maintains a long-lived GET for events flowing server-to-client. The client sends requests as POSTs to a separate endpoint. The POST response is a bare acknowledgement; the actual JSON-RPC response comes back over the open SSE stream, matched by id.
SSE event framing
// GET /sse (server → client stream)
event: message
data: {"jsonrpc":"2.0","id":1,"result":{...}}
// blank line terminates the event
event: message
data: {"jsonrpc":"2.0","method":"notifications/progress","params":{...}}
// comments are allowed, prefixed with colon, often used as keepalive
: keepalive
// POST /messages (client → server)
// standard HTTP POST with JSON body, 202 Accepted response
POST /messages HTTP/1.1
Content-Type: application/json
{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{...}}
The asymmetric design surprises people. You POST a request, get a 202, then wait for the SSE stream to deliver the response. Your HTTP client is NOT waiting for the response on the POST socket. This matters for client implementations: you need a correlation table keyed by request id.
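The correlation table can be sketched with asyncio futures. PendingRequests and its method names are assumptions made for illustration; a real client would call resolve() from the task that reads the SSE stream.

```python
import asyncio
import itertools


class PendingRequests:
    """Correlation table: request id -> future awaiting the response."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._pending = {}

    def register(self):
        """Allocate an id and a future; call before sending the POST."""
        request_id = next(self._ids)
        future = asyncio.get_running_loop().create_future()
        self._pending[request_id] = future
        return request_id, future

    def resolve(self, message: dict) -> bool:
        """Feed one message from the SSE stream; True if it matched a request."""
        future = self._pending.pop(message.get("id"), None)
        if future is None or future.done():
            return False  # notification, unknown id, or already resolved
        if "error" in message:
            future.set_exception(RuntimeError(message["error"].get("message", "error")))
        else:
            future.set_result(message.get("result"))
        return True


async def demo():
    table = PendingRequests()
    request_id, future = table.register()
    # ... POST the request here and receive 202 Accepted ...
    matched = table.resolve({"jsonrpc": "2.0", "id": request_id, "result": {"ok": True}})
    ignored = table.resolve({"jsonrpc": "2.0", "method": "notifications/progress"})
    return matched, ignored, await future
```

The caller awaits the future rather than the POST response, which is exactly the decoupling the two-endpoint design forces.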
Streamable HTTP (2025-03-26+)
The newer spec folds everything onto a single endpoint (conventionally /mcp). The client POSTs each message there. The server answers with one of two content types: application/json for a single immediate response, or text/event-stream when it wants to stream events (progress, notifications, then the response). The client can also open a GET on the same endpoint to receive server-initiated messages. A session id header ties the requests of one session together and lets a client reconnect to it.
The gzip flush trap
If you gzip the SSE stream, you must call gz.Flush() with sync-flush semantics (deflate empty block marker) after every event, then flush the HTTP writer. Without this, your event sits in the gzip buffer and the client waits forever. Full-flush (which also resets the compression dictionary) works but costs compression ratio. Sync-flush is the right primitive.
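The same idea in Python, as a sketch built on the standard zlib module rather than any particular web framework. make_gzip_event_writer is an illustrative name; a real server would write the returned bytes to the response and then flush the HTTP writer.

```python
import zlib


def make_gzip_event_writer():
    """Return a function that compresses one SSE event and sync-flushes it."""
    # wbits=31 selects the gzip container (16 + 15-bit window).
    compressor = zlib.compressobj(wbits=31)

    def write_event(data: str) -> bytes:
        event = f"event: message\ndata: {data}\n\n".encode("utf-8")
        # Z_SYNC_FLUSH emits everything buffered so far, aligned to a byte
        # boundary, while keeping the compression dictionary.
        return compressor.compress(event) + compressor.flush(zlib.Z_SYNC_FLUSH)

    return write_event


write_event = make_gzip_event_writer()
chunk = write_event('{"jsonrpc":"2.0","id":1,"result":{}}')

# A streaming decompressor can recover the whole event from this chunk alone.
decoded = zlib.decompressobj(wbits=31).decompress(chunk).decode("utf-8")
```

Without the flush call, compress() alone may return few or no bytes for a small event, which is the "client waits forever" symptom.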
The Message Pump
Inside a working stdio server there's a specific architecture that everyone arrives at. A single reader, a dispatcher, parallel handlers, and a serialized writer. Miss any piece and you get a subtle bug.
┌──────────────────────────────────────────────────────────────────────┐
│                          MCP SERVER PROCESS                          │
├──────────────────────────────────────────────────────────────────────┤
│                                                                      │
│    stdin                                                stdout       │
│      │                                                     ▲         │
│      ▼                                                     │         │
│  ┌────────┐    ┌──────────┐   ┌────────────┐         ┌───────────┐   │
│  │  READ  │───▶│  PARSE   │──▶│  DISPATCH  │───┐     │  ENCODE   │   │
│  │  loop  │    │ json-rpc │   │   router   │   │     │  +mutex   │   │
│  └────────┘    └──────────┘   └─────┬──────┘   │     └─────▲─────┘   │
│                                     │          │           │         │
│                       notifications │          │           │         │
│                  (cancel, progress) │          │           │         │
│                                     ▼          │           │         │
│                            ┌─────────────────┐ │           │         │
│                            │ cancelRegistry  │ │           │         │
│                            └─────────────────┘ │           │         │
│                                                ▼           │         │
│                             ┌─────────────────────┐        │         │
│                             │    HANDLER POOL     │        │         │
│                             │   ┌──────────────┐  │        │         │
│                             │   │  tools/call  │──┼────────┤         │
│                             │   ├──────────────┤  │        │         │
│                             │   │  tools/call  │──┼────────┤         │
│                             │   ├──────────────┤  │        │         │
│                             │   │  resources/  │──┼────────┘         │
│                             │   │  read        │  │                  │
│                             │   └──────────────┘  │                  │
│                             └─────────────────────┘                  │
│                                                                      │
└──────────────────────────────────────────────────────────────────────┘
Why each piece exists
| Component | Why | Bug if missing |
| Single reader | stdin is a byte stream. Only one goroutine / task should read it. Multiple readers = interleaved bytes, corrupted JSON. | Random parse errors |
| Dispatch | Tool calls can block for seconds. If the handler runs in the read loop, no other message arrives until it returns. | Cancellation deadlock, no concurrent tool calls |
| Handler pool | Multiple tool calls can be in flight simultaneously. Each runs in its own goroutine / task with its own context. | Serialized throughput, one slow call blocks all |
| cancelRegistry | Maps request id to a cancel function. The read loop uses it to cancel in-flight handlers when a notification arrives. | Cancellation silently no-ops |
| Write mutex | Multiple handlers may finish concurrently. Writes to stdout must be serialized so JSON objects don't interleave mid-byte. | Corrupted output stream |
| Notification fast path | Notifications (cancel, progress) don't go through dispatch. Read loop handles them inline or via a short-lived task. | Slow cancel response |
FastMCP's version of this
FastMCP uses Python's asyncio. The "single reader" is one coroutine awaiting stdin.readline(). "Dispatch" is asyncio.create_task() on the handler coroutine. The write mutex is an asyncio.Lock(). The cancelRegistry is implicit in asyncio task cancellation: when a notifications/cancelled arrives, FastMCP calls task.cancel() on the matching handler task, which raises CancelledError at the next await point inside your tool.
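The architecture can be sketched in a few dozen lines of asyncio. This is not FastMCP's actual code; MessagePump, its attribute names, and the in-memory line source are all assumptions made for illustration, and error responses are omitted for brevity.

```python
import asyncio
import json


class MessagePump:
    def __init__(self, handlers, write_line):
        self.handlers = handlers        # method name -> async function(params)
        self.write_line = write_line    # sync callable taking one encoded line
        self.write_lock = asyncio.Lock()
        self.in_flight = {}             # request id -> running handler task

    async def send(self, message):
        # Serialized writer: one complete line at a time.
        async with self.write_lock:
            self.write_line(json.dumps(message, separators=(",", ":")) + "\n")

    async def run_handler(self, request_id, method, params):
        result = await self.handlers[method](params)
        await self.send({"jsonrpc": "2.0", "id": request_id, "result": result})

    async def serve(self, lines):
        # Single reader: this loop never awaits a handler directly.
        async for line in lines:
            message = json.loads(line)
            if message.get("method") == "notifications/cancelled":
                task = self.in_flight.get(message["params"]["requestId"])
                if task is not None:
                    task.cancel()
            elif "id" in message:
                request_id = message["id"]
                task = asyncio.create_task(
                    self.run_handler(request_id, message["method"], message.get("params"))
                )
                self.in_flight[request_id] = task
                # Deregister on every exit path, including cancellation.
                task.add_done_callback(
                    lambda _t, rid=request_id: self.in_flight.pop(rid, None)
                )
        await asyncio.gather(*list(self.in_flight.values()), return_exceptions=True)


async def demo():
    async def slow(params):
        await asyncio.sleep(60)
        return "never"

    async def fast(params):
        return "done"

    async def incoming():
        yield '{"jsonrpc":"2.0","id":1,"method":"slow"}'
        yield '{"jsonrpc":"2.0","id":2,"method":"fast"}'
        await asyncio.sleep(0.01)  # let the handler tasks start
        yield '{"jsonrpc":"2.0","method":"notifications/cancelled","params":{"requestId":1}}'

    written = []
    pump = MessagePump({"slow": slow, "fast": fast}, written.append)
    await pump.serve(incoming())
    return [json.loads(line) for line in written], pump.in_flight
```

In the demo the fast call answers while the slow one is still pending, and the cancel notification reaches the slow handler because the read loop was never blocked on it. Deregistration uses a done callback so the registry is cleaned up even when a task is cancelled before it starts running.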
Request Lifecycle Trace
One tool call. From the moment the model emits it to the moment the result lands back in its context. Every step and every place it can fail.
- Model emits tool_use. Inside the model's response, a structured tool_use block appears: function name, arguments. The host (Claude Desktop, your IDE plugin, your custom app) intercepts the block before rendering to the user.
- Host routes to matching client. The host has N MCP clients (one per configured server). It looks up which client owns the tool by name. If two servers expose tools with the same name, the host has to disambiguate. Claude Desktop prefixes tool names with server name to avoid collision.
- Client builds JSON-RPC envelope. Wraps as {"jsonrpc":"2.0","id":N,"method":"tools/call","params":{"name":"...","arguments":{...}}}. The id is typically a monotonically increasing integer per client, though the protocol also allows string ids.
- Client writes to transport. For stdio: writes bytes + newline to the server's stdin pipe, flushes. For SSE: POSTs to /messages. For Streamable HTTP: POSTs to /mcp and starts reading the response.
- Server read loop wakes. The stdin read blocks until data arrives. When the client writes, the read loop wakes, accumulates bytes until the newline, parses the JSON-RPC envelope.
- Dispatch decision. Read loop checks the method. Is it a notification? Handle inline (e.g., notifications/cancelled). Is it a request? Build a handler task, register its cancel function in cancelRegistry[id], start the task, continue reading.
- Handler parses params. The handler validates the arguments against the tool's inputSchema. In FastMCP, this is the Pydantic model generated from your function signature. Invalid args = InvalidParams response (-32602).
- Tool function runs. Your code executes. Makes HTTP requests, queries databases, reads files. Uses the context parameter for cancellation, logging, progress reporting. This is the slow part, anywhere from milliseconds to seconds.
- Tool returns content blocks. Your function returns. The handler wraps the return value in the MCP envelope: {content: [{type:"text", text:"..."}], isError: false}. Exceptions raised by the tool become isError:true with the exception message. Protocol-level failures, such as an unknown tool name, become JSON-RPC errors instead.
- Encode + write mutex. Handler acquires the write mutex, serializes the full JSON-RPC response, writes to stdout with a newline, flushes, releases the mutex, deregisters from cancelRegistry.
- Client read loop matches response. The client's read loop parses the response, looks up id N in its pending-request table, resolves the awaiting caller with the result. The caller is a promise / future in the host.
- Host appends to conversation. The host takes the result's content blocks and appends them to the model's conversation as a tool_result message. This becomes part of the context for the next turn.
- Model continues reasoning. Next turn: the model sees the tool_result in its context. It can now reason about the result, call more tools, respond to the user, whatever the agent loop decides.
Init State Machine
An MCP connection is a state machine with four states. Most servers don't enforce it explicitly, which means they quietly accept malformed init sequences and then break in weird ways later. Understanding the state machine lets you debug "my server responds to initialize but not to tools/list" kind of bugs.
┌───────────────┐
│   CONNECTED   │   socket open, no messages exchanged yet
└───────┬───────┘
        │
        │ client sends initialize
        ▼
┌───────────────┐
│ INITIALIZING  │   server processes initialize, returns capabilities
└───────┬───────┘   ONLY legal inbound method: initialize
        │
        │ server responds
        │ client sends notifications/initialized
        ▼
┌───────────────┐
│     READY     │   all capability methods legal (tools/*, resources/*, prompts/*)
└───────┬───────┘   notifications/*: cancel, progress, list_changed, updated
        │
        │ transport closes OR fatal error
        ▼
┌───────────────┐
│   SHUTDOWN    │   connection closed, no further messages
└───────────────┘
Legal operations by state
| State | Legal inbound | Error on illegal |
| CONNECTED | initialize | -32600 invalid request |
| INITIALIZING | (waiting for server response) | client should not send more requests |
| READY | all declared capabilities + ping | -32601 method not found if tool/resource/prompt missing |
| SHUTDOWN | (none) | transport closed |
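A server that wants to enforce the state machine explicitly can gate every inbound method through a small check. The sketch below is an assumption about how one might do it, not code from any SDK; Session and check are illustrative names, and it reuses -32600 from the table for every out-of-order message.

```python
from enum import Enum


class State(Enum):
    CONNECTED = "connected"
    INITIALIZING = "initializing"
    READY = "ready"
    SHUTDOWN = "shutdown"


class Session:
    """Tracks the connection state and rejects out-of-order methods."""

    def __init__(self):
        self.state = State.CONNECTED

    def check(self, method: str):
        """Return None if the method is legal now, else a JSON-RPC error object."""
        if self.state is State.CONNECTED and method == "initialize":
            self.state = State.INITIALIZING
            return None
        if self.state is State.INITIALIZING and method == "notifications/initialized":
            self.state = State.READY
            return None
        if self.state is State.READY and method != "initialize":
            return None
        return {
            "code": -32600,
            "message": f"{method} not allowed in state {self.state.value}",
        }
```

Calling check("tools/list") on a fresh Session returns the error object, which makes the "responds to initialize but not to tools/list" class of bug visible at the point where the sequence went wrong.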
Capability negotiation inside initialize
The initialize exchange is where client and server agree on what the server actually supports. The client advertises what it can consume (roots, sampling), the server advertises what it provides (tools, resources, prompts). Each capability can have sub-flags:
server initialize response
{
"capabilities": {
"tools": { "listChanged": true },
"resources": {
"subscribe": true,
"listChanged": true
},
"prompts": { "listChanged": false },
"logging": {},
"experimental": {
"wrapster.concurrent_dispatch": true,
"wrapster.output_handles": true
}
}
}
The presence of a key means "this server supports the feature." The sub-flags control optional behaviors. listChanged: true means the server will send notifications/tools/list_changed when its tool set changes. subscribe: true means resources support per-URI subscription. The experimental block is the escape hatch for non-standard extensions (which is where wrapster advertises its custom transport features).
The asymmetry that confuses people
Both sides can advertise capabilities but they mean different things. Client capabilities describe what the server can ask the client to do (sampling: the server can ask the client to make an LLM call on its behalf; roots: the server can query the client's filesystem boundaries). Server capabilities describe what tools/resources/prompts the server offers. They're not symmetric.
Cancellation Mechanics
The most common question after "how does tools/call work" is "how does cancel work" because every time someone tries to cancel a long-running tool call, it does nothing and they don't know why. Here is the actual mechanism.
The full flow
CLIENT                                        SERVER
│                                                │
│ tools/call {id: 42}                            │
│ ──────────────────────────────────────────────▶│
│                                                │
│                                                │  read loop parses, dispatches
│                                                │  cancelRegistry[42] = cancel_fn
│                                                │  handler task starts
│                                                │  handler awaits HTTP call to Jira
│                                                │
│ user hits Escape                               │
│                                                │
│ notifications/cancelled {requestId: 42}        │
│ ──────────────────────────────────────────────▶│
│                                                │
│                                                │  read loop sees notification
│                                                │  looks up cancelRegistry[42]
│                                                │  calls cancel_fn()
│                                                │  handler's context becomes cancelled
│                                                │  HTTP client aborts the in-flight request
│                                                │  handler raises CancelledError
│                                                │  handler catches, returns isError:true
│                                                │  del cancelRegistry[42]
│                                                │
│ response {id: 42, error: "cancelled"}          │
│ ◀──────────────────────────────────────────────│
Why it fails
| Bug | Cause | Fix |
| Cancel does nothing | Handler runs in read loop, so the notification is stuck behind it in the input queue | Dispatch handlers to separate tasks/goroutines so read loop stays free |
| Cancel races, handler already done | Notification arrives after handler deregistered. Registry lookup fails silently. | Not actually a bug. The response was already on its way. |
| Tool doesn't stop | Tool code doesn't use the context for cancellation. It's doing blocking work that can't be interrupted. | Pass context through to HTTP calls, subprocess.run, database queries |
| Registry grows forever | Handler never deregisters (exception path, forgot finally block) | Use try/finally around registration. Memory leak if you don't. |
FastMCP cancellation specifically
In FastMCP, cancellation uses asyncio task cancellation. When notifications/cancelled arrives, FastMCP calls .cancel() on the matching handler task. This raises asyncio.CancelledError at the next await point inside your handler. For this to actually terminate your tool, your tool must:
- Use async libraries (aiohttp not requests, asyncpg not psycopg2)
- Actually hit an await point periodically. A tight CPU loop never yields and never sees the cancel.
- Not swallow CancelledError in a bare except or a broad except BaseException. That's a bug that defeats cancellation entirely.
The swallowed cancel bug
The single worst thing you can do in a tool handler is swallow the cancel. On Python 3.8+, asyncio.CancelledError inherits from BaseException, so try: ... except Exception: return "failed" does NOT catch it. Good. But on Python 3.7 and earlier it was an Exception subclass, and on any version a bare except: or except BaseException: catches it. In those cases you swallow the cancel and the tool runs to completion while the client thinks it was cancelled. Always re-raise CancelledError explicitly if you catch broadly.
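A handler that catches broadly but still honors cancellation might look like the following sketch. call_upstream stands in for whatever slow awaitable the tool really uses; both function names are hypothetical.

```python
import asyncio


async def call_upstream():
    await asyncio.sleep(60)  # stand-in for a slow HTTP or database call
    return "ok"


async def tool_handler():
    try:
        return await call_upstream()
    except asyncio.CancelledError:
        # Clean up here if needed, then let the cancellation propagate.
        raise
    except Exception as exc:
        # Ordinary failures become an error string for the model.
        return f"failed: {exc}"


async def demo():
    task = asyncio.create_task(tool_handler())
    await asyncio.sleep(0)   # let the handler reach its first await
    task.cancel()            # what the server does on notifications/cancelled
    try:
        await task
    except asyncio.CancelledError:
        pass
    return task.cancelled()
```

The explicit CancelledError clause is redundant on Python 3.8+ when the broad clause is except Exception, but it documents intent and keeps the handler correct if someone later widens the catch.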
Schema Generation
What happens between @mcp.tool() and the JSON Schema that shows up in tools/list. The pipeline has four stages and each one can surprise you.
┌──────────────────┐     ┌──────────────────┐     ┌──────────────────┐     ┌──────────────────┐
│   Python func    │────▶│  Pydantic model  │────▶│   JSON Schema    │────▶│  Tool envelope   │
│                  │     │                  │     │                  │     │                  │
│  type hints      │     │  field defs      │     │  properties      │     │  name            │
│  defaults        │     │  validators      │     │  required        │     │  description     │
│  docstring       │     │  descriptions    │     │  types           │     │  inputSchema     │
└──────────────────┘     └──────────────────┘     └──────────────────┘     └──────────────────┘
  inspect                  create_model()           model_json_schema()      wrap as MCP tool
  docstring_parser         Field(..., description)  $defs resolution
Stage 1: Python function inspection
source
@mcp.tool()
def search_jira(
project: str,
jql: str,
max_results: int = 50,
include_subtasks: bool = False,
) -> list[dict]:
"""Search Jira issues using JQL.
Args:
project: Jira project key, e.g. "PROJ"
jql: JQL query string
max_results: Maximum number of results
include_subtasks: Whether to include subtask issues
Returns:
List of issue dicts with key, summary, status
"""
...
FastMCP calls inspect.signature() on your function. Extracts parameter names, type annotations, default values. Parses the docstring via docstring-parser (supports Google, NumPy, Sphinx styles) to pull per-parameter descriptions and the top-level summary.
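To make the inspection step concrete, here is a hand-rolled approximation that uses only the standard inspect module. It is not FastMCP's pipeline: it handles primitive annotations only, skips docstring parameter parsing, and the names JSON_TYPES and sketch_input_schema are invented for this example.

```python
import inspect

JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def sketch_input_schema(func):
    """Derive a rough input schema from a function signature (primitives only)."""
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        prop = {"type": JSON_TYPES.get(param.annotation, "object")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default means the caller must supply it
        else:
            prop["default"] = param.default
        properties[name] = prop
    return {"type": "object", "properties": properties, "required": required}


def search_jira(project: str, jql: str, max_results: int = 50,
                include_subtasks: bool = False) -> list:
    """Search Jira issues using JQL."""
    ...


schema = sketch_input_schema(search_jira)
summary = inspect.getdoc(search_jira).splitlines()[0]
```

Parameters without defaults land in required, and the first docstring line is what a summary-based description would pick up.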
Stage 2: Pydantic model synthesis
FastMCP dynamically builds a Pydantic model from the inspected signature. Roughly equivalent to:
generated model
class SearchJiraInput(BaseModel):
project: str = Field(..., description="Jira project key, e.g. PROJ")
jql: str = Field(..., description="JQL query string")
max_results: int = Field(50, description="Maximum number of results")
include_subtasks: bool = Field(False, description="Whether to include subtask issues")
Stage 3: JSON Schema generation
Pydantic's model_json_schema() walks the model and emits JSON Schema (Pydantic v2 targets draft 2020-12, which is where $defs comes from). Primitive types map directly. Optional[X] becomes anyOf: [{type: X}, {type: "null"}], and the field drops out of required only when it has a default. Literal["a", "b"] becomes {enum: ["a", "b"]}. Nested Pydantic models become $ref to a $defs entry.
resulting schema
{
"type": "object",
"properties": {
"project": {
"type": "string",
"description": "Jira project key, e.g. PROJ",
"title": "Project"
},
"jql": {
"type": "string",
"description": "JQL query string",
"title": "Jql"
},
"max_results": {
"type": "integer",
"description": "Maximum number of results",
"default": 50,
"title": "Max Results"
},
"include_subtasks": {
"type": "boolean",
"description": "Whether to include subtask issues",
"default": false,
"title": "Include Subtasks"
}
},
"required": ["project", "jql"],
"title": "SearchJiraInput"
}
The title noise problem
Pydantic adds title fields to every property and to the root. The model doesn't use these and they waste tokens in the tool definition that gets rendered into the system prompt. At 30 tools with 5 parameters each, that is 150 property titles plus 30 root titles of pure noise. FastMCP has options to strip them; worth enabling in production.
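If the framework in use does not strip titles for you, a small post-processing pass over the generated schema does the job. This is a generic sketch, not a FastMCP API; strip_titles is an illustrative name, and it takes care not to delete a property that happens to be named title.

```python
def strip_titles(schema):
    """Return a copy of a JSON Schema with the "title" annotations removed."""
    if isinstance(schema, list):
        return [strip_titles(item) for item in schema]
    if not isinstance(schema, dict):
        return schema
    out = {}
    for key, value in schema.items():
        if key == "title" and isinstance(value, str):
            continue  # annotation only, safe to drop
        if key in ("properties", "$defs") and isinstance(value, dict):
            # Keys at this level are user-chosen names, not schema keywords.
            out[key] = {name: strip_titles(sub) for name, sub in value.items()}
        elif key in ("default", "const", "enum", "examples"):
            out[key] = value  # literal data, leave untouched
        else:
            out[key] = strip_titles(value)
    return out


noisy = {
    "title": "SearchJiraInput",
    "type": "object",
    "properties": {
        "title": {"type": "string", "title": "Title"},
        "max_results": {"type": "integer", "default": 50, "title": "Max Results"},
    },
    "required": ["title"],
}
clean = strip_titles(noisy)
```

The pass is purely cosmetic for validation purposes, since title is an annotation keyword and does not affect which inputs are accepted.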
Stage 4: Tool envelope
Finally FastMCP wraps the schema with name and description to form the tool definition returned by tools/list. The tool name comes from the function name (or an explicit override). The description comes from the first line of the docstring (the summary), not the full docstring.
Things the inference does poorly
| Python construct | What you get | Why it's bad |
| Union[A, B] | anyOf: [{$ref: A}, {$ref: B}] | Models struggle to decide which branch to construct |
| Nested models | $ref into $defs | Some clients don't resolve refs, some strip them |
| Recursive types | self-referential $ref | Infinite expansion in naive renderers |
| Custom validators | not reflected in schema | Validation happens at runtime, model can't see the constraint |
| Sphinx docstrings | descriptions can be silently dropped | Docstring style detection depends on how the parser is invoked; confirm parameter descriptions actually appear in tools/list |
For tools called frequently or where schema clarity matters, bypass inference and write the inputSchema by hand. FastMCP accepts an explicit schema parameter on the decorator.
Tools as System Prompt
Here is the thing nobody tells you. When a client calls tools/list and gets back your tool definitions, the host doesn't just hand them to the model as "available functions." The host converts them into text and prepends that text to the model's system prompt. Your tool definitions ARE the prompt. Every tool description, every property description, every enum value is load-bearing prompt engineering.
What the model actually sees
The exact format depends on the host and the model, but the structure is approximately:
approximate system prompt injection
You have access to the following tools. Use them when appropriate
to help the user. Each tool has a name, a description, and
parameters you must provide.
<tool>
<name>search_jira</name>
<description>Search Jira issues using JQL.</description>
<parameters>
project (string, required): Jira project key, e.g. PROJ
jql (string, required): JQL query string
max_results (integer, optional, default 50): Maximum number of results
include_subtasks (boolean, optional, default false): Whether to include subtask issues
</parameters>
</tool>
<tool>
<name>get_issue</name>
<description>Retrieve a single Jira issue by key.</description>
<parameters>
...
</parameters>
</tool>
... (repeat for all N tools) ...
This block is prepended to EVERY turn in the conversation. Not once at session start. Every single turn. The cost compounds.
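A renderer for the illustrative format above can be sketched in a few lines. Real hosts use their own formats, so treat render_tool as a hypothetical stand-in that is only useful for estimating how much text a tool definition turns into.

```python
import json


def render_tool(tool: dict) -> str:
    """Render one tools/list entry into the illustrative prompt format."""
    schema = tool.get("inputSchema", {})
    required = set(schema.get("required", []))
    lines = [
        "<tool>",
        f"<name>{tool['name']}</name>",
        f"<description>{tool.get('description', '')}</description>",
        "<parameters>",
    ]
    for name, prop in schema.get("properties", {}).items():
        facts = [prop.get("type", "any"), "required" if name in required else "optional"]
        if "default" in prop:
            # json.dumps keeps JSON spelling, e.g. false rather than False.
            facts.append(f"default {json.dumps(prop['default'])}")
        lines.append(f"{name} ({', '.join(facts)}): {prop.get('description', '')}")
    lines += ["</parameters>", "</tool>"]
    return "\n".join(lines)


rendered = render_tool({
    "name": "search_jira",
    "description": "Search Jira issues using JQL.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "project": {"type": "string", "description": "Jira project key, e.g. PROJ"},
            "include_subtasks": {
                "type": "boolean",
                "default": False,
                "description": "Whether to include subtask issues",
            },
        },
        "required": ["project"],
    },
})
```

Running this over every tool a server exposes and counting the characters gives a quick first estimate of the tools block size.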
Token cost estimation
Rough math. A tool definition with a 2-sentence description and 5 parameters with moderate descriptions is about 150-250 tokens. For a moderate MCP server:
| Tool count | Avg tokens/tool | Tools block size | 10-turn conversation cost |
| 10 | 200 | 2,000 tok | 20,000 tok |
| 20 | 200 | 4,000 tok | 40,000 tok |
| 30 | 250 | 7,500 tok | 75,000 tok |
| 50 | 250 | 12,500 tok | 125,000 tok |
For an Atlassian server with a rich API surface (Jira + Confluence + user management + attachments), 30+ tools is realistic. That's a nontrivial chunk of the model's context window spent on tool definitions before any actual work happens.
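The arithmetic behind the table, as a tiny helper. It assumes the full tools block is resent on every turn and ignores any prompt caching a host may apply; tools_block_cost is an illustrative name.

```python
def tools_block_cost(tool_count: int, avg_tokens_per_tool: int, turns: int = 10):
    """Return (tokens per turn, tokens over the whole conversation)."""
    block = tool_count * avg_tokens_per_tool
    return block, block * turns


per_turn, per_conversation = tools_block_cost(30, 250)
```

For the 30-tool row this gives 7,500 tokens per turn and 75,000 over ten turns, matching the table.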
Implications for writing tools
- Descriptions are prompts. Write "Use this when the user wants to..." not "This function does..."
- Tool names matter. The model uses them as function names. search_jira vs query_atlassian_issues changes how the model thinks about the tool.
- Property descriptions explain semantics, not types. "Jira project key, e.g. PROJ" not "A string value."
- Enums are self-documenting. enum: ["open", "in_progress", "closed"] tells the model the legal values without English explanation.
- Required fields force the model's hand. Required = "must construct." Optional = "include if relevant."
- Audit for bloat. Every word in a description costs tokens every turn. Long Pydantic Field descriptions that nobody needed add up fast.
The path forward: dynamic tool filtering
Some hosts (and MCP spec work) are moving toward per-turn tool filtering. The model picks a category first, and only the relevant tools are injected for that turn. For an Atlassian server, this could mean "Jira mode" vs "Confluence mode" with different tool subsets. Not universally supported yet but worth watching -- it's the only real fix for the tool-count-vs-context-cost tradeoff.
Content Block Anatomy
The return value of a tool call is {content: [ContentBlock], isError: bool}. That content array is more interesting than people realize. You can return multiple blocks, mix types, and use the resource block type to avoid inlining content at all.
The four types
text block
{
"type": "text",
"text": "Hello world. This is plain text that lands in the context."
}
image block
{
"type": "image",
"data": "iVBORw0KGgoAAAANSUhEUgAA...", // base64-encoded bytes
"mimeType": "image/png"
}
resource block (embedded)
{
"type": "resource",
"resource": {
"uri": "confluence://TEAM/architecture-overview",
"mimeType": "text/markdown",
"text": "# Architecture Overview\n\n..."
}
}
audio block (2025-03-26+)
{
"type": "audio",
"data": "UklGRiQAAABXQVZF...",
"mimeType": "audio/wav"
}
Size tradeoffs
| Block type | Wire overhead | Context cost | Notes |
| text | minimal (json escape) | ~1 token per 4 chars | Best for small-to-medium content. 40KB = ~10K tokens. |
| image | +33% (base64) | model-specific vision tokens | Claude vision: ~1.5k tokens per 1024x1024 image regardless of file size |
| resource (inline) | minimal | same as text | Same cost as text but semantically addressable by uri |
| resource (reference) | near zero | near zero until fetched | Return just the uri, client fetches on demand via resources/read |
The multi-block pattern
You can return multiple blocks in one tool result. This is underused. Example: a Jira search tool that returns a summary text block plus individual resource references for each matching issue.
multi-block jira search result
{
"content": [
{
"type": "text",
"text": "Found 3 matching issues:"
},
{
"type": "resource",
"resource": {
"uri": "jira://PROJ-123",
"text": "PROJ-123: Login button broken - Status: Open - Assignee: Alice"
}
},
{
"type": "resource",
"resource": {
"uri": "jira://PROJ-124",
"text": "PROJ-124: Session expires - Status: In Progress - Assignee: Bob"
}
},
{
"type": "resource",
"resource": {
"uri": "jira://PROJ-125",
"text": "PROJ-125: 2FA errors - Status: Open - Assignee: Alice"
}
}
],
"isError": false
}
Advantages of this shape: the model sees a compact summary plus individually addressable items. If it wants full details for one issue, the host can fetch that uri via resources/read instead of the server having pre-inlined 50KB of issue data for every match.
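Building that result shape from a list of issues is mechanical. The sketch below assumes each issue is a dict with key, summary, status and assignee fields; that shape and the function name jira_search_result are assumptions for illustration.

```python
def jira_search_result(issues: list) -> dict:
    """Wrap search hits as one summary text block plus one resource block per issue."""
    content = [{"type": "text", "text": f"Found {len(issues)} matching issues:"}]
    for issue in issues:
        line = (
            f"{issue['key']}: {issue['summary']}"
            f" - Status: {issue['status']} - Assignee: {issue['assignee']}"
        )
        content.append({
            "type": "resource",
            "resource": {"uri": f"jira://{issue['key']}", "text": line},
        })
    return {"content": content, "isError": False}


result = jira_search_result([
    {"key": "PROJ-123", "summary": "Login button broken", "status": "Open", "assignee": "Alice"},
    {"key": "PROJ-124", "summary": "Session expires", "status": "In Progress", "assignee": "Bob"},
])
```

Each resource block stays one line long, so the result grows linearly with the number of hits rather than with the size of each issue.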
Text Conversion (ADF · HTML · MD)
Atlassian is a useful case study here because the formats in play are a microcosm of every rich-text integration problem: a native tree structure, a legacy markup variant, and Markdown as the model's lingua franca. Round-tripping between them is harder than it looks because each format has features the others don't.
The formats
| Format | Used by | Shape | Complexity |
| Jira REST v2 | Jira Server / Data Center | Plain strings, wiki markup in description/comment bodies | Flat: bolded fields, minimal structure |
| Storage Format | Confluence Server / Data Center (and Cloud) page bodies | XHTML with ac: and ri: namespaces | Custom macros, layout cells, structured data |
| Wiki Markup | Legacy Confluence, Jira v2 rich text fields | Confluence wiki syntax (h1., {code}, etc) | Still the common case on Server/DC |
| ADF | Jira Cloud (REST v3), Confluence Cloud (newer) | JSON tree of typed nodes | Rich: marks, tables, layouts, extensions |
| Markdown | Model input/output | CommonMark + GFM extensions | Lowest common denominator |
Server/DC vs Cloud note.
Jira Server and Data Center use REST API v2 and never emit ADF. Description and comment bodies come back either as plain text or as Confluence-style wiki markup, depending on the renderer query. ADF is Cloud-only and only on the v3 endpoint. The tree-walk pattern below still applies though: storage format is also a tree (XHTML), wiki markup parses into one, and all three formats hit the same class of round-trip hazards.
ADF as a tree
adf document
{
"type": "doc",
"version": 1,
"content": [
{
"type": "paragraph",
"content": [
{ "type": "text", "text": "Hello " },
{
"type": "text",
"text": "world",
"marks": [{ "type": "strong" }]
}
]
},
{
"type": "bulletList",
"content": [
{
"type": "listItem",
"content": [
{ "type": "paragraph", "content": [{ "type": "text", "text": "first" }] }
]
}
]
},
{
"type": "codeBlock",
"attrs": { "language": "python" },
"content": [{ "type": "text", "text": "print('hi')" }]
}
]
}
Conversion is a tree walk. For each node type, emit the Markdown equivalent. Text nodes with marks become inline formatting. Lists and code blocks become block-level constructs. The algorithm is straightforward until you hit the edge cases.
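A deliberately small version of that tree walk, covering only the node types in the example document. It is a sketch, not a complete converter: nested lists, tables, links, and escaping are left out, and adf_to_markdown is an illustrative name.

```python
FENCE = "`" * 3  # code fence marker, built here to keep the listing readable
MARKS = {"strong": "**", "em": "_", "code": "`"}


def adf_to_markdown(node: dict) -> str:
    """Recursively convert a small subset of ADF into Markdown."""
    kind = node.get("type")
    children = node.get("content", [])
    if kind == "text":
        text = node["text"]
        for mark in node.get("marks", []):
            wrap = MARKS.get(mark["type"], "")  # unknown marks are ignored
            text = f"{wrap}{text}{wrap}"
        return text
    if kind == "doc":
        return "\n\n".join(adf_to_markdown(child) for child in children)
    if kind == "bulletList":
        return "\n".join("- " + adf_to_markdown(item) for item in children)
    if kind == "listItem":
        return "\n".join(adf_to_markdown(child) for child in children)
    if kind == "codeBlock":
        language = node.get("attrs", {}).get("language", "")
        code = "".join(child["text"] for child in children)
        return f"{FENCE}{language}\n{code}\n{FENCE}"
    # paragraph and unknown node types: concatenate children, never drop text
    return "".join(adf_to_markdown(child) for child in children)


doc = {"type": "doc", "version": 1, "content": [
    {"type": "paragraph", "content": [
        {"type": "text", "text": "Hello "},
        {"type": "text", "text": "world", "marks": [{"type": "strong"}]},
    ]},
    {"type": "bulletList", "content": [
        {"type": "listItem", "content": [
            {"type": "paragraph", "content": [{"type": "text", "text": "first"}]},
        ]},
    ]},
    {"type": "codeBlock", "attrs": {"language": "python"},
     "content": [{"type": "text", "text": "print('hi')"}]},
]}
markdown = adf_to_markdown(doc)
```

Falling back to the children for unknown node types is a design choice: the output may lose structure, but it does not silently lose text.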
Hard cases that break naive converters
| Case | Why it breaks | Mitigation |
| Nested marks (bold + italic + link) | Markdown has limited nesting rules. **_text_** works in CommonMark but some parsers fail. | Emit strict order: link outermost, then strong, then em |
| Tables with block content | GFM tables don't support paragraphs or lists inside cells | Flatten cell content to single-line, or emit as HTML table |
| Code containing backticks | Single-backtick inline code can't contain backticks | Use variable-length fencing: wrap in enough backticks to avoid collision |
| Confluence panels (info, warning) | No Markdown equivalent | Emit blockquote with prefix: > [INFO] ... |
| Mentions (@user) | ADF has structured mention nodes with accountId | Resolve via user cache, emit @displayName |
| Attachments / embedded media | ri:attachment references are context-sensitive | Resolve to absolute URL or keep as MCP resource URI |
| Layout cells (multi-column) | Markdown is single-column | Drop layout, concatenate content vertically |
| Status pills, dates, emojis | Structured nodes with attrs, no Markdown form | Render as text with emoji or bracketed label |
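The variable-length fencing mitigation from the table can be sketched as follows, based on the CommonMark rule that a code span may be delimited by any number of backticks as long as the content contains no run of the same length. inline_code is an illustrative name.

```python
import re


def inline_code(text: str) -> str:
    """Wrap text in an inline code span whose fence cannot collide with it."""
    runs = re.findall(r"`+", text)
    fence = "`" * (max(map(len, runs), default=0) + 1)
    # CommonMark strips one leading and one trailing space, which lets content
    # that begins or ends with a backtick sit next to the fence.
    pad = " " if text.startswith("`") or text.endswith("`") else ""
    return f"{fence}{pad}{text}{pad}{fence}"
```

Text without backticks still gets the ordinary single-backtick form, so the common case is unchanged.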
The round-trip problem
You have two conversion directions:
read path (server → model)
Atlassian API → ADF / Storage → Markdown → model context
// This direction is lossy on purpose. Drop features the model can't use.
// Information loss is fine: the model never writes them back.
write path (model → server)
model output → Markdown → ADF / Storage → Atlassian API
// This direction synthesizes structure the model never saw.
// You're upgrading MD to ADF. Only the features in MD are preserved.
The problem is not MD → ADF → MD (stable, lossless for the MD subset). The problem is ADF → MD → ADF, which is lossy. If the model reads a page with panels and layouts, edits one paragraph, and writes it back, the panels and layouts are GONE.
Strategies for the round-trip loss
- Read-only for rich features. Convert ADF → MD on read, but for writes require the model to use a patch tool that updates specific sections rather than rewriting the whole page. The rich features stay because you never touched them.
- Extended MD syntax. Invent custom syntax for panels (::: info) and round-trip it. Works if you control both sides.
- Preserve-and-restore. On read, strip non-MD features and store a sidecar diff. On write, re-apply the sidecar. Complex but lossless.
Debugging approach for conversion bugs
- Capture input and output verbatim. Log the exact ADF JSON you received and the exact Markdown you produced. Conversion bugs hide when you're comparing mental models instead of actual bytes.
- Build a minimal reproduction. Strip the ADF tree down until you find the smallest subtree that breaks. Almost always a single node type with specific attrs.
- Check for encoding issues. Special chars in text nodes must be escaped for Markdown: *, _, [, ], backticks. Many converters forget.
- Check whitespace normalization. ADF preserves whitespace inside text nodes literally. Markdown collapses spaces. Newlines in ADF become paragraph breaks in MD only if nested in paragraph nodes.
- Test round-trip on the MD subset. Build an ADF tree that contains only MD-expressible features. Convert to MD, convert back, diff. If this round-trip fails, your converter has a bug in the MD-native path, not in the edge cases.
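For the escaping step in the checklist above, a minimal sketch that backslash-escapes the characters named there, plus the backslash itself. It escapes unconditionally, which is safe but noisier than a context-aware escaper; escape_markdown is an illustrative name.

```python
import re

# Characters from the checklist, plus backslash so escapes stay unambiguous.
_MD_SPECIALS = re.compile(r"([\\`*_\[\]])")


def escape_markdown(text: str) -> str:
    """Backslash-escape characters that would otherwise start Markdown syntax."""
    return _MD_SPECIALS.sub(r"\\\1", text)
```

Apply it to the raw text of ADF text nodes before wrapping them in mark syntax, not to the finished Markdown, or the emphasis markers the converter emitted will be escaped too.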
Further Rabbit Holes
Things to read and experiment with to keep deepening. Each one leads to better questions.
| Topic | Why it matters | Where to start |
| FastMCP source | Understand the middleware pipeline, context object, schema generation internals | mcp/server/fastmcp/server.py in the python-sdk repo |
| JSON-RPC batching | The spec allows batched requests. Most MCP servers don't handle them. Worth knowing what breaks. | JSON-RPC 2.0 spec, batch section |
| Resource subscriptions | The underused primitive. Server pushes updates when a resource changes. Critical for live data. | MCP spec resources section, implement one in wrapster |
| Sampling (server→client LLM calls) | Reverse direction: server asks client to do an LLM inference. Enables nested agent patterns. | MCP spec sampling section |
| Roots | Client advertises filesystem/workspace boundaries to the server. Constrains what a filesystem server can touch. | MCP spec roots section |
| Streamable HTTP auth (OAuth 2.1) | The 2025-03-26 transport adds proper auth. Essential for hosted MCP servers. | MCP spec auth section |
| Claude Desktop config internals | How tools are rendered, how prompts surface as slash commands, per-server tool filtering | claude_desktop_config.json + Desktop logs |
| The _meta field | Every request can carry a _meta object. progressToken lives here. Custom metadata lives here. Underutilized. | Grep the spec for "_meta" |
| Confluence storage format reference | The XHTML-with-namespaces dialect Confluence actually stores, used by Server, DC, and Cloud page bodies: macros, layouts, attachments. Tree walk target for most Confluence MCP tools. | confluence.atlassian.com/doc/confluence-storage-format |
The meta-question to hold
Every time MCP feels magical, ask: what bytes went over the wire, and what state changed where. MCP is a thin protocol. There's no hidden layer. Every feature decomposes into JSON-RPC messages over a pipe. If you can't trace it to bytes, you don't understand it yet.
// wire · pump · lifecycle · state · cancel · schema · prompt · content · convert //