Chapter 7 · ADVANCED

Chapter 7: Model Context Protocol (MCP) Server

In the previous chapter, AI & Intelligence Integration, we taught the MarkItDown library how to use AI (like GPT-4o) to describe images.

In this final chapter, we are going to flip the script. Instead of code using AI, we are going to enable AI to use our code.

We will explore the Model Context Protocol (MCP) Server, a wrapper that turns the MarkItDown library into a tool that AI assistants (like Claude Desktop) can control directly.

The Motivation: Giving AI "Hands"

Imagine you are chatting with an AI assistant on your desktop. You have a complex Excel file, and you want to ask: "Summarize the sales trends in this file."

Usually, you have to:

  1. Open the Excel file yourself.
  2. Copy the data (losing formatting).
  3. Paste it into the chat window.

Wouldn't it be better if the AI could just reach out, read the file directly using MarkItDown, and give you the answer?

The Solution: A Universal Adapter

This is what the Model Context Protocol (MCP) does.

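Think of MCP as a USB port for AI: any tool that speaks the protocol can be plugged into any assistant that supports it.

By running MarkItDown as an MCP Server, you are effectively telling the AI: "Here is a server named markitdown. Send its convert tool a file path or URL, and it will send you back Markdown text."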

Usage: Connecting to Claude Desktop

Let's look at a real-world example of how to "install" MarkItDown into the Claude Desktop app.

Step 1: The Configuration

You don't write Python code to use the server; you configure your AI agent to launch it.

If you are using Claude Desktop, you would edit your configuration file (claude_desktop_config.json) to include this entry:

{
  "mcpServers": {
    "markitdown": {
      "command": "markitdown-mcp"
    }
  }
}
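If your configuration file already lists other servers, the entry has to be merged in rather than pasted over them. The following is a minimal sketch of that merge, not an official installer: add_mcp_server is a hypothetical helper name, and the path to claude_desktop_config.json is something you supply yourself.

```python
import json
from pathlib import Path


def add_mcp_server(config_path: str, name: str, command: str) -> dict:
    """Merge one server entry into an MCP client config, keeping existing entries.

    Hypothetical helper for illustration; it is not part of markitdown-mcp.
    """
    path = Path(config_path)
    # Start from the existing config if there is one, otherwise from scratch.
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    # setdefault keeps any servers that are already registered.
    config.setdefault("mcpServers", {})[name] = {"command": command}
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Calling add_mcp_server(your_config_path, "markitdown", "markitdown-mcp") would produce the same entry shown above while leaving any other servers in place.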

Step 2: The Experience

Once configured, when you open Claude, it will see an available tool called convert, provided by the markitdown server.

  1. You: "Can you read the file C:/data/quarterly_report.pdf?"
  2. Claude (Internal Thought): I can't read files directly, but the markitdown server gives me a convert tool. I will use it.
  3. Claude (Action): Calls the MCP Server.
  4. MarkItDown: Converts the PDF to Markdown and sends it back.
  5. Claude: "The report states that revenue is up 20%..."

How It Works Under the Hood

The magic happens in the markitdown-mcp package. It wraps the MarkItDown Orchestrator in a server that talks to the AI agent either over standard input/output or over HTTP.

Let's visualize the communication flow.

sequenceDiagram
    participant User
    participant AI as AI Agent (Claude)
    participant Server as MCP Server
    participant MD as MarkItDown Class
    User->>AI: "Read report.xlsx"
    Note over AI: Decides to use tool
    AI->>Server: Call Tool: convert("report.xlsx")
    Server->>MD: md.convert("report.xlsx")
    MD-->>Server: Returns Markdown text
    Server-->>AI: Returns Result
    AI-->>User: "Here is the summary of the report..."
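On the wire, the "Call Tool" and "Returns Result" steps are JSON-RPC 2.0 messages. The sketch below builds one request and one response so you can see their shape. The tool name convert follows this chapter's simplified code, and the request id, file URI, and Markdown text are made-up sample values.

```python
import json


def make_tool_call(request_id: int, tool: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request (JSON-RPC 2.0)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


def make_text_result(request_id: int, text: str) -> dict:
    """Build the matching success response carrying one text block."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,  # echoes the id of the request it answers
        "result": {"content": [{"type": "text", "text": text}], "isError": False},
    }


# Sample values only: the URI and the Markdown are invented for illustration.
request = make_tool_call(1, "convert", {"uri": "file:///data/report.xlsx"})
response = make_text_result(1, "## Sheet1\n| Region | Sales |")
print(json.dumps(request, indent=2))
print(json.dumps(response, indent=2))
```

The AI agent never sees Python objects from MarkItDown. It only receives the text inside the result's content list.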

Implementation: The FastMCP Wrapper

The implementation is surprisingly simple because it uses a library called FastMCP. Let's look at the code inside markitdown_mcp/__main__.py.

1. Initializing the Server

First, the code creates a server instance. This is the "Listener" that waits for commands from the AI.

from mcp.server.fastmcp import FastMCP
from markitdown import MarkItDown

# Create the MCP Server application
mcp = FastMCP("markitdown")

2. Exposing the Tool

Next, we define a function and decorate it with @mcp.tool(). This decorator tells the system: "Expose this function to the AI."

@mcp.tool()
async def convert(uri: str) -> str:
    """
    Convert a file or URL to markdown.
    """
    # 1. Initialize MarkItDown (plugins only if the environment allows them)
    md = MarkItDown(enable_plugins=check_plugins_enabled())
    
    # 2. Perform the conversion
    result = md.convert(uri)
    
    # 3. Return the text
    return result.text_content

Note: The actual code calls .convert_uri(), which expects a URI (file:, data:, http:, or https:) rather than a bare path. The simplified version above uses the standard .convert() method we learned in Chapter 1.
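To see why a decorator is enough to "expose" a function, here is a toy registry that mimics the idea. It is not how FastMCP is implemented: ToyServer and its methods are hypothetical names, and the convert function returns a placeholder string instead of running a real conversion.

```python
import asyncio
import inspect
from typing import Any, Callable, Dict, List


class ToyServer:
    """Toy stand-in for the registration idea; NOT FastMCP's implementation."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.tools: Dict[str, Callable[..., Any]] = {}

    def tool(self) -> Callable:
        def register(func: Callable) -> Callable:
            # Store the function under its own name so a client can ask for it.
            self.tools[func.__name__] = func
            return func  # the function itself is left unchanged
        return register

    def describe(self) -> List[dict]:
        # Roughly what a client learns when it asks which tools exist.
        return [
            {
                "name": name,
                "description": inspect.getdoc(func) or "",
                "parameters": list(inspect.signature(func).parameters),
            }
            for name, func in self.tools.items()
        ]

    async def call(self, name: str, **arguments: Any) -> Any:
        return await self.tools[name](**arguments)


server = ToyServer("markitdown")


@server.tool()
async def convert(uri: str) -> str:
    """Convert a file or URL to markdown."""
    return f"(markdown for {uri})"  # placeholder instead of a real conversion


print(server.describe())
print(asyncio.run(server.call("convert", uri="file:///tmp/report.pdf")))
```

The docstring and the type-hinted parameters matter: they are the only description of the tool that the AI gets to read before deciding whether to call it.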

3. Running the Loop

Finally, the script starts the server. It can run in two modes:

  1. STDIO (Standard Input/Output): The AI launches the script and talks to it via text pipes (common for Desktop apps).
  2. HTTP (SSE): The server runs on a port (like localhost:3001), and the AI talks to it over the network.

# Simplified from main()
if use_http:
    # Run as a Web Server (SSE)
    uvicorn.run(starlette_app, port=3001)
else:
    # Run as a local pipe (Default for Desktop Apps)
    mcp.run()
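The use_http value has to come from somewhere, typically command-line flags. The sketch below shows one way such a switch could be parsed. The flag names (--http, --host, --port) and the 127.0.0.1 default are assumptions for illustration, so check markitdown-mcp --help for the real options. Port 3001 is the example port used in this chapter.

```python
import argparse
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the mode switch; flag names here are illustrative assumptions."""
    parser = argparse.ArgumentParser(description="Run an MCP server")
    parser.add_argument("--http", action="store_true",
                        help="serve over HTTP instead of STDIO")
    parser.add_argument("--host", default=None, help="host to bind (HTTP mode only)")
    parser.add_argument("--port", type=int, default=None, help="port to bind (HTTP mode only)")
    args = parser.parse_args(argv)
    # Host and port mean nothing for a pipe, so reject them early.
    if not args.http and (args.host or args.port):
        parser.error("--host and --port require --http")
    return args


def describe_mode(args: argparse.Namespace) -> str:
    """Return a human-readable summary of the transport that would be used."""
    if args.http:
        return f"http://{args.host or '127.0.0.1'}:{args.port or 3001}"
    return "stdio"


print(describe_mode(parse_args([])))
print(describe_mode(parse_args(["--http", "--port", "8080"])))
```

Defaulting to STDIO keeps the desktop case simple: the AI app launches the command with no arguments and no port is opened.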

The StreamableHTTPSessionManager

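If running in HTTP mode, the server uses Server-Sent Events (SSE). The AI agent opens one long-lived HTTP connection on which the server pushes messages in real time, and it sends its own requests back to the server as ordinary HTTP POSTs.

The code sets up a Starlette app (a lightweight web framework) to handle these connections.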

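# Simplified setup for HTTP mode
from mcp.server.sse import SseServerTransport
from starlette.applications import Starlette
from starlette.routing import Mount, Route

def create_starlette_app(mcp_server):
    # Clients POST their requests to /messages/
    sse = SseServerTransport("/messages/")

    async def handle_sse(request):
        # Hold the SSE stream open and let the MCP server read/write on it
        async with sse.connect_sse(request.scope, request.receive, request._send) as (read, write):
            await mcp_server.run(read, write, mcp_server.create_initialization_options())

    return Starlette(
        routes=[
            Route("/sse", endpoint=handle_sse),
            Mount("/messages/", app=sse.handle_post_message),
        ]
    )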

This creates an endpoint at /sse that the AI agent connects to.
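What travels over that /sse connection is plain text: each event is a group of "field: value" lines terminated by a blank line. The sketch below is a deliberately simplified parser for that format (it ignores fields such as id and retry). The sample stream is made up for illustration and is not captured from a real server.

```python
from typing import Dict, Iterator, List


def parse_sse(stream: str) -> Iterator[Dict[str, str]]:
    """Yield one dict per event from a text/event-stream body (simplified)."""
    event: Dict[str, str] = {}
    data_lines: List[str] = []
    for line in stream.splitlines():
        if line == "":
            # A blank line ends the current event.
            if event or data_lines:
                event["data"] = "\n".join(data_lines)
                yield event
            event, data_lines = {}, []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        else:
            field, _, value = line.partition(":")
            # One optional space after the colon is not part of the value.
            value = value[1:] if value.startswith(" ") else value
            if field == "data":
                data_lines.append(value)
            else:
                event[field] = value


# Invented sample stream: two events, each terminated by a blank line.
sample = (
    "event: endpoint\n"
    "data: /messages/?session_id=abc123\n"
    "\n"
    "event: message\n"
    'data: {"jsonrpc": "2.0", "id": 1, "result": {}}\n'
    "\n"
)
for parsed in parse_sse(sample):
    print(parsed)
```

Because the stream only flows from server to client, the client needs a second channel for its own messages, which is the role of the /messages/ route.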

Security Considerations

When you run an MCP server, you are giving an AI access (via MarkItDown) to any file the server process can read and any URL it can reach.

  1. Read-Only: MarkItDown is generally a read-only tool. It reads files and returns text. It does not delete or modify your files.
  2. Plugins: The server checks the environment variable MARKITDOWN_ENABLE_PLUGINS.
    import os

    def check_plugins_enabled() -> bool:
        # Defaults to "false" for safety
        return os.getenv("MARKITDOWN_ENABLE_PLUGINS", "false") == "true"

This ensures that experimental or third-party code doesn't run unless you explicitly allow it.
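One detail worth noticing: an exact comparison with "true" is case-sensitive, so values like "True" or "1" would leave plugins disabled. Here is a sketch of a more forgiving variant. The function name plugins_enabled and the set of accepted spellings are assumptions of this example, not the server's documented behavior.

```python
import os
from typing import Mapping, Optional

# Accepted spellings are an assumption of this sketch.
TRUTHY = {"true", "1", "yes"}


def plugins_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only if MARKITDOWN_ENABLE_PLUGINS holds a truthy value."""
    env = os.environ if env is None else env
    value = env.get("MARKITDOWN_ENABLE_PLUGINS", "false")
    # Normalize case and whitespace before comparing.
    return value.strip().lower() in TRUTHY


print(plugins_enabled({}))
print(plugins_enabled({"MARKITDOWN_ENABLE_PLUGINS": "True"}))
print(plugins_enabled({"MARKITDOWN_ENABLE_PLUGINS": "off"}))
```

Either way, the safe property is the same: an unset or unrecognized value keeps plugins off.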

Summary

In this tutorial series, we have traveled from the basics of file conversion to advanced AI integration.

  1. The Orchestrator: We met MarkItDown, the general contractor.
  2. Converters: We learned how it delegates work to specialists for Excel, PDF, and Web.
  3. AI Integration: We saw how to use LLMs to describe images.
  4. MCP Server: Finally, in this chapter, we turned MarkItDown into a "Skill" that other AI agents can use.

The MCP Server acts as the bridge between your documents and the rapidly evolving world of AI agents. By running this server, you unlock the ability to have conversations with your data, no matter what format it is in.

Where to go from here?

You now understand the entire architecture of MarkItDown!

Thank you for reading the MarkItDown Tutorial!


Generated by Code IQ