Use superserve.serve_mcp() to deploy MCP (Model Context Protocol) servers as scalable HTTP endpoints via Ray Serve.

Quick Start

Create a new MCP server:
superserve create-mcp weather
This creates mcp_servers/weather/server.py:
from mcp.server.fastmcp import FastMCP
import superserve

mcp = FastMCP("weather", stateless_http=True)


@mcp.tool()
async def example_tool(query: str) -> str:
    """Example tool - replace with your own.

    Args:
        query: Input query to process
    """
    return f"Processed: {query}"


superserve.serve_mcp(mcp, name="weather")
Run the server:
superserve up
Your MCP server is now available at http://localhost:8000/weather/mcp.
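
To check that the endpoint responds, one option is to connect with the streamable HTTP client from the MCP Python SDK. The sketch below is not part of superserve; it assumes a recent `mcp` package is installed and that the server from the Quick Start is running locally. The helper names `mcp_endpoint`, `list_and_call`, and `main` are made up for this example.

```python
import asyncio


def mcp_endpoint(name: str, host: str = "http://localhost:8000") -> str:
    """Build the endpoint URL using the /{name}/mcp pattern shown on this page."""
    return f"{host}/{name}/mcp"


async def list_and_call(url: str) -> None:
    # Imported lazily so mcp_endpoint() works even without the MCP SDK installed.
    from mcp import ClientSession
    from mcp.client.streamable_http import streamablehttp_client

    # The client yields a read stream, a write stream, and a session-id getter.
    async with streamablehttp_client(url) as (read_stream, write_stream, _):
        async with ClientSession(read_stream, write_stream) as session:
            await session.initialize()

            # List the tools the server exposes.
            tools = await session.list_tools()
            print("Tools:", [tool.name for tool in tools.tools])

            # Call the generated example tool with a sample argument.
            result = await session.call_tool("example_tool", {"query": "hello"})
            print("Result:", result.content)


def main() -> None:
    """Run the check against the local weather server (requires `superserve up`)."""
    asyncio.run(list_and_call(mcp_endpoint("weather")))
```

Calling `main()` while `superserve up` is running should print the tool list and the content returned by `example_tool`.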

Adding Tools

Define tools using FastMCP’s @mcp.tool() decorator:
from mcp.server.fastmcp import FastMCP
import superserve

mcp = FastMCP("myserver", stateless_http=True)


@mcp.tool()
async def search(query: str) -> str:
    """Search for information.

    Args:
        query: Search query
    """
    return f"Results for: {query}"


@mcp.tool()
async def add_numbers(a: int, b: int) -> int:
    """Add two numbers together.

    Args:
        a: First number
        b: Second number
    """
    return a + b


superserve.serve_mcp(mcp, name="myserver")
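
Because tools are ordinary async functions, one possible pattern is to define them undecorated, register them in a separate step, and unit-test the logic without starting a server. This is a sketch of that design choice, not a superserve requirement; `register_tools` and `check_tools` are hypothetical helpers, and `mcp.tool()(fn)` is simply the decorator from the example above applied by hand.

```python
import asyncio


async def search(query: str) -> str:
    """Search for information."""
    return f"Results for: {query}"


async def add_numbers(a: int, b: int) -> int:
    """Add two numbers together."""
    return a + b


def register_tools(mcp) -> None:
    """Register the plain functions on a FastMCP instance (hypothetical helper)."""
    for fn in (search, add_numbers):
        mcp.tool()(fn)  # equivalent to decorating fn with @mcp.tool()


def check_tools() -> None:
    """Exercise the tool logic directly, with no server or MCP client involved."""
    assert asyncio.run(add_numbers(2, 3)) == 5
    assert asyncio.run(search("rain")) == "Results for: rain"
```

In `server.py`, this would be followed by `register_tools(mcp)` before the `superserve.serve_mcp(...)` call.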

Using MCP Servers with Agents

Here’s an example using Pydantic AI:
from pydantic_ai import Agent
from pydantic_ai.mcp import MCPServerStreamableHTTP
import superserve

weather_server = MCPServerStreamableHTTP("http://localhost:8000/weather/mcp")

def make_agent():
    return Agent(
        "openai:gpt-4o-mini",
        system_prompt="You are a helpful assistant.",
        toolsets=[weather_server],
    )

superserve.serve(make_agent, name="my_agent")
When you run superserve up, the MCP server and the agent are both served on the same port. The agent can then call the MCP server's tools through the URL passed to MCPServerStreamableHTTP.

serve_mcp Options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| mcp_server | FastMCP | required | FastMCP instance (must use stateless_http=True) |
| name | str | None | Server name (inferred from directory if not set) |
| num_cpus | float | 1 | CPU cores per replica |
| num_gpus | float | 0 | GPUs per replica |
| memory | str | "512MB" | Memory per replica |
| replicas | int | 1 | Number of replicas |
| route_prefix | str | /{name} | URL prefix (endpoint: /{name}/mcp) |
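
As an illustration of how these options might be combined, the sketch below passes several of them as keyword arguments. The values are arbitrary. `endpoint_path` and `deploy` are hypothetical helpers, and the assumption that a custom `route_prefix` always yields `<route_prefix>/mcp` is extrapolated from the default shown in the table. In a real `server.py`, the `serve_mcp` call would sit at module level as in the Quick Start.

```python
from typing import Optional

# Keyword arguments taken from the options table; the values are illustrative.
SERVE_OPTIONS = {
    "name": "weather",
    "num_cpus": 2,
    "replicas": 3,
    "route_prefix": "/tools/weather",
}


def endpoint_path(name: str, route_prefix: Optional[str] = None) -> str:
    """Return the MCP path, assuming the endpoint is always <route_prefix>/mcp."""
    prefix = route_prefix if route_prefix is not None else f"/{name}"
    return f"{prefix.rstrip('/')}/mcp"


def deploy() -> None:
    """Sketch of a server.py body; needs superserve and the MCP SDK installed."""
    from mcp.server.fastmcp import FastMCP
    import superserve

    mcp = FastMCP("weather", stateless_http=True)
    superserve.serve_mcp(mcp, **SERVE_OPTIONS)
```

Under that assumption, clients would reach this server at the path returned by `endpoint_path("weather", "/tools/weather")`.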