When to build from scratch

  • You want full control over the agent loop
  • You don’t need LLM framework integrations
  • You’re building custom orchestration logic
  • You want minimal dependencies

Create an agent

Create a new agent from the command line:

superserve create-agent my_agent

The @superserve.tool decorator

Define tools that execute on Ray workers:
import asyncio

import superserve

@superserve.tool(num_cpus=1, memory="512MB")
def search_web(query: str) -> str:
    """Search the web for information."""
    return f"Results for: {query}"

# Tools are async - await them from inside a coroutine
async def main() -> None:
    result = await search_web(query="Python tutorials")
    print(result)

asyncio.run(main())
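
Resource hints scale with the workload. As a sketch (the fetch_page tool and its urllib-based body are illustrative, not part of superserve), an I/O-bound tool that reserves more memory might look like:

import urllib.request

import superserve

# Hypothetical tool: fetch a page and truncate the body.
# num_cpus and memory are the same parameters shown above.
@superserve.tool(num_cpus=1, memory="1GB")
def fetch_page(url: str, max_bytes: int = 10_000) -> str:
    """Fetch a URL and return up to max_bytes of its body."""
    with urllib.request.urlopen(url) as resp:
        return resp.read(max_bytes).decode("utf-8", errors="replace")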

The Agent base class

Subclass Agent to create a custom agent:
import superserve
from superserve import Agent

@superserve.tool
def search(query: str) -> str:
    """Search for information."""
    return f"Results for: {query}"

class MyAgent(Agent):
    tools = [search]

    async def run(self, query: str) -> str:
        result = await self.call_tool("search", query=query)
        return f"Found: {result}"

# Serve with resources
superserve.serve(MyAgent, name="my-agent", num_cpus=2, memory="4GB", replicas=2)
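
Because run is ordinary async Python, an agent can chain tool calls and mix in its own logic between them. A minimal sketch reusing the search tool above (the two-hop refinement is illustrative):

class TwoHopAgent(Agent):
    tools = [search]

    async def run(self, query: str) -> str:
        # First hop: broad search.
        overview = await self.call_tool("search", query=query)
        # Second hop: refine using the first result.
        detail = await self.call_tool("search", query=f"more on {overview}")
        return detail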

Parallel tool execution

Use call_tools_parallel() to run multiple tools simultaneously:
class MyAgent(Agent):
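    # search_web, fetch_data, and analyze_text are tool functions
    # defined with @superserve.tool, as shown above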
    tools = [search_web, fetch_data, analyze_text]

    async def run(self, query: str) -> str:
        results = await self.call_tools_parallel([
            ("search_web", {"query": query}),
            ("fetch_data", {"url": "https://api.example.com"}),
        ])
        return str(results)
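
The example above stringifies the combined results. Assuming call_tools_parallel returns results in the same order as the requests (an assumption worth checking against the API reference), they can be unpacked positionally; a sketch:

class MyAgent(Agent):
    tools = [search_web, fetch_data]

    async def run(self, query: str) -> str:
        # Assumption: results come back in request order.
        search_result, fetch_result = await self.call_tools_parallel([
            ("search_web", {"query": query}),
            ("fetch_data", {"url": "https://api.example.com"}),
        ])
        return f"search: {search_result}; fetch: {fetch_result}"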

Next steps