Model Context Protocol (MCP) on Databricks

This page explains how to use MCP on Databricks. MCP is an open source standard that connects AI agents to tools, resources, prompts, and other contextual information.

The main benefit of MCP is standardization. You can create a tool once and use it with any agent—whether it’s one you’ve built or a third-party agent. Similarly, you can use tools developed by others, either from your team or from outside your organization.

Databricks provides the following MCP options:

  • Managed MCP servers: Databricks has ready-to-use servers that let agents query data and access tools in Unity Catalog. Unity Catalog permissions are always enforced, so agents and users can only access the tools and data they’re allowed to.

  • Custom MCP servers: Securely host your own MCP server as a Databricks app to bring your own server or run a third-party MCP server.

Managed MCP servers

Important

This feature is in Beta.

Databricks provides the following managed MCP servers to simplify connecting agents to enterprise data. These servers are available out of the box and are hosted and maintained by Databricks:

  • Vector search: Lets agents query Databricks Vector Search indexes in the specified Unity Catalog schema.
    URL: https://<your-workspace-hostname>/api/2.0/mcp/vector-search/{catalog_name}/{schema_name}

  • Unity Catalog functions: Lets agents run Unity Catalog functions in the specified Unity Catalog schema.
    URL: https://<your-workspace-hostname>/api/2.0/mcp/functions/{catalog_name}/{schema_name}

  • Genie space: Lets agents query the specified Genie space to get insights from structured data (tables in Unity Catalog).
    URL: https://<your-workspace-hostname>/api/2.0/mcp/genie/{genie_space_id}

You can provide your agent with multiple server URLs to connect it to multiple tools and data sources. For example, you could provide the following URLs to let an agent query customer support tickets, query billing tables, and run billing-related functions (see the sketch after this list):

  • https://<your-workspace-hostname>/api/2.0/mcp/vector-search/prod/customer_support

    • Lets your agent search unstructured data using vector search indexes in the prod.customer_support schema
  • https://<your-workspace-hostname>/api/2.0/mcp/genie/{genie_space_id}

    • Lets your agent search structured data using a Genie space connected to the prod.billing schema
  • https://<your-workspace-hostname>/api/2.0/mcp/functions/prod/billing

    • Lets your agent run Unity Catalog functions (custom Python or SQL data retrieval UDFs) governed in prod.billing
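
In agent code, you typically collect these URLs in a single list and register the tools from each server, as the agent example later on this page does with MCP_SERVER_URLS. A minimal sketch, reusing the placeholder hostname, schema names, and Genie space ID from the example above:

# Minimal sketch: the managed MCP servers this agent can call.
# Replace the hostname, schema names, and Genie space ID with your own values.
host = "https://<your-workspace-hostname>"

MCP_SERVER_URLS = [
    f"{host}/api/2.0/mcp/vector-search/prod/customer_support",  # unstructured support tickets
    f"{host}/api/2.0/mcp/genie/<genie_space_id>",                # structured billing data via Genie
    f"{host}/api/2.0/mcp/functions/prod/billing",                # billing-related UC functions
]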

Host MCP servers using Databricks apps

You can also host your own custom or third-party MCP servers as Databricks apps. This is useful if you already have MCP servers you want to deploy and share with others in your organization or if you want to run a third-party MCP server as a source of tools.

An MCP server hosted as a Databricks app must implement an HTTP-compatible transport, such as the streamable HTTP transport.
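
For example, the following is a minimal sketch of a custom Python MCP server that serves one illustrative tool over the streamable HTTP transport using the MCP Python SDK's FastMCP API. The server name and tool are assumptions for illustration; see the custom MCP server repo referenced in the tip below for a complete example.

from mcp.server.fastmcp import FastMCP

# Illustrative server name; port 8000 matches the Databricks apps default described below.
mcp = FastMCP("my-custom-mcp-server", host="0.0.0.0", port=8000)


@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two integers."""
    return a + b


if __name__ == "__main__":
    # Serve the tools over the streamable HTTP transport.
    mcp.run(transport="streamable-http")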

Tip

See the custom MCP server repo for an example of writing your own MCP server and deploying it as a Databricks app.

To host an existing Python MCP server as a Databricks app, follow these steps:

  1. Add a requirements.txt to your server's root directory and specify Python dependencies for your server.

    Python MCP servers often use uv for package management. If you use uv, add uv to requirements.txt and it handles installing the remaining dependencies.
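
    For example, a requirements.txt for a uv-managed server can contain just this single line:

    uv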

  2. Add an app.yaml specifying the CLI command to run your server.

    By default, Databricks apps listen on port 8000. If your server listens on a different port, set it using an environment variable override in the app.yaml file.

    Example app.yaml:

    command: [
        'uv',
        'run',
        'your-server-name',
        ..., # optionally include additional parameters here
      ]
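
    If your server reads its listen port from an environment variable, set that variable in the env section of app.yaml. A hedged sketch, assuming your server honors a variable named PORT (the variable name depends on your server, not on Databricks):

    env:
      - name: 'PORT'
        value: '8000'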
    
  3. Create a Databricks app to host your MCP server:

    databricks apps create mcp-my-custom-server
    
  4. Upload the source code to Databricks and deploy the app by running the following commands from the directory containing your app.yaml file:

    DATABRICKS_USERNAME=$(databricks current-user me | jq -r .userName)
    databricks sync . "/Users/$DATABRICKS_USERNAME/mcp-my-custom-server"
    databricks apps deploy mcp-my-custom-server --source-code-path "/Workspace/Users/$DATABRICKS_USERNAME/mcp-my-custom-server"
    

Build agents using MCP

This section shows you how to write custom code agents that connect to MCP servers on Databricks.

Important

You must enroll in the managed MCP servers beta to use the following code snippets.

Connecting to an MCP server on Databricks is similar to connecting to any other remote MCP server. You can connect to the server using standard SDKs, such as the MCP Python SDK. The main difference is that Databricks MCP servers are secure by default and require clients to authenticate. The databricks-mcp Python library simplifies authentication in custom agent code.

The simplest way to develop agent code is to run it locally and authenticate to your workspace. Use the following steps to build an AI agent that connects to a Databricks MCP server:

  1. Use OAuth to authenticate to your workspace. Run the following in a local terminal:

    databricks auth login --host https://<your-workspace-hostname>
    
  2. Ensure you have a local environment with Python 3.12 or above, then install dependencies:

    pip install -U databricks-mcp "mcp>=1.9" "databricks-sdk[openai]" "mlflow>=3.1.0" "databricks-agents>=1.0.0"
    
  3. Run the following snippet to validate your connection to the MCP server. The snippet lists the tools exposed by the Unity Catalog functions server for the system.ai schema and then runs the built-in Python code interpreter tool. Serverless compute must be enabled in your workspace to run this snippet.

    import asyncio
    
    from mcp.client.streamable_http import streamablehttp_client
    from mcp.client.session import ClientSession
    from databricks_mcp import DatabricksOAuthClientProvider
    from databricks.sdk import WorkspaceClient
    
    # TODO: Update to the Databricks CLI profile name you specified when
    # configuring authentication to the workspace.
    databricks_cli_profile = "YOUR_DATABRICKS_CLI_PROFILE"
    assert (
        databricks_cli_profile != "YOUR_DATABRICKS_CLI_PROFILE"
    ), "Set databricks_cli_profile to the Databricks CLI profile name you specified when configuring authentication to the workspace"
    workspace_client = WorkspaceClient(profile=databricks_cli_profile)
    workspace_hostname = workspace_client.config.host
    mcp_server_url = f"{workspace_hostname}/api/2.0/mcp/functions/system/ai"
    
    
    # This snippet below uses the Unity Catalog functions MCP server to expose built-in
    # AI tools under `system.ai`, like the `system.ai.python_exec` code interpreter tool
    async def test_connect_to_server():
        async with streamablehttp_client(
            f"{mcp_server_url}", auth=DatabricksOAuthClientProvider(workspace_client)
        ) as (read_stream, write_stream, _), ClientSession(
            read_stream, write_stream
        ) as session:
            # List and call tools from the MCP server
            await session.initialize()
            tools = await session.list_tools()
            print(
                f"Discovered tools {[t.name for t in tools.tools]} "
                f"from MCP server {mcp_server_url}"
            )
            result = await session.call_tool(
                "system__ai__python_exec", {"code": "print('Hello, world!')"}
            )
            print(
                f"Called system__ai__python_exec tool and got result " f"{result.content}"
            )
    
    
    if __name__ == "__main__":
        asyncio.run(test_connect_to_server())
    
  4. You can build on the snippet above to define a basic single-turn agent that uses tools. Save the agent code locally as a file named mcp_agent.py so that you can deploy it in the following section:

    from contextlib import asynccontextmanager
    import json
    import uuid
    import asyncio
    from typing import Any, Callable, List
    from pydantic import BaseModel
    
    import mlflow
    from mlflow.pyfunc import ResponsesAgent
    from mlflow.types.responses import ResponsesAgentRequest, ResponsesAgentResponse
    
    from databricks_mcp import DatabricksOAuthClientProvider
    from databricks.sdk import WorkspaceClient
    from mcp.client.session import ClientSession
    from mcp.client.streamable_http import streamablehttp_client
    
    # 1) CONFIGURE YOUR ENDPOINTS/PROFILE
    LLM_ENDPOINT_NAME = "databricks-claude-3-7-sonnet"
    SYSTEM_PROMPT = "You are a helpful assistant."
    DATABRICKS_CLI_PROFILE = "YOUR_DATABRICKS_CLI_PROFILE"
    assert (
        DATABRICKS_CLI_PROFILE != "YOUR_DATABRICKS_CLI_PROFILE"
    ), "Set DATABRICKS_CLI_PROFILE to the Databricks CLI profile name you specified when configuring authentication to the workspace"
    workspace_client = WorkspaceClient(profile=DATABRICKS_CLI_PROFILE)
    host = workspace_client.config.host
    # Add more MCP server URLs here if desired, e.g
    # f"{host}/api/2.0/mcp/vector-search/prod/billing"
    # to include vector search indexes under the prod.billing schema, or
    # f"{host}/api/2.0/mcp/genie/<genie_space_id>"
    # to include a Genie space
    MCP_SERVER_URLS = [
        f"{host}/api/2.0/mcp/functions/system/ai",
    ]
    
    
    # 2) HELPER: convert between ResponsesAgent “message dict” and ChatCompletions format
    def _to_chat_messages(msg: dict[str, Any]) -> List[dict]:
        """
        Take a single ResponsesAgent‐style dict and turn it into one or more
        ChatCompletions‐compatible dict entries.
        """
        msg_type = msg.get("type")
        if msg_type == "function_call":
            return [
                {
                    "role": "assistant",
                    "content": None,
                    "tool_calls": [
                        {
                            "id": msg["call_id"],
                            "type": "function",
                            "function": {
                                "name": msg["name"],
                                "arguments": msg["arguments"],
                            },
                        }
                    ],
                }
            ]
        elif msg_type == "message" and isinstance(msg["content"], list):
            return [
                {
                    "role": "assistant" if msg["role"] == "assistant" else msg["role"],
                    "content": content["text"],
                }
                for content in msg["content"]
            ]
        elif msg_type == "function_call_output":
            return [
                {
                    "role": "tool",
                    "content": msg["output"],
                    "tool_call_id": msg["tool_call_id"],
                }
            ]
        else:
            # fallback for plain {"role": ..., "content": "..."} or similar
            return [
                {
                    k: v
                    for k, v in msg.items()
                    if k in ("role", "content", "name", "tool_calls", "tool_call_id")
                }
            ]
    
    
    # 3) “MCP SESSION” + TOOL‐INVOCATION LOGIC
    @asynccontextmanager
    async def _mcp_session(server_url: str, ws: WorkspaceClient):
        async with streamablehttp_client(
            url=server_url, auth=DatabricksOAuthClientProvider(ws)
        ) as (reader, writer, _):
            async with ClientSession(reader, writer) as session:
                await session.initialize()
                yield session
    
    
    def _list_tools(server_url: str, ws: WorkspaceClient):
        async def inner():
            async with _mcp_session(server_url, ws) as sess:
                return await sess.list_tools()
    
        return asyncio.run(inner())
    
    
    def _make_exec_fn(
        server_url: str, tool_name: str, ws: WorkspaceClient
    ) -> Callable[..., str]:
        def exec_fn(**kwargs):
            async def call_it():
                async with _mcp_session(server_url, ws) as sess:
                    resp = await sess.call_tool(name=tool_name, arguments=kwargs)
                    return "".join([c.text for c in resp.content])
    
            return asyncio.run(call_it())
    
        return exec_fn
    
    
    class ToolInfo(BaseModel):
        name: str
        spec: dict
        exec_fn: Callable
    
    
    def _fetch_tool_infos(ws: WorkspaceClient, server_url: str) -> List[ToolInfo]:
        print(f"Listing tools from MCP server {server_url}")
        infos: List[ToolInfo] = []
        mcp_tools = _list_tools(server_url, ws).tools
        for t in mcp_tools:
            schema = t.inputSchema.copy()
            if "properties" not in schema:
                schema["properties"] = {}
            spec = {
                "type": "function",
                "function": {
                    "name": t.name,
                    "description": t.description,
                    "parameters": schema,
                },
            }
            infos.append(
                ToolInfo(
                    name=t.name, spec=spec, exec_fn=_make_exec_fn(server_url, t.name, ws)
                )
            )
        return infos
    
    
    # 4) “SINGLE‐TURN” AGENT CLASS
    class SingleTurnMCPAgent(ResponsesAgent):
        def _call_llm(self, history: List[dict], ws: WorkspaceClient, tool_infos):
            """
            Send current history → LLM, returning the raw response dict.
            """
            client = ws.serving_endpoints.get_open_ai_client()
            flat_msgs = []
            for msg in history:
                flat_msgs.extend(_to_chat_messages(msg))
    
            return client.chat.completions.create(
                model=LLM_ENDPOINT_NAME,
                messages=flat_msgs,
                tools=[ti.spec for ti in tool_infos],
            )
    
        def predict(self, request: ResponsesAgentRequest) -> ResponsesAgentResponse:
            ws = WorkspaceClient(profile=DATABRICKS_CLI_PROFILE)
    
            # 1) build initial history: system + user
            history: List[dict] = [{"role": "system", "content": SYSTEM_PROMPT}]
            for inp in request.input:
                history.append(inp.model_dump())
    
            # 2) call LLM once
            tool_infos = [
                tool_info
                for mcp_server_url in MCP_SERVER_URLS
                for tool_info in _fetch_tool_infos(ws, mcp_server_url)
            ]
            tools_dict = {tool_info.name: tool_info for tool_info in tool_infos}
            llm_resp = self._call_llm(history, ws, tool_infos)
            raw_choice = llm_resp.choices[0].message.to_dict()
            raw_choice["id"] = uuid.uuid4().hex
            history.append(raw_choice)
    
            tool_calls = raw_choice.get("tool_calls") or []
            if tool_calls:
                # (we only support a single tool in this “single‐turn” example)
                fc = tool_calls[0]
                name = fc["function"]["name"]
                args = json.loads(fc["function"]["arguments"])
                try:
                    tool_info = tools_dict[name]
                    result = tool_info.exec_fn(**args)
                except Exception as e:
                    result = f"Error invoking {name}: {e}"
    
                # 4) append the “tool” output
                history.append(
                    {
                        "type": "function_call_output",
                        "role": "tool",
                        "id": uuid.uuid4().hex,
                        "tool_call_id": fc["id"],
                        "output": result,
                    }
                )
    
                # 5) call LLM a second time and treat that reply as final
                followup = (
                    self._call_llm(history, ws, tool_infos=[]).choices[0].message.to_dict()
                )
                followup["id"] = uuid.uuid4().hex
    
                assistant_text = followup.get("content", "")
                return ResponsesAgentResponse(
                    output=[
                        {
                            "id": uuid.uuid4().hex,
                            "type": "message",
                            "role": "assistant",
                            "content": [{"type": "output_text", "text": assistant_text}],
                        }
                    ],
                    custom_outputs=request.custom_inputs,
                )
    
            # 6) if no tool_calls at all, return the assistant’s original reply
            assistant_text = raw_choice.get("content", "")
            return ResponsesAgentResponse(
                output=[
                    {
                        "id": uuid.uuid4().hex,
                        "type": "message",
                        "role": "assistant",
                        "content": [{"type": "output_text", "text": assistant_text}],
                    }
                ],
                custom_outputs=request.custom_inputs,
            )
    
    
    mlflow.models.set_model(SingleTurnMCPAgent())
    
    if __name__ == "__main__":
        req = ResponsesAgentRequest(
            input=[{"role": "user", "content": "What's the 100th Fibonacci number?"}]
        )
        resp = SingleTurnMCPAgent().predict(req)
        for item in resp.output:
            print(item)
    

Deploy agents using MCP

When you're ready to deploy an agent that connects to MCP servers, use the standard agent deployment process.

Make sure to specify all the resources your agent needs access to at logging time. For example, if your agent uses the following MCP server URLs:

  • https://<your-workspace-hostname>/api/2.0/mcp/vector-search/prod/customer_support
  • https://<your-workspace-hostname>/api/2.0/mcp/vector-search/prod/billing
  • https://<your-workspace-hostname>/api/2.0/mcp/functions/prod/billing

You must specify as resources all of the vector search indexes your agent needs in the prod.customer_support and prod.billing schemas, as well as all of the Unity Catalog functions it needs in prod.billing.

For example, to deploy the agent defined above, you can run the following snippet, assuming you saved the agent code definition in mcp_agent.py:

import os
from databricks.sdk import WorkspaceClient
from databricks import agents
import mlflow
from mlflow.models.resources import DatabricksFunction, DatabricksServingEndpoint, DatabricksVectorSearchIndex
from mcp_agent import LLM_ENDPOINT_NAME

# TODO: Update this to your Databricks CLI profile name
databricks_cli_profile = "YOUR_DATABRICKS_CLI_PROFILE"
assert (
        databricks_cli_profile != "YOUR_DATABRICKS_CLI_PROFILE"
), "Set databricks_cli_profile to the Databricks CLI profile name you specified when configuring authentication to the workspace"
workspace_client = WorkspaceClient(profile=databricks_cli_profile)


# Configure MLflow and the Databricks SDK to use your Databricks CLI profile
current_user = workspace_client.current_user.me().user_name
mlflow.set_tracking_uri(f"databricks://{databricks_cli_profile}")
mlflow.set_registry_uri(f"databricks-uc://{databricks_cli_profile}")
mlflow.set_experiment(f"/Users/{current_user}/databricks_docs_example_mcp_agent")
os.environ["DATABRICKS_CONFIG_PROFILE"] = databricks_cli_profile

# Log the agent defined in mcp_agent.py
here = os.path.dirname(os.path.abspath(__file__))
agent_script = os.path.join(here, "mcp_agent.py")
resources = [
    DatabricksServingEndpoint(endpoint_name=LLM_ENDPOINT_NAME),
    DatabricksFunction("system.ai.python_exec"),
    # --- Uncomment and edit the following lines to specify vector search indices and additional UC functions ---
    # --- if referenced via the MCP_SERVER_URLS in your agent code ---
    # DatabricksVectorSearchIndex(index_name="prod.customer_support.my_index"),
    # DatabricksVectorSearchIndex(index_name="prod.billing.another_index"),
    # DatabricksFunction("prod.billing.my_custom_function"),
    # DatabricksFunction("prod.billing.another_function"),
]

with mlflow.start_run():
    logged_model_info = mlflow.pyfunc.log_model(
        artifact_path="mcp_agent",
        python_model=agent_script,
        resources=resources,
    )

# TODO Specify your UC model name here
UC_MODEL_NAME = "main.default.databricks_docs_mcp_agent"
registered_model = mlflow.register_model(logged_model_info.model_uri, UC_MODEL_NAME)

agents.deploy(
    model_name=UC_MODEL_NAME,
    model_version=registered_model.version,
)

Compute pricing

Compute pricing for managed MCP servers depends on the underlying MCP workload.

Custom MCP servers are subject to Databricks Apps pricing.