Integrate LangChain with Databricks Unity Catalog tools

2025-05-10

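Use Databricks Unity Catalog to integrate SQL and Python functions as tools in LangChain and LangGraph workflows. This integration combines the governance of Unity Catalog with LangChain capabilities to build powerful LLM-based applications.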

Requirements

Install Python 3.10 or above.
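
One way to confirm the requirement from inside a notebook is to compare the interpreter version against the minimum. This is a minimal sketch; the helper name `meets_minimum_python` is hypothetical and not part of any Databricks package.

```python
import sys

MINIMUM_PYTHON = (3, 10)  # minimum version stated in the requirement above


def meets_minimum_python(version_info=None, minimum=MINIMUM_PYTHON):
    """Return True if the (major, minor, ...) version satisfies the minimum."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only the major and minor components.
    return tuple(version_info[:2]) >= minimum


if not meets_minimum_python():
    print(f"Python {sys.version_info.major}.{sys.version_info.minor} is older than 3.10")
```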

Integrate LangChain with Databricks Unity Catalog

In this example, you create a Unity Catalog tool, test its functionality, and add it to an agent. Run the following code in a Databricks notebook.

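Install dependencies

Install the Unity Catalog AI package with the Databricks extra, and install the Databricks LangChain integration package.

This example uses LangChain, but a similar approach can be applied to other libraries. See "Integrate Unity Catalog tools with third party generative AI frameworks".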

# Install the Unity Catalog AI integration package with the Databricks extra
%pip install unitycatalog-langchain[databricks]

# Install Databricks Langchain integration package
%pip install databricks-langchain
dbutils.library.restartPython()
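
After the Python process restarts, it can be useful to confirm that both packages are visible to the interpreter. The following is a minimal sketch that uses only the standard library; the helper name `installed_version` is hypothetical, and the distribution names are taken from the install commands above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(distribution_name)
    except PackageNotFoundError:
        return None


# Report the status of each package installed in the previous step.
for name in ("unitycatalog-langchain", "databricks-langchain"):
    print(name, installed_version(name) or "not installed")
```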

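Initialize the Databricks Function Client

Initialize the Databricks Function Client. The steps that follow use it to register and execute Unity Catalog functions.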

from unitycatalog.ai.core.base import get_uc_function_client

client = get_uc_function_client()

Define the tool's logic

Create a Unity Catalog function that contains the tool's logic.


CATALOG = "my_catalog"
SCHEMA = "my_schema"

def add_numbers(number_1: float, number_2: float) -> float:
  """
  A function that accepts two floating point numbers adds them,
  and returns the resulting sum as a float.

  Args:
    number_1 (float): The first of the two numbers to add.
    number_2 (float): The second of the two numbers to add.

  Returns:
    float: The sum of the two input numbers.
  """
  return number_1 + number_2

function_info = client.create_python_function(
  func=add_numbers,
  catalog=CATALOG,
  schema=SCHEMA,
  replace=True
)
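
Before registering a function, it can help to check its logic locally, since a plain Python call is faster to debug than a registered function. The sketch below repeats `add_numbers` so that it runs on its own. It also checks for type hints and a docstring, on the assumption that the client relies on them to build the function's metadata. The helper `has_complete_type_hints` is hypothetical.

```python
import inspect
import math
from typing import get_type_hints


def add_numbers(number_1: float, number_2: float) -> float:
    """Return the sum of two floating point numbers (same logic as above)."""
    return number_1 + number_2


def has_complete_type_hints(func):
    """Hypothetical helper: True if every parameter and the return value is annotated."""
    hints = get_type_hints(func)
    parameters = inspect.signature(func).parameters
    return "return" in hints and all(name in hints for name in parameters)


assert has_complete_type_hints(add_numbers)
assert add_numbers.__doc__  # a docstring is present
# Compare with a tolerance because the inputs are floating point values.
assert math.isclose(add_numbers(36939.0, 8922.4), 45861.4)
```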

Test the function

Test the function to check that it works as expected:

result = client.execute_function(
  function_name=f"{CATALOG}.{SCHEMA}.add_numbers",
  parameters={"number_1": 36939.0, "number_2": 8922.4}
)

result.value # OUTPUT: '45861.4'
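
The output shown above is a string rather than a float. If later code needs a number, the value has to be converted first. The following minimal sketch uses the literal string from the output comment instead of calling the client; `parse_numeric_result` is a hypothetical helper.

```python
def parse_numeric_result(raw_value):
    """Convert a string result such as '45861.4' to a float."""
    return float(raw_value)


raw_value = "45861.4"  # the value shown in the OUTPUT comment above
total = parse_numeric_result(raw_value)
print(total + 1.0)  # arithmetic works once the value is a float
```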

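Wrap the function using the UCFunctionToolkit

Wrap the function using the UCFunctionToolkit to make it accessible to agent authoring libraries. The toolkit ensures consistency across different libraries and adds helpful features like auto-tracing for retrievers.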

from databricks_langchain import UCFunctionToolkit

# Create a toolkit with the Unity Catalog function
func_name = f"{CATALOG}.{SCHEMA}.add_numbers"
toolkit = UCFunctionToolkit(function_names=[func_name])

tools = toolkit.tools
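
The `function_names` argument takes a list of fully qualified names in the form catalog.schema.function. When a toolkit wraps several functions, the names can be built in one place. The following is a minimal sketch; the helper `qualified_function_name` and the second function name `multiply_numbers` are hypothetical and do not appear in the example above.

```python
CATALOG = "my_catalog"
SCHEMA = "my_schema"


def qualified_function_name(catalog, schema, function):
    """Join the three namespace levels, rejecting empty or dotted parts."""
    parts = (catalog, schema, function)
    for part in parts:
        if not part or "." in part:
            raise ValueError(f"invalid name part: {part!r}")
    return ".".join(parts)


function_names = [
    qualified_function_name(CATALOG, SCHEMA, name)
    for name in ("add_numbers", "multiply_numbers")  # second name is hypothetical
]
print(function_names)
```

The resulting list could then be passed as `UCFunctionToolkit(function_names=function_names)`, provided every listed function exists in Unity Catalog.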

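Use the tool in an agent

Add the tool to a LangChain agent using the tools property from UCFunctionToolkit.

This example authors a simple agent using LangChain's AgentExecutor API for simplicity. For production workloads, use the agent authoring workflow shown in the ChatAgent examples.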

from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain.prompts import ChatPromptTemplate
from databricks_langchain import (
  ChatDatabricks,
  UCFunctionToolkit,
)
import mlflow

# Initialize the LLM (replace with your LLM of choice, if desired)
LLM_ENDPOINT_NAME = "databricks-meta-llama-3-3-70b-instruct"
llm = ChatDatabricks(endpoint=LLM_ENDPOINT_NAME, temperature=0.1)

# Define the prompt
prompt = ChatPromptTemplate.from_messages(
  [
    (
      "system",
      "You are a helpful assistant. Make sure to use tools for additional functionality.",
    ),
    ("placeholder", "{chat_history}"),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
  ]
)

# Enable automatic tracing
mlflow.langchain.autolog()

# Define the agent, specifying the tools from the toolkit above
agent = create_tool_calling_agent(llm, tools, prompt)

# Create the agent executor
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
agent_executor.invoke({"input": "What is 36939.0 + 8922.4?"})
