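MLflow Tracing integrates with a range of popular generative AI libraries and frameworks and offers a one-line automatic tracing experience for each of them, so you get observability into your GenAI application with minimal setup.
Automatic tracing captures your application's logic and intermediate steps, such as LLM calls, tool usage, and agent interactions, based on how the specific library or SDK is implemented.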
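To make the idea concrete, the following toy sketch shows one way a single autolog-style call can capture calls to a client library by wrapping its methods. It is an illustration only, not MLflow's actual implementation: `SPANS`, `traced`, `FakeLLMClient`, and this `autolog` function are hypothetical names invented for the example, and no real SDK or network call is involved.

```python
import functools
import time

# Hypothetical in-memory span store (assumption for this sketch only).
SPANS = []


def traced(name):
    """Wrap a function so each call records its inputs, output, and duration."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = func(*args, **kwargs)
            SPANS.append({
                "name": name,
                "inputs": {"args": args, "kwargs": kwargs},
                "output": result,
                "duration_s": time.perf_counter() - start,
            })
            return result
        return wrapper
    return decorator


class FakeLLMClient:
    """Stand-in for an SDK client; it makes no network calls."""

    def complete(self, prompt):
        return f"echo: {prompt}"


def autolog():
    """Patch the client class once so that later calls are recorded."""
    if not getattr(FakeLLMClient.complete, "_traced", False):
        patched = traced("FakeLLMClient.complete")(FakeLLMClient.complete)
        patched._traced = True
        FakeLLMClient.complete = patched


autolog()
answer = FakeLLMClient().complete("What is the capital of France?")
print(answer)
print(f"{len(SPANS)} span(s) recorded")
```

Because the wrapper is installed on the class rather than on one instance, application code that creates its own client objects is captured without modification, which is the property that makes a one-line setup possible.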
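For a deeper look at how automatic tracing works, its prerequisites, and examples of combining it with manual tracing, see the main Automatic Tracing guide. The quick examples below highlight some of the top integrations. Detailed guides for each supported library, covering prerequisites, advanced examples, and library-specific behavior, are available on their respective pages in this section.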
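Top integrations at a glance
Below are quick-start examples for some of the most commonly used integrations. Each tab shows a basic usage example. For detailed prerequisites and more advanced scenarios for each integration, visit its dedicated integration page (linked from the tabs or the list below).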
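Each example notes that, when run outside a Databricks notebook, the `DATABRICKS_HOST` and `DATABRICKS_TOKEN` environment variables should point to your workspace. A small preflight check along the following lines can catch a missing variable before any tracing call is made. This is a minimal sketch: `REQUIRED_VARS` and `missing_databricks_env` are hypothetical names, and it assumes environment variables are the authentication method in use, which may not match your setup.

```python
import os

# Variables named in the comments of the examples (local runs only).
REQUIRED_VARS = ("DATABRICKS_HOST", "DATABRICKS_TOKEN")


def missing_databricks_env(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_databricks_env()
if missing:
    print("Set these before running locally:", ", ".join(missing))
```

The optional `env` argument lets the check run against any mapping, which keeps it easy to test without touching the real process environment.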
OpenAI
import mlflow
import openai
# If running this code outside of a Databricks notebook (e.g., locally),
# uncomment and set the following environment variables to point to your Databricks workspace:
# import os
# os.environ["DATABRICKS_HOST"] = "https://your-workspace.cloud.databricks.com"
# os.environ["DATABRICKS_TOKEN"] = "your-personal-access-token"
# Enable auto-tracing for OpenAI
mlflow.openai.autolog()
# Set up MLflow tracking
mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/openai-tracing-demo")
openai_client = openai.OpenAI()
messages = [
    {
        "role": "user",
        "content": "What is the capital of France?",
    }
]
response = openai_client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages,
    temperature=0.1,
    max_tokens=100,
)
# View trace in MLflow UI
LangChain
import mlflow
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI
# If running this code outside of a Databricks notebook (e.g., locally),
# uncomment and set the following environment variables to point to your Databricks workspace:
# import os
# os.environ["DATABRICKS_HOST"] = "https://your-workspace.cloud.databricks.com"
# os.environ["DATABRICKS_TOKEN"] = "your-personal-access-token"
mlflow.langchain.autolog()
mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/langchain-tracing-demo")
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.7, max_tokens=1000)
prompt = PromptTemplate.from_template("Tell me a joke about {topic}.")
chain = prompt | llm | StrOutputParser()
chain.invoke({"topic": "artificial intelligence"})
# View trace in MLflow UI
LangGraph
import mlflow
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent
# If running this code outside of a Databricks notebook (e.g., locally),
# uncomment and set the following environment variables to point to your Databricks workspace:
# import os
# os.environ["DATABRICKS_HOST"] = "https://your-workspace.cloud.databricks.com"
# os.environ["DATABRICKS_TOKEN"] = "your-personal-access-token"
mlflow.langchain.autolog() # LangGraph uses LangChain's autolog
mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/langgraph-tracing-demo")
@tool
def get_weather(city: str):
    """Use this to get weather information."""
    return f"It might be cloudy in {city}"
llm = ChatOpenAI(model="gpt-4o-mini")
graph = create_react_agent(llm, [get_weather])
result = graph.invoke({"messages": [("user", "what is the weather in sf?")]})
# View trace in MLflow UI
Anthropic
import mlflow
import anthropic
import os
# If running this code outside of a Databricks notebook (e.g., locally),
# uncomment and set the following environment variables to point to your Databricks workspace:
# os.environ["DATABRICKS_HOST"] = "https://your-workspace.cloud.databricks.com"
# os.environ["DATABRICKS_TOKEN"] = "your-personal-access-token"
mlflow.anthropic.autolog()
mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/anthropic-tracing-demo")
client = anthropic.Anthropic(api_key=os.environ.get("ANTHROPIC_API_KEY"))
message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello, Claude"}],
)
# View trace in MLflow UI
DSPy
import mlflow
import dspy
# If running this code outside of a Databricks notebook (e.g., locally),
# uncomment and set the following environment variables to point to your Databricks workspace:
# import os
# os.environ["DATABRICKS_HOST"] = "https://your-workspace.cloud.databricks.com"
# os.environ["DATABRICKS_TOKEN"] = "your-personal-access-token"
mlflow.dspy.autolog()
mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/dspy-tracing-demo")
lm = dspy.LM("openai/gpt-4o-mini") # Assumes OPENAI_API_KEY is set
dspy.configure(lm=lm)
class SimpleSignature(dspy.Signature):
    input_text: str = dspy.InputField()
    output_text: str = dspy.OutputField()
program = dspy.Predict(SimpleSignature)
result = program(input_text="Summarize MLflow Tracing.")
# View trace in MLflow UI
Databricks
import mlflow
import os
from openai import OpenAI # Databricks FMAPI uses OpenAI client
# If running this code outside of a Databricks notebook (e.g., locally),
# uncomment and set the following environment variables to point to your Databricks workspace:
# os.environ["DATABRICKS_HOST"] = "https://your-workspace.cloud.databricks.com"
# os.environ["DATABRICKS_TOKEN"] = "your-personal-access-token"
mlflow.openai.autolog() # Traces Databricks FMAPI via OpenAI client
mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/databricks-fmapi-tracing")
client = OpenAI(
    api_key=os.environ.get("DATABRICKS_TOKEN"),
    base_url=f"{os.environ.get('DATABRICKS_HOST')}/serving-endpoints"
)
response = client.chat.completions.create(
    model="databricks-llama-4-maverick",
    messages=[{"role": "user", "content": "Key features of MLflow?"}],
)
# View trace in MLflow UI
Bedrock
import mlflow
import boto3
# If running this code outside of a Databricks notebook (e.g., locally),
# uncomment and set the following environment variables to point to your Databricks workspace:
# import os
# os.environ["DATABRICKS_HOST"] = "https://your-workspace.cloud.databricks.com"
# os.environ["DATABRICKS_TOKEN"] = "your-personal-access-token"
mlflow.bedrock.autolog()
mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/bedrock-tracing-demo")
bedrock = boto3.client(
    service_name="bedrock-runtime",
    region_name="us-east-1"  # Replace with your region
)
# The Converse API expects each message's content as a list of content blocks
response = bedrock.converse(
    modelId="anthropic.claude-3-5-sonnet-20241022-v2:0",
    messages=[{"role": "user", "content": [{"text": "Hello World in one line."}]}]
)
# View trace in MLflow UI
AutoGen
import mlflow
from autogen import ConversableAgent
import os
# If running this code outside of a Databricks notebook (e.g., locally),
# uncomment and set the following environment variables to point to your Databricks workspace:
# os.environ["DATABRICKS_HOST"] = "https://your-workspace.cloud.databricks.com"
# os.environ["DATABRICKS_TOKEN"] = "your-personal-access-token"
mlflow.autogen.autolog()
mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/autogen-tracing-demo")
config_list = [{"model": "gpt-4o-mini", "api_key": os.environ.get("OPENAI_API_KEY")}]
assistant = ConversableAgent("assistant", llm_config={"config_list": config_list})
user_proxy = ConversableAgent("user_proxy", human_input_mode="NEVER", code_execution_config=False)
user_proxy.initiate_chat(assistant, message="What is 2+2?")
# View trace in MLflow UI
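Enabling multiple auto-tracing integrations
Because GenAI applications often combine several libraries, MLflow Tracing lets you enable automatic tracing for multiple integrations at the same time, giving you a unified tracing experience.
For example, to enable both LangChain and direct OpenAI tracing: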
import mlflow
# Enable MLflow Tracing for both LangChain and OpenAI
mlflow.langchain.autolog()
mlflow.openai.autolog()
# Your code using both LangChain and OpenAI directly...
# ... an example can be found on the Automatic Tracing page ...
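MLflow generates a single, unified trace that combines the steps from LangChain and from direct OpenAI LLM calls, so you can inspect the complete flow. More examples of combining integrations are available on the Automatic Tracing page.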
Disabling automatic tracing
To disable automatic tracing for a specific library, call mlflow.<library>.autolog(disable=True).
To disable all autologging integrations at once, use mlflow.autolog(disable=True).
import mlflow
# Disable for a specific library
mlflow.openai.autolog(disable=True)
# Disable all autologging
mlflow.autolog(disable=True)
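When an application toggles several integrations together, the per-library calls above can be wrapped in a small helper. The sketch below is one possible shape, not part of MLflow: `set_autolog` and its `load` parameter are hypothetical, and it assumes each name in `flavors` corresponds to an `mlflow.<library>` module that exposes `autolog(disable=...)`, as the calls shown above do.

```python
import importlib


def set_autolog(flavors, disable=False, load=importlib.import_module):
    """Call mlflow.<flavor>.autolog(disable=...) for each named flavor.

    `load` resolves a module name to a module object; it defaults to a real
    import and can be replaced with a stub in tests.
    """
    for flavor in flavors:
        load(f"mlflow.{flavor}").autolog(disable=disable)


# Example usage (requires mlflow and the named libraries to be installed):
# set_autolog(["langchain", "openai"])                # enable both
# set_autolog(["langchain", "openai"], disable=True)  # disable both
```

Passing the loader as a parameter keeps the helper free of a hard dependency at import time, so it can be exercised without MLflow installed.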