Automatic tracing

MLflow Tracing integrates with a wide range of GenAI libraries and provides a one-line automatic tracing experience for each library (and for combinations of them). This page shows detailed examples of integrating MLflow with popular GenAI libraries.

Prerequisites

MLflow 3

This guide requires the following packages:

  • mlflow[databricks]>=3.1: Core MLflow functionality with GenAI features and Databricks connectivity.
  • openai>=1.0.0: Required only to run the basic automatic tracing example on this page (if you use a different LLM provider, install its SDK instead).
  • Additional libraries: Install the specific libraries for the integrations you want to use.

Install the basic components:

%pip install --upgrade "mlflow[databricks]>=3.1" openai>=1.0.0

MLflow 2.x

This guide requires the following packages:

  • mlflow[databricks]>=2.15.0,<3.0.0: Core MLflow functionality with Databricks connectivity.
  • openai>=1.0.0: Required only to run the basic automatic tracing example on this page (if you use a different LLM provider, install its SDK instead).
  • Additional libraries: Install the specific libraries for the integrations you want to use.

Install the basic components:

%pip install --upgrade "mlflow[databricks]>=2.15.0,<3.0.0" openai>=1.0.0

Note

Although automatic tracing is available in MLflow 2.15.0+, installing MLflow 3 (specifically 3.1 or later when using mlflow[databricks]) is strongly recommended to get the latest GenAI capabilities, including expanded tracing features and robust support.

Tip

Running in a Databricks notebook? MLflow comes preinstalled in the Databricks Runtime. You only need to install the additional packages for the specific libraries you want to trace.

Running locally? You need to install all of the packages listed above, plus any additional integration libraries.
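
For example, a notebook setup cell might only add the integration libraries you plan to trace on top of the base setup. This is just an illustrative sketch; the package names below are examples, so install whichever integrations you actually use:

# MLflow is preinstalled in the Databricks Runtime, so only the integration libraries are needed
%pip install --upgrade langchain langchain-openai langgraph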

Databricks setup prerequisites

Before running any of the examples below, make sure you have configured MLflow Tracing for Databricks:

For users outside Databricks notebooks

If you are running outside of a Databricks notebook, set the environment variables:

export DATABRICKS_HOST="https://your-workspace.cloud.databricks.com"
export DATABRICKS_TOKEN="your-personal-access-token"

For users inside Databricks notebooks

If you are running in a Databricks notebook environment, these credentials are set automatically for you. You only need to configure your LLM provider API keys.
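
For example, a notebook cell might load the OpenAI key from a Databricks secret instead of hard-coding it. This is a minimal sketch; the secret scope and key names (llm-secrets, openai_api_key) are placeholders for your own:

import os

# dbutils is available automatically in Databricks notebooks.
# The scope/key names below are hypothetical; replace them with your own secret.
os.environ["OPENAI_API_KEY"] = dbutils.secrets.get(scope="llm-secrets", key="openai_api_key")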

LLM provider API keys

Set API keys for the LLM providers you plan to use:

export OPENAI_API_KEY="your-openai-api-key"
export ANTHROPIC_API_KEY="your-anthropic-api-key"
export MISTRAL_API_KEY="your-mistral-api-key"
# Add other provider keys as needed

Basic automatic tracing example

Here's how to enable automatic tracing for OpenAI with a single line:

import mlflow
from openai import OpenAI
import os

# Set up MLflow tracking
mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/automatic-tracing-demo")

# Set your OpenAI API key
os.environ["OPENAI_API_KEY"] = "your-api-key-here"

# Enable automatic tracing with one line
mlflow.openai.autolog()

# Your existing OpenAI code works unchanged
client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain MLflow Tracing in one sentence."}
    ],
    max_tokens=100,
    temperature=0.7
)

print(response.choices[0].message.content)
# All OpenAI calls are now automatically traced!

Integrations

Each integration automatically captures your application's logic and intermediate steps based on how you use the authoring framework/SDK. For a comprehensive list of all supported libraries and detailed documentation for each integration, see the MLflow Tracing integrations page.

Below are quickstart examples for some of the most popular integrations. Remember to install the necessary packages for each library you want to use (for example, pip install openai langchain langgraph anthropic dspy boto3 databricks-sdk ag2).

Top integrations

MLflow provides automatic tracing for many popular GenAI frameworks and libraries. Here are the most commonly used integrations:

OpenAI

import mlflow
import openai

# Enable auto-tracing for OpenAI
mlflow.openai.autolog()

# Set up MLflow tracking on Databricks
mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/openai-tracing-demo")

openai_client = openai.OpenAI()

messages = [
    {
        "role": "user",
        "content": "What is the capital of France?",
    }
]

response = openai_client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages,
    temperature=0.1,
    max_tokens=100,
)

Full OpenAI integration guide

LangChain

import mlflow

from langchain.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI

# Enabling autolog for LangChain will enable trace logging.
mlflow.langchain.autolog()

# Set up MLflow tracking on Databricks
mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/langchain-tracing-demo")

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.7, max_tokens=1000)

prompt_template = PromptTemplate.from_template(
    "Answer the question as if you are {person}, fully embodying their style, wit, personality, and habits of speech. "
    "Emulate their quirks and mannerisms to the best of your ability, embracing their traits—even if they aren't entirely "
    "constructive or inoffensive. The question is: {question}"
)

chain = prompt_template | llm | StrOutputParser()

# Let's test another call
chain.invoke(
    {
        "person": "Linus Torvalds",
        "question": "Can I just set everyone's access to sudo to make things easier?",
    }
)

Full LangChain integration guide

LangGraph

from typing import Literal

import mlflow

from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

# Enabling tracing for LangGraph (LangChain)
mlflow.langchain.autolog()

# Set up MLflow tracking on Databricks
mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/langgraph-tracing-demo")

@tool
def get_weather(city: Literal["nyc", "sf"]):
    """Use this to get weather information."""
    if city == "nyc":
        return "It might be cloudy in nyc"
    elif city == "sf":
        return "It's always sunny in sf"

llm = ChatOpenAI(model="gpt-4o-mini")
tools = [get_weather]
graph = create_react_agent(llm, tools)

# Invoke the graph
result = graph.invoke(
    {"messages": [{"role": "user", "content": "what is the weather in sf?"}]}
)

Full LangGraph integration guide

Anthropic

import anthropic
import mlflow
import os

# Enable auto-tracing for Anthropic
mlflow.anthropic.autolog()

# Set up MLflow tracking on Databricks
mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/anthropic-tracing-demo")

# Configure your API key.
client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

# Use the create method to create new message.
message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"},
    ],
)

Full Anthropic integration guide

DSPy

import dspy
import mlflow

# Enabling tracing for DSPy
mlflow.dspy.autolog()

# Set up MLflow tracking on Databricks
mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/dspy-tracing-demo")

# Configure the language model to use
lm = dspy.LM("openai/gpt-4o-mini")
dspy.configure(lm=lm)

# Define a simple summarizer model and run it
class SummarizeSignature(dspy.Signature):
    """Given a passage, generate a summary."""

    passage: str = dspy.InputField(desc="a passage to summarize")
    summary: str = dspy.OutputField(desc="a one-line summary of the passage")

class Summarize(dspy.Module):
    def __init__(self):
        super().__init__()
        self.summarize = dspy.ChainOfThought(SummarizeSignature)

    def forward(self, passage: str):
        return self.summarize(passage=passage)

summarizer = Summarize()
summarizer(
    passage=(
        "MLflow Tracing is a feature that enhances LLM observability in your Generative AI (GenAI) applications "
        "by capturing detailed information about the execution of your application's services. Tracing provides "
        "a way to record the inputs, outputs, and metadata associated with each intermediate step of a request, "
        "enabling you to easily pinpoint the source of bugs and unexpected behaviors."
    )
)

Full DSPy integration guide

Databricks

import mlflow
import os
from openai import OpenAI

# Databricks Foundation Model APIs use Databricks authentication.

mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/databricks-sdk-autolog-example")

# Enable auto-tracing for OpenAI (which will trace Databricks Foundation Model API calls)
mlflow.openai.autolog()

# Create OpenAI client configured for Databricks
client = OpenAI(
    api_key=os.environ.get("DATABRICKS_TOKEN"),
    base_url=f"{os.environ.get('DATABRICKS_HOST')}/serving-endpoints"
)

# Query Llama 4 Maverick using OpenAI client
response = client.chat.completions.create(
    model="databricks-llama-4-maverick",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What are the key features of MLflow Tracing?"}
    ],
    max_tokens=150,
    temperature=0.7
)

print(response.choices[0].message.content)
# Your calls to Databricks Foundation Model APIs are automatically traced!

Full Databricks integration guide

Amazon Bedrock

import boto3
import mlflow

# Enable auto-tracing for Amazon Bedrock
mlflow.bedrock.autolog()

# Set up MLflow tracking on Databricks
mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/bedrock-tracing-demo")

# Create a boto3 client for invoking the Bedrock API
bedrock = boto3.client(
    service_name="bedrock-runtime",
    region_name="<REPLACE_WITH_YOUR_AWS_REGION>",
)

# MLflow will log a trace for Bedrock API call
response = bedrock.converse(
    modelId="anthropic.claude-3-5-sonnet-20241022-v2:0",
    messages=[
        {
            "role": "user",
            "content": "Describe the purpose of a 'hello world' program in one line.",
        }
    ],
    inferenceConfig={
        "maxTokens": 512,
        "temperature": 0.1,
        "topP": 0.9,
    },
)

Full Amazon Bedrock integration guide

AutoGen

import os
from typing import Annotated, Literal

from autogen import ConversableAgent

import mlflow

# Turn on auto tracing for AutoGen
mlflow.autogen.autolog()

# Set up MLflow tracking on Databricks
mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/autogen-tracing-demo")

# Define a simple multi-agent workflow using AutoGen
config_list = [
    {
        "model": "gpt-4o-mini",
        # Please set your OpenAI API Key to the OPENAI_API_KEY env var before running this example
        "api_key": os.environ.get("OPENAI_API_KEY"),
    }
]

Operator = Literal["+", "-", "*", "/"]

def calculator(a: int, b: int, operator: Annotated[Operator, "operator"]) -> int:
    if operator == "+":
        return a + b
    elif operator == "-":
        return a - b
    elif operator == "*":
        return a * b
    elif operator == "/":
        return int(a / b)
    else:
        raise ValueError("Invalid operator")

# First define the assistant agent that suggests tool calls.
assistant = ConversableAgent(
    name="Assistant",
    system_message="You are a helpful AI assistant. "
    "You can help with simple calculations. "
    "Return 'TERMINATE' when the task is done.",
    llm_config={"config_list": config_list},
)

# The user proxy agent is used for interacting with the assistant agent
# and executes tool calls.
user_proxy = ConversableAgent(
    name="Tool Agent",
    llm_config=False,
    is_termination_msg=lambda msg: msg.get("content") is not None
    and "TERMINATE" in msg["content"],
    human_input_mode="NEVER",
)

# Register the tool signature with the assistant agent.
assistant.register_for_llm(name="calculator", description="A simple calculator")(
    calculator
)
user_proxy.register_for_execution(name="calculator")(calculator)
response = user_proxy.initiate_chat(
    assistant, message="What is (44231 + 13312 / (230 - 20)) * 4?"
)

Full AutoGen integration guide

Combining manual and automatic tracing

The @mlflow.trace decorator can be used in combination with automatic tracing to create powerful, integrated traces. This is particularly useful for:

  1. Complex workflows that involve multiple LLM calls
  2. Multi-agent systems where different agents use different LLM providers
  3. Chaining multiple LLM calls in sequence with custom logic between each call

Basic example

Here's a simple example that combines OpenAI automatic tracing with manually defined spans:

import mlflow
import openai
from mlflow.entities import SpanType

mlflow.openai.autolog()


@mlflow.trace(span_type=SpanType.CHAIN)
def run(question):
    messages = build_messages(question)
    # MLflow automatically generates a span for OpenAI invocation
    response = openai.OpenAI().chat.completions.create(
        model="gpt-4o-mini",
        max_tokens=100,
        messages=messages,
    )
    return parse_response(response)


@mlflow.trace
def build_messages(question):
    return [
        {"role": "system", "content": "You are a helpful chatbot."},
        {"role": "user", "content": question},
    ]


@mlflow.trace
def parse_response(response):
    return response.choices[0].message.content


run("What is MLflow?")

Running this code generates a single trace that combines the manual spans with the automatic OpenAI tracing.

Combination of automatic and manual tracing

Advanced example: Multiple LLM calls

For more complex workflows, you can combine multiple LLM calls into a single trace. Here's an example demonstrating this pattern:

import mlflow
import openai
from mlflow.entities import SpanType

# Enable auto-tracing for OpenAI
mlflow.openai.autolog()

@mlflow.trace(span_type=SpanType.CHAIN)
def process_user_query(query: str):
    # First LLM call: Analyze the query
    analysis = openai.OpenAI().chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Analyze the user's query and determine if it requires factual information or creative writing."},
            {"role": "user", "content": query}
        ]
    )
    analysis_result = analysis.choices[0].message.content

    # Second LLM call: Generate response based on analysis
    if "factual" in analysis_result.lower():
        # Use a system prompt tuned for factual queries
        response = openai.OpenAI().chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {"role": "system", "content": "Provide a factual, well-researched response."},
                {"role": "user", "content": query}
            ]
        )
    else:
        # Use a system prompt tuned for creative queries
        response = openai.OpenAI().chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {"role": "system", "content": "Provide a creative, engaging response."},
                {"role": "user", "content": query}
            ]
        )

    return response.choices[0].message.content

# Run the function
result = process_user_query("Tell me about the history of artificial intelligence")

This example creates a single trace containing:

  1. A parent span for the entire process_user_query function
  2. Two child spans automatically generated by OpenAI autologging:
    • One for the analysis LLM call
    • One for the response LLM call
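
To confirm this hierarchy programmatically, you can fetch the trace that was just recorded and list its spans. This is a minimal sketch, not part of the original example; note that newer MLflow versions recommend mlflow.get_last_active_trace_id() together with mlflow.get_trace(), while get_last_active_trace() is used here for brevity:

import mlflow

# Retrieve the most recent trace produced in this process and print its spans
trace = mlflow.get_last_active_trace()
for span in trace.data.spans:
    print(span.name, span.span_type)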

Multi-framework example

You can also combine different LLM providers within a single trace. For example:

Note

In addition to the base requirements, this example also requires LangChain to be installed:

%pip install --upgrade langchain langchain-openai

import mlflow
import openai
from mlflow.entities import SpanType
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

# Enable auto-tracing for both OpenAI and LangChain
mlflow.openai.autolog()
mlflow.langchain.autolog()

@mlflow.trace(span_type=SpanType.CHAIN)
def multi_provider_workflow(query: str):
    # First, use OpenAI directly for initial processing
    analysis = openai.OpenAI().chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Analyze the query and extract key topics."},
            {"role": "user", "content": query}
        ]
    )
    topics = analysis.choices[0].message.content

    # Then use LangChain for structured processing
    llm = ChatOpenAI(model="gpt-4o-mini")
    prompt = ChatPromptTemplate.from_template(
        "Based on these topics: {topics}\nGenerate a detailed response to: {query}"
    )
    chain = prompt | llm
    response = chain.invoke({"topics": topics, "query": query})

    return response

# Run the function
result = multi_provider_workflow("Explain quantum computing")

This example demonstrates how to combine:

  1. Direct OpenAI API calls
  2. LangChain chains
  3. Custom logic between the calls

All of these operations are captured in a single trace, making it easy to:

  • Debug issues
  • Monitor performance
  • Understand the request flow
  • Track which parts of the system are being used

The trace visualization shows the complete hierarchy of spans, making it clear how the different components interact and how much time each step takes.
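
Beyond the UI, traces can also be queried programmatically. Below is a minimal sketch using mlflow.search_traces, which returns a pandas DataFrame; the exact columns available (status, execution time, request/response previews, and so on) can vary by MLflow version:

import mlflow

mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/automatic-tracing-demo")

# Fetch up to five traces from the active experiment as a DataFrame
traces = mlflow.search_traces(max_results=5)
print(traces.head())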

Next steps

Continue your journey with these recommended actions and tutorials.

Reference guides

Explore detailed documentation for the concepts and features mentioned in this guide.