DSPy 是一个用于构建模块化 AI 系统的开源框架,并提供用于优化提示和权重的算法。
MLflow 跟踪为 DSPy 提供自动跟踪功能。调用 mlflow.dspy.autolog
函数即可启用 DSPy 跟踪;此后调用 DSPy 模块时,嵌套跟踪会自动记录到当前活动的 MLflow 实验中。
import mlflow
mlflow.dspy.autolog()
先决条件
若要将 MLflow 跟踪与 DSPy 配合使用,需要安装 MLflow 和 dspy-ai 库。
开发
对于开发环境,请安装带有 Databricks 附加组件的完整 MLflow 包和 dspy-ai。
pip install --upgrade "mlflow[databricks]>=3.1" dspy-ai
完整的 mlflow[databricks] 包包含在 Databricks 上进行本地开发和实验所需的全部功能。
生产
对于生产部署,请安装 mlflow-tracing 和 dspy-ai:
pip install --upgrade mlflow-tracing dspy-ai
mlflow-tracing 包已针对生产用途进行优化。
注释
为了获得最佳的 DSPy 跟踪体验,强烈建议使用 MLflow 3。
在运行示例之前,需要配置环境:
对于不使用 Databricks 笔记本的用户:设置 Databricks 环境变量:
export DATABRICKS_HOST="https://your-workspace.cloud.databricks.com"
export DATABRICKS_TOKEN="your-personal-access-token"
对于 Databricks 笔记本中的用户:这些凭据会自动为您设置。
API 密钥:确保设置 LLM 提供程序 API 密钥:
export OPENAI_API_KEY="your-openai-api-key"
# Add other provider keys as needed
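上述环境变量也可以在 Python 进程中通过 os.environ 设置。以下仅为示意写法,其中的工作区 URL、令牌和密钥均为占位符,需替换为你自己的真实值:
import os
# 示意:以编程方式设置 Databricks 与 LLM 提供程序的凭据(占位符,仅作演示)
os.environ["DATABRICKS_HOST"] = "https://your-workspace.cloud.databricks.com"
os.environ["DATABRICKS_TOKEN"] = "your-personal-access-token"
os.environ["OPENAI_API_KEY"] = "your-openai-api-key"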
示例用法
import dspy
import mlflow
import os
# Ensure your OPENAI_API_KEY (or other LLM provider keys) is set in your environment
# os.environ["OPENAI_API_KEY"] = "your-openai-api-key" # Uncomment and set if not globally configured
# Enabling tracing for DSPy
mlflow.dspy.autolog()
# Set up MLflow tracking to Databricks
mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/dspy-tracing-demo")
# Configure the language model to use
lm = dspy.LM("openai/gpt-4o-mini")
dspy.configure(lm=lm)
# Define a simple summarizer model and run it
class SummarizeSignature(dspy.Signature):
"""Given a passage, generate a summary."""
passage: str = dspy.InputField(desc="a passage to summarize")
summary: str = dspy.OutputField(desc="a one-line summary of the passage")
class Summarize(dspy.Module):
    def __init__(self):
        super().__init__()
        self.summarize = dspy.ChainOfThought(SummarizeSignature)

    def forward(self, passage: str):
        return self.summarize(passage=passage)
summarizer = Summarize()
summarizer(
passage=(
"MLflow Tracing is a feature that enhances LLM observability in your Generative AI (GenAI) applications "
"by capturing detailed information about the execution of your application's services. Tracing provides "
"a way to record the inputs, outputs, and metadata associated with each intermediate step of a request, "
"enabling you to easily pinpoint the source of bugs and unexpected behaviors."
)
)
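运行上述代码后,除了在 MLflow UI 的 Traces 选项卡中查看跟踪外,也可以用 Python API 进行检索。下面是一个示意片段(假设所用 MLflow 版本提供 mlflow.search_traces 接口,该接口返回 pandas DataFrame):
# 示意:检索当前活动实验中最近记录的跟踪
traces = mlflow.search_traces(max_results=5)
print(traces.columns)  # 返回的 DataFrame 包含跟踪 ID、状态、耗时等列
print(traces.head())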
评估时的跟踪
评估 DSPy 程序是开发 AI 系统的重要步骤。MLflow 跟踪会针对每个输入记录程序执行的详细信息,帮助你分析被评估程序的表现。
为 DSPy 启用 MLflow 自动跟踪后,在运行 DSPy 的内置评估套件时会自动生成跟踪。以下示例演示如何运行评估并在 MLflow 中查看跟踪:
import dspy
from dspy.evaluate.metrics import answer_exact_match
import mlflow
import os
# Ensure your OPENAI_API_KEY (or other LLM provider keys) is set in your environment
# os.environ["OPENAI_API_KEY"] = "your-openai-api-key" # Uncomment and set if not globally configured
# Enabling tracing for DSPy evaluation
mlflow.dspy.autolog(log_traces_from_eval=True)
# Set up MLflow tracking to Databricks if not already configured
# mlflow.set_tracking_uri("databricks")
# mlflow.set_experiment("/Shared/dspy-eval-demo")
# Define a simple evaluation set
eval_set = [
dspy.Example(
question="How many 'r's are in the word 'strawberry'?", answer="3"
).with_inputs("question"),
dspy.Example(
question="How many 'a's are in the word 'banana'?", answer="3"
).with_inputs("question"),
dspy.Example(
question="How many 'e's are in the word 'elephant'?", answer="2"
).with_inputs("question"),
]
# Define a program
class Counter(dspy.Signature):
question: str = dspy.InputField()
answer: str = dspy.OutputField(
desc="Should only contain a single number as an answer"
)
cot = dspy.ChainOfThought(Counter)
# Evaluate the programs
with mlflow.start_run(run_name="CoT Evaluation"):
evaluator = dspy.evaluate.Evaluate(
devset=eval_set,
return_all_scores=True,
return_outputs=True,
show_progress=True,
)
aggregated_score, outputs, all_scores = evaluator(cot, metric=answer_exact_match)
# Log the aggregated score
mlflow.log_metric("exact_match", aggregated_score)
# Log the detailed evaluation results as a table
mlflow.log_table(
{
"question": [example.question for example in eval_set],
"answer": [example.answer for example in eval_set],
"output": outputs,
"exact_match": all_scores,
},
artifact_file="eval_results.json",
)
如果打开 MLflow UI 并进入 "CoT Evaluation" 运行,你会看到评估结果,并可在 Traces 选项卡中查看评估期间生成的跟踪列表。
注释
可以在调用 mlflow.dspy.autolog 函数时将 log_traces_from_eval 参数设置为 False,从而禁用评估期间的跟踪。
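例如,下面的调用会保留常规跟踪,但关闭评估期间的跟踪:
# 关闭评估期间的跟踪(其余跟踪行为保持不变)
mlflow.dspy.autolog(log_traces_from_eval=False)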
编译期间跟踪(优化)
编译(优化)是 DSPy 的核心概念。通过编译,DSPy 会自动优化程序的提示和权重,以获得最佳性能。
默认情况下,MLflow 不会在编译期间生成跟踪,因为编译可能会触发数百甚至数千次 DSPy 模块调用。若要为编译启用跟踪,请在调用 mlflow.dspy.autolog 函数时将 log_traces_from_compile 参数设置为 True。
import dspy
import mlflow
import os
# Ensure your OPENAI_API_KEY (or other LLM provider keys) is set in your environment
# os.environ["OPENAI_API_KEY"] = "your-openai-api-key" # Uncomment and set if not globally configured
# Enable auto-tracing for compilation
mlflow.dspy.autolog(log_traces_from_compile=True)
# Set up MLflow tracking to Databricks if not already configured
# mlflow.set_tracking_uri("databricks")
# mlflow.set_experiment("/Shared/dspy-compile-demo")
# Optimize the DSPy program as usual
tp = dspy.MIPROv2(metric=metric, auto="medium", num_threads=24)
optimized = tp.compile(cot, trainset=trainset, ...)
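上面的代码片段假定 metric、cot 和 trainset 已在别处定义。下面给出一个仅作示意的补全方式,沿用前文评估示例中的对象(并非唯一写法):
# 示意:沿用前文评估示例中的指标与数据(仅为假设)
from dspy.evaluate.metrics import answer_exact_match

metric = answer_exact_match  # 以精确匹配作为优化指标
trainset = eval_set          # 复用前文的示例数据集作为训练集(实际应使用更大的训练集)
# cot 即前文定义的 dspy.ChainOfThought(Counter) 程序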
禁用自动跟踪
可以通过调用 mlflow.dspy.autolog(disable=True) 或 mlflow.autolog(disable=True) 来全局禁用 DSPy 的自动跟踪。
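例如:
# 仅禁用 DSPy 的自动跟踪
mlflow.dspy.autolog(disable=True)
# 或全局禁用所有 MLflow 自动记录
mlflow.autolog(disable=True)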