MLflow Tracing is integrated with various GenAI libraries and provides a one-line automatic tracing experience for each library (and combinations of them). This page shows detailed examples of integrating MLflow with popular GenAI libraries.
Prerequisites
MLflow 3
This guide requires the following packages:
- mlflow[databricks]>=3.1: Core MLflow functionality with GenAI features and Databricks connectivity.
- openai>=1.0.0: Only required to run the Basic Automatic Tracing Example on this page (if using other LLM providers, install their respective SDKs instead).
- Additional libraries: Install specific libraries for the integrations you want to use.
Install the basic requirements:
%pip install --upgrade "mlflow[databricks]>=3.1" openai>=1.0.0
MLflow 2.x
This guide requires the following packages:
- mlflow[databricks]>=2.15.0,<3.0.0: Core MLflow functionality with Databricks connectivity.
- openai>=1.0.0: Only required to run the Basic Automatic Tracing Example on this page (if using other LLM providers, install their respective SDKs instead).
- Additional libraries: Install specific libraries for the integrations you want to use.
Install the basic requirements:
%pip install --upgrade "mlflow[databricks]>=2.15.0,<3.0.0" openai>=1.0.0
Note
While automatic tracing features are available in MLflow 2.15.0+, it is strongly recommended to install MLflow 3 (specifically 3.1 or newer if using mlflow[databricks]) for the latest GenAI capabilities, including expanded tracing features and robust support.
Tip
Running in a Databricks notebook? MLflow is pre-installed in the Databricks runtime. You only need to install additional packages for the specific libraries you want to trace.
Running locally? You'll need to install all packages listed above plus any additional integration libraries.
Prerequisites for Databricks Setup
Before running any of the examples below, make sure you have MLflow tracking configured for Databricks:
For users outside Databricks notebooks
If you're running outside of Databricks notebooks, set your environment variables:
export DATABRICKS_HOST="https://your-workspace.cloud.databricks.com"
export DATABRICKS_TOKEN="your-personal-access-token"
For users inside Databricks notebooks
If you're running inside a Databricks notebook, these credentials are automatically set for you. You only need to configure your LLM provider API keys.
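For example, inside a notebook you can load a provider key from a Databricks secret instead of hard-coding it. This is a minimal sketch; the secret scope and key names (llm-keys, openai-api-key) are hypothetical placeholders to replace with your own:
import os

# Hypothetical secret scope and key names - replace with your own
os.environ["OPENAI_API_KEY"] = dbutils.secrets.get(scope="llm-keys", key="openai-api-key")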
LLM Provider API Keys
Set your API keys for the LLM providers you plan to use:
export OPENAI_API_KEY="your-openai-api-key"
export ANTHROPIC_API_KEY="your-anthropic-api-key"
export MISTRAL_API_KEY="your-mistral-api-key"
# Add other provider keys as needed
Basic Automatic Tracing Example
Here's how to enable automatic tracing for OpenAI in just one line:
import mlflow
from openai import OpenAI
import os
# Set up MLflow tracking
mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/automatic-tracing-demo")
# Set your OpenAI API key
os.environ["OPENAI_API_KEY"] = "your-api-key-here"
# Enable automatic tracing with one line
mlflow.openai.autolog()
# Your existing OpenAI code works unchanged
client = OpenAI()
response = client.chat.completions.create(
model="gpt-4o-mini",
messages=[
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Explain MLflow Tracing in one sentence."}
],
max_tokens=100,
temperature=0.7
)
print(response.choices[0].message.content)
# All OpenAI calls are now automatically traced!
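To confirm programmatically that the call above produced a trace (in addition to viewing it in the MLflow Experiment UI), a minimal check could look like the following sketch, which uses mlflow.search_traces against the experiment set above:
# Search the current experiment for recently logged traces
traces = mlflow.search_traces(max_results=5)
print(f"Found {len(traces)} trace(s) in the experiment")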
Integrations
Each integration automatically captures your application's logic and intermediate steps based on how you use the authoring framework or SDK. For a comprehensive list of all supported libraries and detailed documentation for each integration, see the MLflow Tracing Integrations page.
Below are quick-start examples for some of the most popular integrations. Remember to install the necessary packages for each library you intend to use (e.g., pip install openai langchain langgraph anthropic dspy boto3 databricks-sdk ag2).
Top Integrations
MLflow provides automatic tracing for many popular GenAI frameworks and libraries. Here are the most commonly used integrations:
OpenAI
import mlflow
import openai
# Enable auto-tracing for OpenAI
mlflow.openai.autolog()
# Set up MLflow tracking on Databricks
mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/openai-tracing-demo")
openai_client = openai.OpenAI()
messages = [
{
"role": "user",
"content": "What is the capital of France?",
}
]
response = openai_client.chat.completions.create(
model="gpt-4o-mini",
messages=messages,
temperature=0.1,
max_tokens=100,
)
LangChain
import mlflow
import os
from langchain.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI
# Enabling autolog for LangChain will enable trace logging.
mlflow.langchain.autolog()
# Set up MLflow tracking on Databricks
mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/langchain-tracing-demo")
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.7, max_tokens=1000)
prompt_template = PromptTemplate.from_template(
"Answer the question as if you are {person}, fully embodying their style, wit, personality, and habits of speech. "
"Emulate their quirks and mannerisms to the best of your ability, embracing their traits—even if they aren't entirely "
"constructive or inoffensive. The question is: {question}"
)
chain = prompt_template | llm | StrOutputParser()
# Invoke the chain with a test question
chain.invoke(
{
"person": "Linus Torvalds",
"question": "Can I just set everyone's access to sudo to make things easier?",
}
)
Full LangChain Integration Guide
LangGraph
from typing import Literal
import mlflow
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent
# Enabling tracing for LangGraph (LangChain)
mlflow.langchain.autolog()
# Set up MLflow tracking on Databricks
mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/langgraph-tracing-demo")
@tool
def get_weather(city: Literal["nyc", "sf"]):
"""Use this to get weather information."""
if city == "nyc":
return "It might be cloudy in nyc"
elif city == "sf":
return "It's always sunny in sf"
llm = ChatOpenAI(model="gpt-4o-mini")
tools = [get_weather]
graph = create_react_agent(llm, tools)
# Invoke the graph
result = graph.invoke(
{"messages": [{"role": "user", "content": "what is the weather in sf?"}]}
)
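Because create_react_agent uses LangGraph's message-state schema, the agent's final answer can be read from the last message in the returned state (a small illustrative follow-up):
# The agent's final answer is the last message in the returned state
print(result["messages"][-1].content)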
Full LangGraph Integration Guide
Anthropic
import anthropic
import mlflow
import os
# Enable auto-tracing for Anthropic
mlflow.anthropic.autolog()
# Set up MLflow tracking on Databricks
mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/anthropic-tracing-demo")
# Configure your API key.
client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
# Use the create method to create a new message.
message = client.messages.create(
model="claude-3-5-sonnet-20241022",
max_tokens=1024,
messages=[
{"role": "user", "content": "Hello, Claude"},
],
)
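To inspect Claude's reply, the text lives in the first content block of the returned message (a small illustrative follow-up):
# Print the text of the first content block in the reply
print(message.content[0].text)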
Full Anthropic Integration Guide
DSPy
import dspy
import mlflow
# Enabling tracing for DSPy
mlflow.dspy.autolog()
# Set up MLflow tracking on Databricks
mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/dspy-tracing-demo")
# Configure the language model for DSPy
lm = dspy.LM("openai/gpt-4o-mini")
dspy.configure(lm=lm)
# Define a simple summarizer model and run it
class SummarizeSignature(dspy.Signature):
"""Given a passage, generate a summary."""
passage: str = dspy.InputField(desc="a passage to summarize")
summary: str = dspy.OutputField(desc="a one-line summary of the passage")
class Summarize(dspy.Module):
def __init__(self):
self.summarize = dspy.ChainOfThought(SummarizeSignature)
def forward(self, passage: str):
return self.summarize(passage=passage)
summarizer = Summarize()
summarizer(
passage=(
"MLflow Tracing is a feature that enhances LLM observability in your Generative AI (GenAI) applications "
"by capturing detailed information about the execution of your application's services. Tracing provides "
"a way to record the inputs, outputs, and metadata associated with each intermediate step of a request, "
"enabling you to easily pinpoint the source of bugs and unexpected behaviors."
)
)
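The module returns a dspy.Prediction whose attributes match the signature's output fields, so the generated summary can be read back like this (a small illustrative follow-up; the shorter passage below is just a stand-in):
# Capture the prediction to read the generated summary field
result = summarizer(
passage="MLflow Tracing records inputs, outputs, and metadata for each step of a GenAI request."
)
print(result.summary)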
Databricks
import mlflow
import os
from openai import OpenAI
# Databricks Foundation Model APIs use Databricks authentication.
mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/databricks-sdk-autolog-example")
# Enable auto-tracing for OpenAI (which will trace Databricks Foundation Model API calls)
mlflow.openai.autolog()
# Create OpenAI client configured for Databricks
client = OpenAI(
api_key=os.environ.get("DATABRICKS_TOKEN"),
base_url=f"{os.environ.get('DATABRICKS_HOST')}/serving-endpoints"
)
# Query Llama 4 Maverick using OpenAI client
response = client.chat.completions.create(
model="databricks-llama-4-maverick",
messages=[
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "What are the key features of MLflow Tracing?"}
],
max_tokens=150,
temperature=0.7
)
print(response.choices[0].message.content)
# Your calls to Databricks Foundation Model APIs are automatically traced!
Full Databricks Integration Guide
Bedrock
import boto3
import mlflow
# Enable auto-tracing for Amazon Bedrock
mlflow.bedrock.autolog()
# Set up MLflow tracking on Databricks
mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/bedrock-tracing-demo")
# Create a boto3 client for invoking the Bedrock API
bedrock = boto3.client(
service_name="bedrock-runtime",
region_name="<REPLACE_WITH_YOUR_AWS_REGION>",
)
# MLflow will log a trace for the Bedrock API call
response = bedrock.converse(
modelId="anthropic.claude-3-5-sonnet-20241022-v2:0",
messages=[
{
"role": "user",
"content": "Describe the purpose of a 'hello world' program in one line.",
}
],
inferenceConfig={
"maxTokens": 512,
"temperature": 0.1,
"topP": 0.9,
},
)
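The Converse API nests the assistant message inside the response body; the reply text can be extracted like this (a small illustrative follow-up based on the standard Converse response shape):
# Extract the assistant's reply text from the Converse response
print(response["output"]["message"]["content"][0]["text"])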
Full Bedrock Integration Guide
AutoGen
import os
from typing import Annotated, Literal
from autogen import ConversableAgent
import mlflow
# Turn on auto tracing for AutoGen
mlflow.autogen.autolog()
# Set up MLflow tracking on Databricks
mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/autogen-tracing-demo")
# Define a simple multi-agent workflow using AutoGen
config_list = [
{
"model": "gpt-4o-mini",
# Please set your OpenAI API Key to the OPENAI_API_KEY env var before running this example
"api_key": os.environ.get("OPENAI_API_KEY"),
}
]
Operator = Literal["+", "-", "*", "/"]
def calculator(a: int, b: int, operator: Annotated[Operator, "operator"]) -> int:
if operator == "+":
return a + b
elif operator == "-":
return a - b
elif operator == "*":
return a * b
elif operator == "/":
return int(a / b)
else:
raise ValueError("Invalid operator")
# First define the assistant agent that suggests tool calls.
assistant = ConversableAgent(
name="Assistant",
system_message="You are a helpful AI assistant. "
"You can help with simple calculations. "
"Return 'TERMINATE' when the task is done.",
llm_config={"config_list": config_list},
)
# The user proxy agent is used for interacting with the assistant agent
# and executes tool calls.
user_proxy = ConversableAgent(
name="Tool Agent",
llm_config=False,
is_termination_msg=lambda msg: msg.get("content") is not None
and "TERMINATE" in msg["content"],
human_input_mode="NEVER",
)
# Register the tool signature with the assistant agent.
assistant.register_for_llm(name="calculator", description="A simple calculator")(
calculator
)
user_proxy.register_for_execution(name="calculator")(calculator)
response = user_proxy.initiate_chat(
assistant, message="What is (44231 + 13312 / (230 - 20)) * 4?"
)
Full AutoGen Integration Guide
Combining Manual and Automatic Tracing
The @mlflow.trace decorator can be used in conjunction with auto tracing to create powerful, integrated traces. This is particularly useful for:
- Complex workflows that involve multiple LLM calls
- Multi-agent systems where different agents use different LLM providers
- Chaining multiple LLM calls together with custom logic in between
Basic Example
Here's a simple example that combines OpenAI auto-tracing with manually defined spans:
import mlflow
import openai
from mlflow.entities import SpanType
mlflow.openai.autolog()
@mlflow.trace(span_type=SpanType.CHAIN)
def run(question):
messages = build_messages(question)
# MLflow automatically generates a span for OpenAI invocation
response = openai.OpenAI().chat.completions.create(
model="gpt-4o-mini",
max_tokens=100,
messages=messages,
)
return parse_response(response)
@mlflow.trace
def build_messages(question):
return [
{"role": "system", "content": "You are a helpful chatbot."},
{"role": "user", "content": question},
]
@mlflow.trace
def parse_response(response):
return response.choices[0].message.content
run("What is MLflow?")
Running this code generates a single trace that combines the manual spans with the automatic OpenAI tracing.
Advanced Example: Multiple LLM Calls
For more complex workflows, you can combine multiple LLM calls into a single trace. Here's an example that demonstrates this pattern:
import mlflow
import openai
from mlflow.entities import SpanType
# Enable auto-tracing for OpenAI
mlflow.openai.autolog()
@mlflow.trace(span_type=SpanType.CHAIN)
def process_user_query(query: str):
# First LLM call: Analyze the query
analysis = openai.OpenAI().chat.completions.create(
model="gpt-4o-mini",
messages=[
{"role": "system", "content": "Analyze the user's query and determine if it requires factual information or creative writing."},
{"role": "user", "content": query}
]
)
analysis_result = analysis.choices[0].message.content
# Second LLM call: Generate response based on analysis
if "factual" in analysis_result.lower():
        # Use a factual, well-researched system prompt for factual queries
response = openai.OpenAI().chat.completions.create(
model="gpt-4o-mini",
messages=[
{"role": "system", "content": "Provide a factual, well-researched response."},
{"role": "user", "content": query}
]
)
else:
        # Use a creative, engaging system prompt for other queries
response = openai.OpenAI().chat.completions.create(
model="gpt-4o-mini",
messages=[
{"role": "system", "content": "Provide a creative, engaging response."},
{"role": "user", "content": query}
]
)
return response.choices[0].message.content
# Run the function
result = process_user_query("Tell me about the history of artificial intelligence")
This example creates a single trace that includes:
- A parent span for the entire process_user_query function
- Two child spans automatically created by the OpenAI autologging:
  - One for the analysis LLM call
  - One for the response LLM call
Multi-Framework Example
You can also combine different LLM providers in a single trace. For example:
Note
This example requires installing LangChain in addition to the base requirements:
%pip install --upgrade langchain langchain-openai
import mlflow
import openai
from mlflow.entities import SpanType
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
# Enable auto-tracing for both OpenAI and LangChain
mlflow.openai.autolog()
mlflow.langchain.autolog()
@mlflow.trace(span_type=SpanType.CHAIN)
def multi_provider_workflow(query: str):
# First, use OpenAI directly for initial processing
analysis = openai.OpenAI().chat.completions.create(
model="gpt-4o-mini",
messages=[
{"role": "system", "content": "Analyze the query and extract key topics."},
{"role": "user", "content": query}
]
)
topics = analysis.choices[0].message.content
# Then use LangChain for structured processing
llm = ChatOpenAI(model="gpt-4o-mini")
prompt = ChatPromptTemplate.from_template(
"Based on these topics: {topics}\nGenerate a detailed response to: {query}"
)
chain = prompt | llm
response = chain.invoke({"topics": topics, "query": query})
return response
# Run the function
result = multi_provider_workflow("Explain quantum computing")
This example shows how to combine:
- Direct OpenAI API calls
- LangChain chains
- Custom logic between the calls
All of this is captured in a single trace, making it easy to:
- Debug issues
- Monitor performance
- Understand the flow of the request
- Track which parts of the system are being used
The trace visualization will show the complete hierarchy of spans, making it clear how the different components interact and how long each step takes.
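If you prefer to inspect the combined trace programmatically as well, a minimal sketch using the MLflow 3 APIs mlflow.get_last_active_trace_id and mlflow.get_trace could look like this:
import mlflow

# Retrieve the trace produced by the most recent call in this process
trace_id = mlflow.get_last_active_trace_id()
trace = mlflow.get_trace(trace_id)
print(f"Trace {trace_id} contains {len(trace.data.spans)} spans")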
Next steps
Continue your journey with these recommended actions and tutorials.
- Manual tracing with decorators - Add custom spans to capture business logic alongside auto-traced LLM calls
- Debug and observe your app - Use the Trace UI to analyze your application's behavior and performance
- Evaluate app quality - Leverage your traces to systematically assess and improve application quality
Reference guides
Explore detailed documentation for concepts and features mentioned in this guide.
- All integrations - Browse all 20+ supported libraries and frameworks
- Tracing concepts - Understand the fundamentals of MLflow Tracing
- Tracing data model - Learn about traces, spans, and attributes