Overview
MLflow version tracking enables you to create versioned representations of your GenAI applications using the LoggedModel entity. This page provides the API reference for tracking application versions in MLflow.
Why Version Your GenAI Application?
Reproducibility: Capture or link to the exact code (e.g., Git commit hash) and configurations used for a specific version, ensuring you can always reconstruct it.
Debugging Regressions: Track LoggedModel versions to easily compare problematic versions against known good versions by examining differences in code, configurations, evaluation results, and traces.
Objective Comparison: Systematically evaluate versions using mlflow.genai.evaluate() to compare metrics like quality scores, cost, and latency side-by-side.
Auditability: Each LoggedModel version serves as an auditable record, linking to specific code and configurations for compliance and incident investigation.
Core Concepts
LoggedModel
A LoggedModel in MLflow represents a specific version of your GenAI application. Each distinct state of your application that you want to evaluate, deploy, or refer back to can be captured as a new LoggedModel.
Key characteristics:
- Central versioned entity for your GenAI application
- Captures application state including configuration and parameters
- Links to external code (typically via Git commit hash)
- Tracks lifecycle from development through production
Version Tracking Methods
MLflow provides two approaches for version tracking:
- set_active_model: Simple version tracking that automatically creates a LoggedModel if needed and links subsequent traces
- create_external_model: Full control over version creation with extensive metadata, parameters, and tags
API Reference
set_active_model
Links traces to a specific LoggedModel version. If a model with the given name doesn't exist, it automatically creates one.
def set_active_model(
    name: Optional[str] = None,
    model_id: Optional[str] = None
) -> ActiveModel:
Parameters
Parameter | Type | Required | Description |
---|---|---|---|
name | str \| None | No* | Name of the model. If the model doesn't exist, a new one is created |
model_id | str \| None | No* | ID of an existing LoggedModel |
*Either name or model_id must be provided.
Return Value
Returns an ActiveModel object (subclass of LoggedModel) that can be used as a context manager.
Example Usage
import mlflow
# Simple usage - creates model if it doesn't exist
mlflow.set_active_model(name="my-agent-v1.0")
# Use as context manager
with mlflow.set_active_model(name="my-agent-v2.0") as model:
    print(f"Model ID: {model.model_id}")
    # Traces within this context are linked to this model
# Use with existing model ID
mlflow.set_active_model(model_id="existing-model-id")
create_external_model
Creates a new LoggedModel for applications whose code and artifacts are stored outside MLflow (e.g., in Git).
def create_external_model(
    name: Optional[str] = None,
    source_run_id: Optional[str] = None,
    tags: Optional[dict[str, str]] = None,
    params: Optional[dict[str, str]] = None,
    model_type: Optional[str] = None,
    experiment_id: Optional[str] = None,
) -> LoggedModel:
Parameters
Parameter | Type | Required | Description |
---|---|---|---|
name | str \| None | No | Model name. If not specified, a random name is generated |
source_run_id | str \| None | No | ID of the associated run. Defaults to the active run ID if within a run context |
tags | dict[str, str] \| None | No | Key-value pairs for organization and filtering |
params | dict[str, str] \| None | No | Model parameters and configuration (values must be strings) |
model_type | str \| None | No | User-defined type for categorization (e.g., "agent", "rag-system") |
experiment_id | str \| None | No | Experiment to associate with. Uses the active experiment if not specified |
Return Value
Returns a LoggedModel object with:
- model_id: Unique identifier for the model
- name: The assigned model name
- experiment_id: Associated experiment ID
- creation_timestamp: When the model was created
- status: Model status (always "READY" for external models)
- tags: Dictionary of tags
- params: Dictionary of parameters
Example Usage
import mlflow
# Basic usage
model = mlflow.create_external_model(
    name="customer-support-agent-v1.0"
)
# With full metadata
model = mlflow.create_external_model(
    name="recommendation-engine-v2.1",
    model_type="rag-agent",
    params={
        "llm_model": "gpt-4",
        "temperature": "0.7",
        "max_tokens": "1000",
        "retrieval_k": "5"
    },
    tags={
        "team": "ml-platform",
        "environment": "staging",
        "git_commit": "abc123def"
    }
)
# Within a run context
with mlflow.start_run() as run:
    model = mlflow.create_external_model(
        name="my-agent-v3.0",
        source_run_id=run.info.run_id
    )
LoggedModel Class
The LoggedModel class represents a versioned model in MLflow.
Properties
Property | Type | Description |
---|---|---|
model_id | str | Unique identifier for the model |
name | str | Model name |
experiment_id | str | Associated experiment ID |
creation_timestamp | int | Creation time (milliseconds since epoch) |
last_updated_timestamp | int | Last update time (milliseconds since epoch) |
model_type | str \| None | User-defined model type |
source_run_id | str \| None | ID of the run that created this model |
status | LoggedModelStatus | Model status (READY, FAILED_REGISTRATION, etc.) |
tags | dict[str, str] | Dictionary of tags |
params | dict[str, str] | Dictionary of parameters |
model_uri | str | URI for referencing the model (e.g., "models:/model_id") |
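Because creation_timestamp and last_updated_timestamp are integers in milliseconds since the epoch, they usually need converting before display. The following is a minimal sketch using only the standard library; ms_to_datetime is a hypothetical helper name, not part of MLflow.

```python
from datetime import datetime, timezone


def ms_to_datetime(timestamp_ms: int) -> datetime:
    """Convert milliseconds since the Unix epoch to a timezone-aware UTC datetime.

    Hypothetical helper (not an MLflow API); it only assumes the integer
    millisecond format described in the properties table.
    """
    return datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)
```

For example, ms_to_datetime(model.creation_timestamp).isoformat() would give an ISO 8601 string for a LoggedModel named model.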
Common Patterns
Version Tracking with Git Integration
import mlflow
import subprocess
# Get current git commit
git_commit = subprocess.check_output(["git", "rev-parse", "HEAD"]).decode().strip()[:8]
# Create versioned model name
model_name = f"my-app-git-{git_commit}"
# Track the version
model = mlflow.create_external_model(
    name=model_name,
    tags={"git_commit": git_commit}
)
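The Git lookup above raises an exception when the code runs outside a Git checkout or when git is not installed (for example, in some container images). One possible way to handle that case is sketched below; current_git_commit, versioned_model_name, and the "unknown" fallback are illustrative assumptions, not MLflow APIs.

```python
import subprocess
from typing import Optional


def current_git_commit(short_len: int = 8) -> Optional[str]:
    """Return the abbreviated HEAD commit hash, or None if it cannot be read."""
    try:
        out = subprocess.check_output(
            ["git", "rev-parse", "HEAD"],
            stderr=subprocess.DEVNULL,  # keep git's error text out of the logs
        )
    except (subprocess.CalledProcessError, OSError):
        # Not a Git repository, no commits yet, or git is not installed.
        return None
    return out.decode().strip()[:short_len]


def versioned_model_name(prefix: str, commit: Optional[str]) -> str:
    """Build a name like 'my-app-git-abc123de'; use 'unknown' without a commit."""
    return f"{prefix}-git-{commit or 'unknown'}"
```

The resulting name can then be passed as name to mlflow.create_external_model(), with the commit added to tags only when it is not None.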
Linking Traces to Versions
import mlflow
# Set active model - all subsequent traces will be linked
mlflow.set_active_model(name="my-agent-v1.0")
# Your application code with tracing
@mlflow.trace
def process_request(query: str):
    # This trace will be automatically linked to my-agent-v1.0
    return f"Processing: {query}"
# Run the application
result = process_request("Hello world")
Production Deployment
In production, use environment variables instead of calling set_active_model():
# Set the model ID that traces should be linked to (use the LoggedModel's model_id value)
export MLFLOW_ACTIVE_MODEL_ID="my-agent-v1.0"
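A missing or blank variable can go unnoticed, leaving traces unlinked to any version. A startup check along the following lines may help; it is a sketch that reads only the MLFLOW_ACTIVE_MODEL_ID variable named above, and active_model_id_from_env is a hypothetical helper rather than an MLflow function.

```python
import os
from typing import Optional

ACTIVE_MODEL_ENV_VAR = "MLFLOW_ACTIVE_MODEL_ID"


def active_model_id_from_env() -> Optional[str]:
    """Return the configured model ID, or None if the variable is unset or blank."""
    value = os.environ.get(ACTIVE_MODEL_ENV_VAR, "").strip()
    return value or None
```

An application could call this once at startup and log a warning when it returns None.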
Best Practices
- Use semantic versioning in model names (e.g., "app-v1.2.3")
- Include git commits in tags for traceability
- Parameters must be strings - convert numbers and booleans
- Use model_type to categorize similar applications
- Set active model before tracing to ensure proper linkage
Common Issues
Invalid parameter types:
# Error: Parameters must be strings
# Wrong:
params = {"temperature": 0.7, "max_tokens": 1000}
# Correct:
params = {"temperature": "0.7", "max_tokens": "1000"}
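When configuration values come from code as numbers or booleans, the conversion can be done in one place before the dictionary is passed as params. The helper below is a minimal sketch; stringify_params is a hypothetical name, and it relies on Python's built-in str(), so True becomes "True".

```python
from typing import Any, Mapping


def stringify_params(params: Mapping[str, Any]) -> dict[str, str]:
    """Return a copy of params with every key and value converted via str().

    Hypothetical helper (not an MLflow API) that produces the dict[str, str]
    shape required for the params argument.
    """
    return {str(key): str(value) for key, value in params.items()}
```

For example, stringify_params({"temperature": 0.7, "max_tokens": 1000}) yields {"temperature": "0.7", "max_tokens": "1000"}, which matches the correct form shown above.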
Next Steps
- Track application versions - Step-by-step guide to version your GenAI app
- Link production traces - Connect production data to app versions
- Package for deployment - Deploy versioned apps to Model Serving