Version Tracking API Reference

2025-06-11

Overview

MLflow version tracking enables you to create versioned representations of your GenAI applications using the LoggedModel entity. This page provides the API reference for tracking application versions in MLflow.

Why Version Your GenAI Application?

Reproducibility: Capture or link to the exact code (e.g., Git commit hash) and configurations used for a specific version, ensuring you can always reconstruct it.

Debugging Regressions: Compare a problematic LoggedModel version against a known good one by examining differences in code, configurations, evaluation results, and traces.

Objective Comparison: Systematically evaluate versions using mlflow.genai.evaluate() to compare metrics like quality scores, cost, and latency side-by-side.

Auditability: Each LoggedModel version serves as an auditable record, linking to specific code and configurations for compliance and incident investigation.

Core Concepts

LoggedModel

A LoggedModel in MLflow represents a specific version of your GenAI application. Each distinct state of your application that you want to evaluate, deploy, or refer back to can be captured as a new LoggedModel.

Key characteristics:

Central versioned entity for your GenAI application
Captures application state including configuration and parameters
Links to external code (typically via Git commit hash)
Tracks lifecycle from development through production

Version Tracking Methods

MLflow provides two approaches for version tracking:

set_active_model: Simple version tracking. Creates a LoggedModel automatically if one with the given name does not exist, and links subsequent traces to it.
create_external_model: Full control over version creation, including metadata, parameters, and tags.

API Reference

set_active_model

Links traces to a specific LoggedModel version. If a model with the given name doesn't exist, it automatically creates one.

def set_active_model(
    name: Optional[str] = None,
    model_id: Optional[str] = None
) -> ActiveModel:

Parameters

Parameter	Type	Required	Description
`name`	`str | None`	No*	Name of the model. If the model doesn't exist, a new one is created
`model_id`	`str | None`	No*	ID of an existing LoggedModel

*Either name or model_id must be provided.

Return Value

Returns an ActiveModel object (subclass of LoggedModel) that can be used as a context manager.

Example Usage

import mlflow

# Simple usage - creates model if it doesn't exist
mlflow.set_active_model(name="my-agent-v1.0")

# Use as context manager
with mlflow.set_active_model(name="my-agent-v2.0") as model:
    print(f"Model ID: {model.model_id}")
    # Traces within this context are linked to this model

# Use with existing model ID
mlflow.set_active_model(model_id="existing-model-id")

create_external_model

Creates a new LoggedModel for applications whose code and artifacts are stored outside MLflow (e.g., in Git).

def create_external_model(
    name: Optional[str] = None,
    source_run_id: Optional[str] = None,
    tags: Optional[dict[str, str]] = None,
    params: Optional[dict[str, str]] = None,
    model_type: Optional[str] = None,
    experiment_id: Optional[str] = None,
) -> LoggedModel:

Parameters

Parameter	Type	Required	Description
`name`	`str | None`	No	Model name. If not specified, a random name is generated
`source_run_id`	`str | None`	No	ID of the associated run. Defaults to the active run ID if within a run context
`tags`	`dict[str, str] | None`	No	Key-value pairs for organization and filtering
`params`	`dict[str, str] | None`	No	Model parameters and configuration (must be strings)
`model_type`	`str | None`	No	User-defined type for categorization (e.g., "agent", "rag-system")
`experiment_id`	`str | None`	No	Experiment to associate with. Uses the active experiment if not specified

Return Value

Returns a LoggedModel object with:

model_id: Unique identifier for the model
name: The assigned model name
experiment_id: Associated experiment ID
creation_timestamp: When the model was created
status: Model status (always "READY" for external models)
tags: Dictionary of tags
params: Dictionary of parameters

Example Usage

import mlflow

# Basic usage
model = mlflow.create_external_model(
    name="customer-support-agent-v1.0"
)

# With full metadata
model = mlflow.create_external_model(
    name="recommendation-engine-v2.1",
    model_type="rag-agent",
    params={
        "llm_model": "gpt-4",
        "temperature": "0.7",
        "max_tokens": "1000",
        "retrieval_k": "5"
    },
    tags={
        "team": "ml-platform",
        "environment": "staging",
        "git_commit": "abc123def"
    }
)

# Within a run context
with mlflow.start_run() as run:
    model = mlflow.create_external_model(
        name="my-agent-v3.0",
        source_run_id=run.info.run_id
    )

LoggedModel Class

The LoggedModel class represents a versioned model in MLflow.

Properties

Property	Type	Description
`model_id`	`str`	Unique identifier for the model
`name`	`str`	Model name
`experiment_id`	`str`	Associated experiment ID
`creation_timestamp`	`int`	Creation time (milliseconds since epoch)
`last_updated_timestamp`	`int`	Last update time (milliseconds since epoch)
`model_type`	`str | None`	User-defined model type
`source_run_id`	`str | None`	ID of the run that created this model
`status`	`LoggedModelStatus`	Model status (READY, FAILED_REGISTRATION, etc.)
`tags`	`dict[str, str]`	Dictionary of tags
`params`	`dict[str, str]`	Dictionary of parameters
`model_uri`	`str`	URI for referencing the model (e.g., "models:/<model_id>")
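The timestamp properties are integers in milliseconds since the epoch, so they usually need converting before display. The following is a minimal sketch using only the standard library. The helper names `ms_to_datetime` and `model_uri_for` are hypothetical and not part of MLflow, and the URI helper simply assumes the `models:/` format shown in the table above.

```python
from datetime import datetime, timezone


def ms_to_datetime(timestamp_ms: int) -> datetime:
    """Convert milliseconds since the Unix epoch to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)


def model_uri_for(model_id: str) -> str:
    """Build a models:/ URI from a model ID (hypothetical helper, format assumed)."""
    return f"models:/{model_id}"
```

With a LoggedModel in hand, `ms_to_datetime(model.creation_timestamp)` would give a readable creation time. In practice the `model_uri` property should be preferred over rebuilding the URI by hand.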

Common Patterns

Version Tracking with Git Integration

import mlflow
import subprocess

# Get the current Git commit hash, shortened to its first 8 characters
git_commit = subprocess.check_output(["git", "rev-parse", "HEAD"]).decode().strip()[:8]

# Create versioned model name
model_name = f"my-app-git-{git_commit}"

# Track the version
model = mlflow.create_external_model(
    name=model_name,
    tags={"git_commit": git_commit}
)
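The snippet above raises an exception when it runs outside a Git checkout or on a machine without Git installed. One possible way to make it more tolerant is sketched below. The helper names `current_git_commit` and `version_name`, and the `"unknown"` fallback suffix, are assumptions made for illustration rather than MLflow conventions.

```python
import subprocess
from typing import Optional


def current_git_commit(length: int = 8) -> Optional[str]:
    """Return the abbreviated HEAD commit hash, or None if it cannot be determined."""
    try:
        output = subprocess.check_output(
            ["git", "rev-parse", "HEAD"], stderr=subprocess.DEVNULL
        )
    except (subprocess.CalledProcessError, OSError):
        # Not inside a Git repository, or the git executable is unavailable.
        return None
    return output.decode().strip()[:length]


def version_name(app_name: str, commit: Optional[str]) -> str:
    """Build a model name such as 'my-app-git-abc123de' (hypothetical naming scheme)."""
    suffix = commit if commit else "unknown"
    return f"{app_name}-git-{suffix}"
```

The resulting name could then be passed as `name=` to `mlflow.create_external_model`, with the commit stored in `tags` only when it is not `None`.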

Linking Traces to Versions

import mlflow

# Set active model - all subsequent traces will be linked
mlflow.set_active_model(name="my-agent-v1.0")

# Your application code with tracing
@mlflow.trace
def process_request(query: str):
    # This trace will be automatically linked to my-agent-v1.0
    return f"Processing: {query}"

# Run the application
result = process_request("Hello world")

Production Deployment

In production, set the MLFLOW_ACTIVE_MODEL_ID environment variable instead of calling set_active_model():

# Set the ID of the LoggedModel that traces should be linked to
# (use the model's model_id, not its name)
export MLFLOW_ACTIVE_MODEL_ID="<your-model-id>"
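Because the link depends on an environment variable, a missing or blank value is easy to overlook. A service might check for it at startup, as in the sketch below. The function names are hypothetical, and the check is a caller-side convenience rather than anything MLflow requires.

```python
import os
from typing import Mapping, Optional

ACTIVE_MODEL_ENV_VAR = "MLFLOW_ACTIVE_MODEL_ID"


def active_model_id_from_env(environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the configured model ID, or None if the variable is unset or blank."""
    value = environ.get(ACTIVE_MODEL_ENV_VAR, "").strip()
    return value or None


def require_active_model_id(environ: Mapping[str, str] = os.environ) -> str:
    """Fail fast at startup when no model ID has been configured."""
    model_id = active_model_id_from_env(environ)
    if model_id is None:
        raise RuntimeError(
            f"{ACTIVE_MODEL_ENV_VAR} is not set; traces may not be linked to a version"
        )
    return model_id
```

Accepting the environment as a mapping keeps the check easy to unit test without modifying the real process environment.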

Best Practices

Use semantic versioning in model names (e.g., "app-v1.2.3")
Include git commits in tags for traceability
Parameters must be strings: convert numbers and booleans before passing them
Use model_type to categorize similar applications
Set active model before tracing to ensure proper linkage
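If model names follow the semantic versioning practice above, they can be validated and compared programmatically. The sketch below assumes a `<app>-v<major>.<minor>.<patch>` naming pattern. Both the pattern and the `parse_versioned_name` helper are illustrative assumptions, not something MLflow enforces.

```python
import re
from typing import Optional, Tuple

# Assumed naming pattern: "<app>-v<major>.<minor>.<patch>", e.g. "app-v1.2.3".
_VERSIONED_NAME = re.compile(
    r"^(?P<app>[A-Za-z0-9][A-Za-z0-9_-]*)-v(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)$"
)


def parse_versioned_name(name: str) -> Optional[Tuple[str, Tuple[int, int, int]]]:
    """Split a model name into (app, (major, minor, patch)), or return None if it does not match."""
    match = _VERSIONED_NAME.match(name)
    if match is None:
        return None
    version = (int(match["major"]), int(match["minor"]), int(match["patch"]))
    return match["app"], version
```

Parsing the version into a tuple of integers lets versions sort numerically, so `1.10.0` orders after `1.9.0`, which a plain string comparison would get wrong.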

Common Issues

Invalid parameter types:

# Error: Parameters must be strings
# Wrong:
params = {"temperature": 0.7, "max_tokens": 1000}

# Correct:
params = {"temperature": "0.7", "max_tokens": "1000"}
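When the configuration already lives in a dictionary with mixed types, converting it by hand is error-prone. A small helper along these lines can do the conversion in one place. `stringify_params` is a hypothetical name, and serializing nested values as JSON is one reasonable choice among several.

```python
import json
from typing import Any, Dict, Mapping


def stringify_params(params: Mapping[str, Any]) -> Dict[str, str]:
    """Return a copy of params in which every key and value is a string."""
    result: Dict[str, str] = {}
    for key, value in params.items():
        if isinstance(value, str):
            result[str(key)] = value
        elif isinstance(value, (dict, list, tuple)):
            # Nested structures are serialized as JSON so they can be parsed back later.
            result[str(key)] = json.dumps(value)
        else:
            # Numbers, booleans, and None use their Python string form, e.g. "0.7" or "True".
            result[str(key)] = str(value)
    return result
```

The result can be passed directly as `params=` to `mlflow.create_external_model`. Values read back from a LoggedModel remain strings, so the application has to convert them to their original types itself.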

Next Steps

Track application versions - Step-by-step guide to version your GenAI app
Link production traces - Connect production data to app versions
Package for deployment - Deploy versioned apps to Model Serving
