

Track users & sessions

Tracking users and sessions in your GenAI application provides essential context for understanding user behavior, analyzing conversation flows, and improving personalization. MLflow offers built-in support for associating traces with users and grouping them into sessions.

Prerequisites

Choose the appropriate installation method based on your environment:

Production

For production deployments, install the mlflow-tracing package:

pip install --upgrade mlflow-tracing

The mlflow-tracing package is optimized for production use with minimal dependencies and better performance characteristics.

Development

For development environments, install the full MLflow package with Databricks extras:

pip install --upgrade "mlflow[databricks]>=3.1"

The full mlflow[databricks] package includes all features needed for local development and experimentation on Databricks.

Note

MLflow 3 is required for user and session tracking. MLflow 2.x is not supported due to performance limitations and missing features essential for production use.

Why track users and sessions?

Tracking users and sessions supports the following kinds of analysis and improvement:

  1. User behavior analysis - Understand how different users interact with your application
  2. Conversation flow tracking - Analyze multi-turn conversations and context retention
  3. Personalization insights - Identify patterns to improve user-specific experiences
  4. Quality per user - Track performance metrics across different user segments
  5. Session continuity - Maintain context across multiple interactions

Standard MLflow metadata fields

MLflow provides two standard metadata fields for session and user tracking:

  • mlflow.trace.user - Associates traces with specific users
  • mlflow.trace.session - Groups traces belonging to multi-turn conversations

When you use these standard metadata fields, MLflow automatically enables filtering and grouping in the UI. Unlike tags, metadata cannot be updated once the trace is logged, making it ideal for immutable identifiers like user and session IDs.

Basic implementation

Here's how to add user and session tracking to your application:

import mlflow

@mlflow.trace
def chat_completion(user_id: str, session_id: str, message: str):
    """Process a chat message with user and session tracking."""

    # Add user and session context to the current trace
    # The @mlflow.trace decorator ensures there's an active trace
    mlflow.update_current_trace(
        metadata={
            "mlflow.trace.user": user_id,      # Links this trace to a specific user
            "mlflow.trace.session": session_id, # Groups this trace with others in the same conversation
        }
    )

    # Your chat logic here
    # The trace will capture the execution time, inputs, outputs, and any errors
    response = generate_response(message)
    return response

# Example usage in a chat application
def handle_user_message(request):
    # Extract user and session IDs from your application's context
    # These IDs should be consistent across all interactions
    return chat_completion(
        user_id=request.user_id,        # e.g., "user-123" - unique identifier for the user
        session_id=request.session_id,   # e.g., "session-abc-456" - groups related messages
        message=request.message
    )

Key points:

  • The @mlflow.trace decorator automatically creates a trace for the function execution
  • mlflow.update_current_trace() adds the user ID and session ID as metadata to the active trace
  • Using metadata rather than tags ensures these identifiers cannot be changed once the trace is logged

Production web application example

In production applications, you typically track user, session, and other contextual information together. The following example is adapted from our Production Observability with Tracing guide. Environment and deployment context, covered in the Track Environments & Context guide, can be attached to the same trace in the same way:

import mlflow
from fastapi import FastAPI, Request
from pydantic import BaseModel

# Initialize FastAPI app
app = FastAPI()

class ChatRequest(BaseModel):
    message: str

@app.post("/chat")  # FastAPI registers the function it receives, so it must wrap the traced function
@mlflow.trace  # Innermost decorator, so each request handled by the route runs inside a trace
def handle_chat(request: Request, chat_request: ChatRequest):
    # Retrieve all context from request headers
    client_request_id = request.headers.get("X-Request-ID")
    session_id = request.headers.get("X-Session-ID")
    user_id = request.headers.get("X-User-ID")

    # Update the current trace with the request, session, and user context
    # The @mlflow.trace decorator ensures an active trace is available
    mlflow.update_current_trace(
        client_request_id=client_request_id,
        metadata={
            # Session context - groups traces from multi-turn conversations
            "mlflow.trace.session": session_id,
            # User context - associates traces with specific users
            "mlflow.trace.user": user_id,

        }
    )

    # --- Your application logic for processing the chat message ---
    # For example, calling a language model with context
    # response_text = my_llm_call(
    #     message=chat_request.message,
    #     session_id=session_id,
    #     user_id=user_id
    # )
    response_text = f"Processed message: '{chat_request.message}'"
    # --- End of application logic ---

    # Return response
    return {
        "response": response_text
    }

# To run this example (requires uvicorn and fastapi):
# uvicorn your_file_name:app --reload
#
# Example curl request with context headers:
# curl -X POST "http://127.0.0.1:8000/chat" \
#      -H "Content-Type: application/json" \
#      -H "X-Request-ID: req-abc-123-xyz-789" \
#      -H "X-Session-ID: session-def-456-uvw-012" \
#      -H "X-User-ID: user-jane-doe-12345" \
#      -d '{"message": "What is my account balance?"}'

This example demonstrates a unified approach to context tracking, capturing:

  • User Information: From the X-User-ID header, logged as mlflow.trace.user metadata.
  • Session Information: From the X-Session-ID header, logged as mlflow.trace.session metadata.
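In the example above, a missing header yields None for the corresponding ID. One way to handle that is to build the metadata dictionary only from headers that are actually present. The following is a minimal sketch under that assumption; the helper name build_trace_metadata and the HEADER_TO_METADATA_KEY mapping are hypothetical and not part of MLflow.

```python
from typing import Mapping

# Hypothetical mapping from the request headers used in the example above
# to the standard MLflow metadata keys. Adjust the header names to your API.
HEADER_TO_METADATA_KEY = {
    "X-Session-ID": "mlflow.trace.session",
    "X-User-ID": "mlflow.trace.user",
}


def build_trace_metadata(headers: Mapping[str, str]) -> dict[str, str]:
    """Return trace metadata for the context headers that are present and non-empty."""
    metadata = {}
    for header, key in HEADER_TO_METADATA_KEY.items():
        value = headers.get(header)
        # Skip missing or blank headers instead of logging None or "" as an ID
        if value is not None and value.strip():
            metadata[key] = value.strip()
    return metadata
```

Inside the request handler, the result could then be passed as mlflow.update_current_trace(metadata=build_trace_metadata(request.headers)). Note that a plain dict lookup is case-sensitive, whereas the headers object provided by FastAPI is not.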

Querying and analyzing data

Using the MLflow UI

Filter traces in the MLflow UI using these search queries:

# Find all traces for a specific user
metadata.`mlflow.trace.user` = 'user-123'

# Find all traces in a session
metadata.`mlflow.trace.session` = 'session-abc-456'

# Find traces for a user within a specific session
metadata.`mlflow.trace.user` = 'user-123' AND metadata.`mlflow.trace.session` = 'session-abc-456'
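The same filter strings can also be composed in code. The sketch below is one possible approach: the function name build_metadata_filter is hypothetical, and rejecting quote characters is a conservative assumption made here rather than a documented MLflow escaping rule.

```python
def build_metadata_filter(conditions: dict[str, str]) -> str:
    """Join metadata equality conditions into a single search filter string."""
    clauses = []
    for key, value in conditions.items():
        # Refuse characters that would break out of the quoted key or value
        if "`" in key:
            raise ValueError(f"metadata key must not contain a backtick: {key!r}")
        if "'" in value:
            raise ValueError(f"metadata value must not contain a single quote: {value!r}")
        clauses.append(f"metadata.`{key}` = '{value}'")
    return " AND ".join(clauses)
```

For example, build_metadata_filter({"mlflow.trace.user": "user-123", "mlflow.trace.session": "session-abc-456"}) produces the combined user and session query shown above.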

Programmatic analysis

Use the MLflow SDK to analyze user and session data programmatically. This enables you to build custom analytics, generate reports, and monitor user behavior patterns at scale.

from mlflow.client import MlflowClient

client = MlflowClient()

# Analyze user behavior
def analyze_user_behavior(user_id: str, experiment_id: str):
    """Analyze a specific user's interaction patterns."""

    # Search for all traces from a specific user
    user_traces = client.search_traces(
        experiment_ids=[experiment_id],
        filter_string=f"metadata.`mlflow.trace.user` = '{user_id}'",
        max_results=1000
    )

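    # Calculate key metrics, guarding against users with no traces
    total_interactions = len(user_traces)
    session_ids = {t.info.trace_metadata.get("mlflow.trace.session") for t in user_traces}
    unique_sessions = len(session_ids - {None})  # Ignore traces without a session ID
    durations = [t.info.execution_time_ms for t in user_traces if t.info.execution_time_ms is not None]
    avg_response_time = sum(durations) / len(durations) if durations else 0.0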

    return {
        "total_interactions": total_interactions,
        "unique_sessions": unique_sessions,
        "avg_response_time": avg_response_time
    }

# Analyze session flow
def analyze_session_flow(session_id: str, experiment_id: str):
    """Analyze conversation flow within a session."""

    # Get all traces from a session, ordered chronologically
    session_traces = client.search_traces(
        experiment_ids=[experiment_id],
        filter_string=f"metadata.`mlflow.trace.session` = '{session_id}'",
        order_by=["timestamp ASC"]
    )

    # Build a timeline of the conversation
    conversation_turns = []
    for i, trace in enumerate(session_traces):
        conversation_turns.append({
            "turn": i + 1,
            "timestamp": trace.info.timestamp_ms,  # Start time in milliseconds since the epoch
            "duration_ms": trace.info.execution_time_ms,
            "status": trace.info.status
        })

    return conversation_turns

Key capabilities:

  • User behavior analysis - Track interaction frequency, session count, and performance metrics per user
  • Session flow analysis - Reconstruct conversation timelines to understand multi-turn interactions
  • Flexible filtering - Use MLflow's search syntax to query traces by any combination of metadata fields
  • Scalable analysis - Process thousands of traces programmatically for large-scale insights
  • Export-ready data - Results can be easily converted to DataFrames or exported for further analysis
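As an illustration of the export point, the list of turns returned by analyze_session_flow can be written out as CSV with the standard library alone. This is a sketch that assumes each turn is a dictionary with the turn, timestamp, duration_ms, and status keys used above; the function name turns_to_csv is hypothetical.

```python
import csv
import io

# Column order matches the keys produced by analyze_session_flow
TURN_FIELDS = ["turn", "timestamp", "duration_ms", "status"]


def turns_to_csv(turns: list[dict]) -> str:
    """Serialize conversation turns to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=TURN_FIELDS, lineterminator="\n")
    writer.writeheader()
    for turn in turns:
        # Missing keys become empty cells rather than raising an error
        writer.writerow({name: turn.get(name, "") for name in TURN_FIELDS})
    return buffer.getvalue()
```

The resulting text can be saved to a file or loaded into a spreadsheet or DataFrame for further analysis.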

Best practices

  1. Consistent ID formats - Use standardized formats for user and session IDs
  2. Session boundaries - Define clear rules for when sessions start and end
  3. Metadata enrichment - Add additional context like user segments or session types
  4. Combine with request tracking - Link user/session data with request IDs for complete traceability
  5. Regular analysis - Set up dashboards to monitor user behavior and session patterns
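To make the session boundary practice concrete, here is a minimal sketch of one possible rule: a session ends after a period of inactivity, and the next message from that user starts a new session. The SessionRegistry class, the 30-minute timeout, and the session- ID prefix are all assumptions chosen for illustration, not MLflow features.

```python
import uuid
from dataclasses import dataclass, field

# Assumed rule: a session expires after 30 minutes without activity
SESSION_IDLE_TIMEOUT_SECONDS = 30 * 60


@dataclass
class SessionRegistry:
    """In-memory map of each user's current session (single process only)."""

    idle_timeout: float = SESSION_IDLE_TIMEOUT_SECONDS
    # Maps user_id to (session_id, time of last activity in seconds)
    _active: dict = field(default_factory=dict)

    def session_for(self, user_id: str, now: float) -> str:
        """Return the user's current session ID, starting a new one after a timeout."""
        current = self._active.get(user_id)
        if current is not None and now - current[1] <= self.idle_timeout:
            session_id = current[0]
        else:
            session_id = f"session-{uuid.uuid4().hex}"
        # Record this activity so the idle window restarts from now
        self._active[user_id] = (session_id, now)
        return session_id
```

In an application, now would typically be time.time(), and the returned ID would be logged as the mlflow.trace.session metadata value. A deployment with multiple workers would need a shared store in place of the in-memory dictionary.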

Integration with other MLflow features

User and session tracking works alongside other MLflow tracing capabilities, such as request IDs and feedback collection.

Production considerations

For a comprehensive production implementation, see our guide on production observability with tracing, which covers:

  • Setting up user and session tracking in production environments
  • Combining session IDs with request IDs for complete traceability
  • Implementing feedback collection for entire sessions
  • Best practices for high-volume session management
