This article describes how to build, deploy, and run a Python wheel file as part of a Databricks Asset Bundle project. See What are Databricks Asset Bundles?.
For an example configuration that builds a JAR and uploads it to Unity Catalog, see Bundle that uploads a JAR file to Unity Catalog.
Requirements
- Databricks CLI version 0.218.0 or above is installed, and authentication is configured. To check your installed version of the Databricks CLI, run the command `databricks -v`. To install the Databricks CLI, see Install or update the Databricks CLI. To configure authentication, see Configure access to your workspace.
- The remote workspace must have workspace files enabled. See What are workspace files?.
Create the bundle using a template
In these steps, you create the bundle using the Azure Databricks default bundle template for Python. The bundle consists of files to build into a Python wheel file and the definition of an Azure Databricks job that builds this Python wheel file. You then validate and deploy the bundle, and run the job to build the deployed files into a Python wheel file within your Azure Databricks workspace.
Note
The Azure Databricks default bundle template for Python uses `uv` to build the Python wheel file. To install `uv`, see Installing uv.
If you want to create a bundle from scratch, see Create a bundle manually.
Step 1: Create the bundle
A bundle contains the artifacts you want to deploy and the settings for the workflows you want to run.
- Use your terminal or command prompt to switch to a directory on your local development machine that will contain the template's generated bundle.
- Use the Databricks CLI to run the `bundle init` command (a scripted, non-interactive alternative is sketched after the prompt list below):

  ```
  databricks bundle init
  ```
- For Template to use, leave the default value of `default-python` by pressing Enter.
- For Unique name for this project, leave the default value of `my_project`, or type a different value, and then press Enter. This determines the name of the root directory for this bundle. This root directory is created within your current working directory.
- For Include a stub (sample) notebook, select `no` and press Enter. This instructs the Databricks CLI to not add a sample notebook to your bundle.
- For Include a stub (sample) Delta Live Tables pipeline, select `no` and press Enter. This instructs the Databricks CLI to not define a sample pipeline in your bundle.
- For Include a stub (sample) Python package, leave the default value of `yes` by pressing Enter. This instructs the Databricks CLI to add sample Python wheel package files and related build instructions to your bundle.
- For Use serverless, select `yes` and press Enter. This instructs the Databricks CLI to configure your bundle to run on serverless compute.
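Rather than answering each prompt interactively, you can pass the answers as a JSON file to `databricks bundle init` with the `--config-file` flag. This is a minimal sketch, assuming the default-python template's parameter keys mirror the prompts above; the key names (`project_name`, `include_notebook`, `include_dlt`, `include_python`) are assumptions, so inspect the template's schema if initialization fails:

```
# Hypothetical non-interactive init; parameter key names are assumed.
cat > init-config.json <<'EOF'
{
  "project_name": "my_project",
  "include_notebook": "no",
  "include_dlt": "no",
  "include_python": "yes"
}
EOF
databricks bundle init default-python --config-file ./init-config.json
```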
Step 2: Explore the bundle
To view the files that the template generated, switch to the root directory of your newly created bundle and open this directory with your preferred IDE. Files of particular interest include the following:
- `databricks.yml`: This file specifies the bundle's name, the `whl` build settings, a reference to the job configuration file, and settings for target workspaces.
- `resources/<project-name>_job.yml`: This file specifies the Python wheel job's settings.
- `src/<project-name>`: This directory includes the files that the Python wheel job uses to build the Python wheel file.
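For orientation, the generated job definition has roughly the following shape. This is a trimmed sketch assuming the default project name `my_project` from Step 1; the template's actual output includes additional settings (for example, compute and notification configuration):

```yaml
# Sketch of resources/my_project_job.yml (trimmed; actual template output differs).
resources:
  jobs:
    my_project_job:
      name: my_project_job
      tasks:
        - task_key: main_task
          python_wheel_task:
            package_name: my_project
            entry_point: main
          libraries:
            # The wheel built from src/my_project is attached to the task.
            - whl: ../dist/*.whl
```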
Note
If you want to install the Python wheel file on a cluster with Databricks Runtime 12.2 LTS or below, you must add the following top-level mapping to the `databricks.yml` file:

```yaml
# Applies to all tasks of type python_wheel_task.
experimental:
  python_wheel_wrapper: true
```
Step 3: Validate the project's bundle configuration file
In this step, you check whether the bundle configuration is valid.
From the root directory, use the Databricks CLI to run the `bundle validate` command, as follows:

```
databricks bundle validate
```
If a summary of the bundle configuration is returned, then the validation succeeded. If any errors are returned, fix the errors, and then repeat this step.
If you make any changes to your bundle after this step, you should repeat this step to check whether your bundle configuration is still valid.
Step 4: Build the Python wheel file and deploy the local project to the remote workspace
In this step, the Python wheel file is built and deployed to your remote Azure Databricks workspace, and an Azure Databricks job is created within your workspace.

Use the Databricks CLI to run the `bundle deploy` command as follows:

```
databricks bundle deploy -t dev
```
To check whether the locally built Python wheel file was deployed:
- In your Azure Databricks workspace's sidebar, click Workspace.
- Click into the following folder: Workspace > Users > `<your-username>` > .bundle > `<project-name>` > dev > artifacts > .internal > `<random-guid>`.
The Python wheel file should be in this folder.
To check whether the job was created:
- In your Azure Databricks workspace's sidebar, click Jobs & Pipelines.
- Optionally, select the Jobs and Owned by me filters.
- Click [dev `<your-username>`] `<project-name>`_job.
- Click the Tasks tab.
There should be one task: main_task.
If you make any changes to your bundle after this step, repeat steps 3-4 to check whether your bundle configuration is still valid and then redeploy the project.
Step 5: Run the deployed project
In this step, you run the Azure Databricks job in your workspace.
From the root directory, use the Databricks CLI to run the `bundle run` command, as follows, replacing `<project-name>` with the name of your project from Step 1:

```
databricks bundle run -t dev <project-name>_job
```

- Copy the value of Run URL that appears in your terminal and paste this value into your web browser to open your Azure Databricks workspace.
- In your Azure Databricks workspace, after the task completes successfully and shows a green title bar, click the main_task task to see the results.
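For example, with the default project name `my_project` from Step 1, the command would be:

```
databricks bundle run -t dev my_project_job
```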
Build the whl using Poetry or setuptools
When you use `databricks bundle init` with the default-python template, the generated bundle shows how to build a Python wheel using `uv` and `pyproject.toml`. However, you may want to use Poetry or `setuptools` instead to build the wheel.
Install Poetry or setuptools
Install Poetry or `setuptools`:

Poetry

- Install Poetry, version 1.6 or above, if it is not already installed. To check your installed version of Poetry, run the command `poetry -V` or `poetry --version`.
- Make sure you have Python version 3.10 or above installed. To check your version of Python, run the command `python -V` or `python --version`.

Setuptools

Install the `wheel` and `setuptools` packages if they are not already installed, by running the following command:

```
pip3 install --upgrade wheel setuptools
```
If you intend to store this bundle with a Git provider, add a `.gitignore` file in the project's root, and add the following entries to this file:

Poetry

```
.databricks
dist
```

Setuptools

```
.databricks
build
dist
src/my_package/my_package.egg-info
```
Add build files
In your bundle's root, create the following folders and files, depending on whether you use Poetry or `setuptools` for building Python wheel files:

Poetry

```
├── src
│     └── my_package
│           ├── __init__.py
│           ├── main.py
│           └── my_module.py
└── pyproject.toml
```

Setuptools

```
├── src
│     └── my_package
│           ├── __init__.py
│           ├── main.py
│           └── my_module.py
└── setup.py
```
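Both layouts assume that `main.py` exposes a `main` function, which the `pyproject.toml` and `setup.py` examples below reference as `my_package.main:main`. A minimal sketch (the body is illustrative; real logic would typically live in `my_module.py`):

```python
# src/my_package/main.py
# Minimal sketch of the wheel's entry point; referenced below as
# my_package.main:main. Replace the body with calls into my_module.


def main():
    print("Hello from my_package")


if __name__ == "__main__":
    main()
```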
Add the following code to the `pyproject.toml` or `setup.py` file:

Pyproject.toml

```toml
[tool.poetry]
name = "my_package"
version = "0.0.1"
description = "<my-package-description>"
authors = ["my-author-name <my-author-name>@<my-organization>"]

[tool.poetry.dependencies]
python = "^3.10"

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"

[tool.poetry.scripts]
main = "my_package.main:main"
```
- Replace `my-author-name` with your organization's primary contact name.
- Replace `<my-author-name>@<my-organization>` with your organization's primary email contact address.
- Replace `<my-package-description>` with a display description for your Python wheel file.
Setup.py

```python
from setuptools import setup, find_packages

import src

setup(
    name="my_package",
    version="0.0.1",
    author="<my-author-name>",
    url="https://<my-url>",
    author_email="<my-author-name>@<my-organization>",
    description="<my-package-description>",
    packages=find_packages(where='./src'),
    package_dir={'': 'src'},
    entry_points={
        "packages": [
            "main=my_package.main:main"
        ]
    },
    install_requires=[
        "setuptools"
    ]
)
```
- Replace `https://<my-url>` with your organization's URL.
- Replace `<my-author-name>` with your organization's primary contact name.
- Replace `<my-author-name>@<my-organization>` with your organization's primary email contact address.
- Replace `<my-package-description>` with a display description for your Python wheel file.
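Optionally, verify the packaging setup by building the wheel locally before wiring it into the bundle; both commands write the wheel to `dist/`:

```
# Poetry
poetry build

# setuptools
python3 setup.py bdist_wheel
```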
Add artifacts bundle configuration
Add the `artifacts` mapping configuration to your `databricks.yml` to build the `whl` artifact:

Poetry

This configuration runs the `poetry build` command and indicates that the `pyproject.toml` file is in the same directory as the `databricks.yml` file.

Note

If you have already built a Python wheel file and just want to deploy it, then modify the following bundle configuration file by omitting the `artifacts` mapping. The Databricks CLI will then assume that the Python wheel file is already built and will automatically deploy the files that are specified in the `libraries` array's `whl` entries.

```yaml
bundle:
  name: my-wheel-bundle

artifacts:
  default:
    type: whl
    build: poetry build
    path: .

resources:
  jobs:
    wheel-job:
      name: wheel-job
      tasks:
        - task_key: wheel-task
          new_cluster:
            spark_version: 13.3.x-scala2.12
            node_type_id: Standard_DS3_v2
            data_security_mode: USER_ISOLATION
            num_workers: 1
          python_wheel_task:
            entry_point: main
            package_name: my_package
          libraries:
            - whl: ./dist/*.whl

targets:
  dev:
    workspace:
      host: <workspace-url>
```
Setuptools

This configuration runs the `setup.py` build command and indicates that the `setup.py` file is in the same directory as the `databricks.yml` file.

```yaml
bundle:
  name: my-wheel-bundle

artifacts:
  default:
    type: whl
    build: python3 setup.py bdist_wheel
    path: .

resources:
  jobs:
    wheel-job:
      name: wheel-job
      tasks:
        - task_key: wheel-task
          new_cluster:
            spark_version: 13.3.x-scala2.12
            node_type_id: Standard_DS3_v2
            data_security_mode: USER_ISOLATION
            num_workers: 1
          python_wheel_task:
            entry_point: main
            package_name: my_package
          libraries:
            - whl: ./dist/*.whl

targets:
  dev:
    workspace:
      host: <workspace-url>
```
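With either configuration in place, the workflow from Steps 3 through 5 applies unchanged; the job's resource key in these examples is `wheel-job`:

```
databricks bundle validate
databricks bundle deploy -t dev
databricks bundle run -t dev wheel-job
```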