Databricks Runtime 17.0 (Beta)

Important

Databricks Runtime 17.0 is in Beta. The contents of the supported environments might change during the Beta. Changes can include the list of packages or versions of installed packages.

The following release notes provide information about Databricks Runtime 17.0 (Beta), powered by Apache Spark 4.0.0.

Databricks released this beta version in May 2025.

Tip

To see release notes for Databricks Runtime versions that have reached end-of-support (EoS), see End-of-support Databricks Runtime release notes. The EoS Databricks Runtime versions have been retired and might not be updated.

DBR 17.0 (Beta) new and updated features

SQL procedure support

SQL scripts can now be encapsulated in a procedure stored as a reusable asset in Unity Catalog. You can create a procedure using the CREATE PROCEDURE command, and then call it using the CALL command.

Set a default collation for SQL Functions

The new DEFAULT COLLATION clause in the CREATE FUNCTION command defines the default collation used for STRING parameters, the return type, and STRING literals in the function body.

Recursive common table expressions (rCTE) support

Azure Databricks now supports navigation of hierarchical data using recursive common table expressions (rCTEs). Use a self-referencing CTE with UNION ALL to follow the recursive relationship.
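The anchor-plus-UNION ALL shape described above can be tried locally. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in engine, not Azure Databricks. The employees table, its columns, and the reporting_chain name are hypothetical, and SQLite's WITH RECURSIVE syntax is assumed to be close enough to illustrate the pattern. Check the Azure Databricks SQL reference for the exact syntax and limits.

```python
import sqlite3


def reporting_chain(rows, root_id):
    """Return (id, name, depth) for root_id and everyone below it.

    rows: iterable of (id, name, manager_id) tuples (hypothetical data).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employees (id INTEGER, name TEXT, manager_id INTEGER)"
    )
    conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", rows)
    query = """
        WITH RECURSIVE reporting_chain(id, name, depth) AS (
            -- Anchor member: the starting row of the hierarchy.
            SELECT id, name, 0 FROM employees WHERE id = ?
            UNION ALL
            -- Recursive member: join the CTE to itself to walk one level down.
            SELECT e.id, e.name, rc.depth + 1
            FROM employees AS e
            JOIN reporting_chain AS rc ON e.manager_id = rc.id
        )
        SELECT id, name, depth FROM reporting_chain ORDER BY depth, id
    """
    result = conn.execute(query, (root_id,)).fetchall()
    conn.close()
    return result


# Hypothetical hierarchy: Ada manages Ben and Cy, Ben manages Dee.
EMPLOYEES = [
    (1, "Ada", None),
    (2, "Ben", 1),
    (3, "Cy", 1),
    (4, "Dee", 2),
]
```

Calling reporting_chain(EMPLOYEES, 1) walks the whole hierarchy from the root, while reporting_chain(EMPLOYEES, 2) returns only the subtree under that row.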

ANSI SQL enabled by default

The default SQL dialect is now ANSI SQL. ANSI SQL is a well-established standard and will help protect users from unexpected or incorrect results. Read the Databricks ANSI enablement guide for more information.
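To make "unexpected or incorrect results" concrete: under ANSI mode, an operation such as an invalid string-to-integer cast raises an error instead of quietly producing NULL. The sketch below is plain Python that mimics the two behaviors. It is an illustration only, not Spark code, and the function name cast_to_int and its ansi flag are hypothetical.

```python
from typing import Optional


def cast_to_int(text: str, ansi: bool) -> Optional[int]:
    """Mimic CAST(text AS INT) under the two dialects (illustrative only)."""
    try:
        return int(text.strip())
    except ValueError:
        if ansi:
            # ANSI behavior: surface the bad input as an error.
            raise ValueError(f"cannot cast {text!r} to INT")
        # Non-ANSI behavior: silently fall back to NULL (None here).
        return None
```

With ansi=False a malformed value becomes None and can flow into later aggregates unnoticed. With ansi=True the same input fails at the cast, where the problem is easier to find.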

PySpark and Spark Connect now support the DataFrame df.mergeInto API

PySpark and Spark Connect now support the df.mergeInto API, which was previously only available for Scala.

Support ALL CATALOGS in SHOW SCHEMAS

The SHOW SCHEMAS command now accepts the following syntax:

SHOW SCHEMAS [ { FROM | IN } { catalog_name | ALL CATALOGS } ] [ [ LIKE ] pattern ]

When ALL CATALOGS is specified in a SHOW query, the execution iterates through all active catalogs that support namespaces using the catalog manager (DsV2). For each catalog, it includes the top-level namespaces.

The output attributes and schema of the command have been modified to add a catalog column indicating the catalog of the corresponding namespace. The new column is added to the end of the output attributes, as shown below:

Previous output

| Namespace        |
|------------------|
| test-namespace-1 |
| test-namespace-2 |

New output

| Namespace        | Catalog        |
|------------------|----------------|
| test-namespace-1 | test-catalog-1 |
| test-namespace-2 | test-catalog-2 |
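
The iteration described above can be pictured with a small sketch. This is plain Python, not the engine's implementation. The CATALOGS mapping and the show_schemas_all_catalogs function are hypothetical, and the sample names reuse the ones from the tables above.

```python
from typing import Dict, List, Tuple

# Hypothetical catalogs mapped to their top-level namespaces.
CATALOGS: Dict[str, List[str]] = {
    "test-catalog-1": ["test-namespace-1"],
    "test-catalog-2": ["test-namespace-2"],
}


def show_schemas_all_catalogs(
    catalogs: Dict[str, List[str]],
) -> List[Tuple[str, str]]:
    """Return (namespace, catalog) rows, with the catalog column last."""
    rows = []
    for catalog_name, namespaces in catalogs.items():
        # Each catalog contributes one row per top-level namespace.
        for namespace in namespaces:
            rows.append((namespace, catalog_name))
    return rows
```

Because the catalog column is appended at the end, code that reads the first column by position still sees the namespace.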

Liquid clustering now compacts deletion vectors more efficiently

Delta tables with Liquid clustering now apply physical changes from deletion vectors more efficiently when OPTIMIZE is running. For more details, see Apply changes to Parquet data files.

Allow non-deterministic expressions in UPDATE/INSERT column values for MERGE operations

Azure Databricks now allows the use of non-deterministic expressions in updated and inserted column values of MERGE operations. However, non-deterministic expressions in the conditions of MERGE statements are not supported.

For example, you can now generate dynamic or random values for columns:

MERGE INTO target USING source
ON target.key = source.key
WHEN MATCHED THEN UPDATE SET target.value = source.value + rand()

This can be helpful for data privacy to obfuscate actual data while preserving the data properties (such as mean values or other computed columns).
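
The claim that random noise can hide individual values while keeping aggregates roughly intact can be checked with a small simulation. This sketch is plain Python rather than a MERGE statement, and it makes one assumption beyond the example above: because rand() returns values in [0, 1), the noise is centered by subtracting 0.5 so that its expected value is zero. The function name obfuscate and the sample data are hypothetical.

```python
import random
from statistics import mean


def obfuscate(values, seed=0):
    """Add zero-mean uniform noise in [-0.5, 0.5) to every value."""
    rng = random.Random(seed)  # Seeded so the sketch is reproducible.
    return [v + (rng.random() - 0.5) for v in values]


# Hypothetical column of 10,000 values cycling through 0..99.
original = [float(i % 100) for i in range(10_000)]
masked = obfuscate(original)

# Individual values change, but the column mean stays close to the original.
mean_shift = abs(mean(masked) - mean(original))
```

Without the centering step, adding a value drawn from [0, 1) to every row would be expected to raise the column mean by about 0.5, so whether to center the noise depends on which properties need to be preserved.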

Ignore and rescue empty structs for Auto Loader ingestion (especially Avro)

Auto Loader now rescues Avro data types with an empty schema because Delta tables do not support ingestion of empty struct-type data.

Change Delta MERGE Python and Scala APIs to return DataFrame instead of Unit

The Scala and Python MERGE APIs (such as DeltaMergeBuilder) now also return a DataFrame like the SQL API does, with the same results.

Behavioral changes

DBFS custom CA certificates are no longer supported

As part of the ongoing effort to deprecate data storage in the DBFS root and DBFS mounts, DBFS custom CA certificates are not supported in Databricks Runtime 17.0 and above. For recommendations on working with files, see Work with files on Azure Databricks.

Behavioral change for the Auto Loader incremental directory listing option

The deprecated Auto Loader cloudFiles.useIncrementalListing option now defaults to false. As a result, Auto Loader performs a full directory listing each time it runs. Previously, the default value of the cloudFiles.useIncrementalListing option was auto, which instructed Auto Loader to make a best-effort attempt at detecting whether an incremental listing can be used with a directory.

Databricks recommends against using this option. Instead, use file notification mode with file events. If you want to continue to use the incremental listing feature, set cloudFiles.useIncrementalListing to auto in your code. When you set this value to auto, Auto Loader makes a best-effort attempt to do a full listing once every seven incremental listings, which matches the behavior of this option before this change.
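
As a sketch of the opt-in described above, the option can be passed alongside the other Auto Loader options. The helper name auto_loader_options and the choice of json as the file format are hypothetical. Only cloudFiles.useIncrementalListing and its auto and false values come from this note.

```python
from typing import Dict


def auto_loader_options(
    file_format: str, keep_incremental_listing: bool = False
) -> Dict[str, str]:
    """Build an options dict for an Auto Loader stream."""
    options = {"cloudFiles.format": file_format}
    if keep_incremental_listing:
        # Opt back in explicitly to the previous default value.
        options["cloudFiles.useIncrementalListing"] = "auto"
    return options


opts = auto_loader_options("json", keep_incremental_listing=True)
# In a notebook, these could be passed to a stream reader, for example:
#   spark.readStream.format("cloudFiles").options(**opts).load(<path>)
# where <path> is a placeholder for the source directory.
```

Leaving keep_incremental_listing at its default omits the option entirely, so the stream picks up the new default of false.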

To learn more about Auto Loader directory listing, see Auto Loader streams with directory listing mode.

Removed the "True cache misses" section in Spark UI

This change removes support for the "Cache true misses size" metric (for both compressed and uncompressed caches). The "Cache writes misses" metric measures the same information.

Use numLocalScanTasks as a proxy for this metric when you want to see how the cache performs when files are assigned to the right executor.

Removed the "Cache Metadata Manager Peak Disk Usage" metric in the Spark UI

This change removes support for the cacheLocalityMgrDiskUsageInBytes and cacheLocalityMgrTimeMs metrics from the Databricks Runtime and the Spark UI.

Removed the "Rescheduled cache miss bytes" section in the Spark UI

This change removes the cache rescheduled misses size and cache rescheduled misses size (uncompressed) metrics from Databricks Runtime. These metrics measured how the cache performs when files are assigned to non-preferred executors. Use numNonLocalScanTasks as a proxy for them.

CREATE VIEW column-level clauses now throw errors when the clause would only apply to materialized views

CREATE VIEW commands that specify a column-level clause valid only for MATERIALIZED VIEWs now throw an error. The affected clauses for CREATE VIEW commands are:

  • NOT NULL
  • A specified datatype, such as FLOAT or STRING
  • DEFAULT
  • COLUMN MASK

Library upgrades

  • Upgraded Python libraries:

    • azure-core from 1.31.0 to 1.34.0
    • black from 24.4.2 to 24.10.0
    • boto3 from 1.34.69 to 1.36.2
    • botocore from 1.34.69 to 1.36.3
    • cachetools from 5.3.3 to 5.5.1
    • certifi from 2024.6.2 to 2025.1.31
    • cffi from 1.16.0 to 1.17.1
    • charset-normalizer from 2.0.4 to 3.3.2
    • cloudpickle from 2.2.1 to 3.0.0
    • contourpy from 1.2.0 to 1.3.1
    • cryptography from 42.0.5 to 43.0.3
    • Cython from 3.0.11 to 3.0.12
    • databricks-sdk from 0.30.0 to 0.49.0
    • debugpy from 1.6.7 to 1.8.11
    • Deprecated from 1.2.14 to 1.2.13
    • distlib from 0.3.8 to 0.3.9
    • filelock from 3.15.4 to 3.18.0
    • fonttools from 4.51.0 to 4.55.3
    • GitPython from 3.1.37 to 3.1.43
    • google-auth from 2.35.0 to 2.40.0
    • google-cloud-core from 2.4.1 to 2.4.3
    • google-cloud-storage from 2.18.2 to 3.1.0
    • google-crc32c from 1.6.0 to 1.7.1
    • grpcio from 1.60.0 to 1.67.0
    • grpcio-status from 1.60.0 to 1.67.0
    • importlib-metadata from 6.0.0 to 6.6.0
    • ipyflow-core from 0.0.201 to 0.0.209
    • ipykernel from 6.28.0 to 6.29.5
    • ipython from 8.25.0 to 8.30.0
    • ipywidgets from 7.7.2 to 7.8.1
    • jedi from 0.19.1 to 0.19.2
    • jupyter_client from 8.6.0 to 8.6.3
    • kiwisolver from 1.4.4 to 1.4.8
    • matplotlib from 3.8.4 to 3.10.0
    • matplotlib-inline from 0.1.6 to 0.1.7
    • mlflow-skinny from 2.19.0 to 2.22.0
    • numpy from 1.26.4 to 2.1.3
    • opentelemetry-api from 1.27.0 to 1.32.1
    • opentelemetry-sdk from 1.27.0 to 1.32.1
    • opentelemetry-semantic-conventions from 0.48b0 to 0.53b1
    • pandas from 1.5.3 to 2.2.3
    • parso from 0.8.3 to 0.8.4
    • patsy from 0.5.6 to 1.0.1
    • pillow from 10.3.0 to 11.1.0
    • plotly from 5.22.0 to 5.24.1
    • pluggy from 1.0.0 to 1.5.0
    • proto-plus from 1.24.0 to 1.26.1
    • protobuf from 4.24.1 to 5.29.4
    • pyarrow from 15.0.2 to 19.0.1
    • pyccolo from 0.0.65 to 0.0.71
    • pydantic from 2.8.2 to 2.10.6
    • pydantic_core from 2.20.1 to 2.27.2
    • PyJWT from 2.7.0 to 2.10.1
    • pyodbc from 5.0.1 to 5.2.0
    • pyparsing from 3.0.9 to 3.2.0
    • pyright from 1.1.294 to 1.1.394
    • python-lsp-server from 1.10.0 to 1.12.0
    • PyYAML from 6.0.1 to 6.0.2
    • pyzmq from 25.1.2 to 26.2.0
    • requests from 2.32.2 to 2.32.3
    • rsa from 4.9 to 4.9.1
    • s3transfer from 0.10.2 to 0.11.3
    • scikit-learn from 1.4.2 to 1.6.1
    • scipy from 1.13.1 to 1.15.1
    • sqlparse from 0.5.1 to 0.5.3
    • statsmodels from 0.14.2 to 0.14.4
    • tenacity from 8.2.2 to 9.0.0
    • threadpoolctl from 2.2.0 to 3.5.0
    • tornado from 6.4.1 to 6.4.2
    • typing_extensions from 4.11.0 to 4.12.2
    • urllib3 from 1.26.16 to 2.3.0
    • virtualenv from 20.26.2 to 20.29.3
    • wheel from 0.43.0 to 0.45.1
    • wrapt from 1.14.1 to 1.17.0
    • yapf from 0.33.0 to 0.40.2
    • zipp from 3.17.0 to 3.21.0
  • Upgraded R libraries:

    • arrow from 16.1.0 to 19.0.1
    • askpass from 1.2.0 to 1.2.1
    • base from 4.4.0 to 4.4.2
    • bigD from 0.2.0 to 0.3.0
    • bit from 4.0.5 to 4.6.0
    • bit64 from 4.0.5 to 4.6.0-1
    • bitops from 1.0-8 to 1.0-9
    • broom from 1.0.6 to 1.0.7
    • bslib from 0.8.0 to 0.9.0
    • caret from 6.0-94 to 7.0-1
    • chron from 2.3-61 to 2.3-62
    • cli from 3.6.3 to 3.6.4
    • clock from 0.7.1 to 0.7.2
    • commonmark from 1.9.1 to 1.9.5
    • compiler from 4.4.0 to 4.4.2
    • cpp11 from 0.4.7 to 0.5.2
    • credentials from 2.0.1 to 2.0.2
    • curl from 5.2.1 to 6.2.1
    • data.table from 1.15.4 to 1.17.0
    • datasets from 4.4.0 to 4.4.2
    • digest from 0.6.36 to 0.6.37
    • e1071 from 1.7-14 to 1.7-16
    • evaluate from 0.24.0 to 1.0.3
    • fontawesome from 0.5.2 to 0.5.3
    • fs from 1.6.4 to 1.6.5
    • future.apply from 1.11.2 to 1.11.3
    • gert from 2.1.0 to 2.1.4
    • git2r from 0.33.0 to 0.35.0
    • glue from 1.7.0 to 1.8.0
    • gower from 1.0.1 to 1.0.2
    • graphics from 4.4.0 to 4.4.2
    • grDevices from 4.4.0 to 4.4.2
    • grid from 4.4.0 to 4.4.2
    • gt from 0.11.0 to 0.11.1
    • gtable from 0.3.5 to 0.3.6
    • hardhat from 1.4.0 to 1.4.1
    • httr2 from 1.0.2 to 1.1.1
    • jsonlite from 1.8.8 to 1.9.1
    • knitr from 1.48 to 1.50
    • later from 1.3.2 to 1.4.1
    • lava from 1.8.0 to 1.8.1
    • lubridate from 1.9.3 to 1.9.4
    • methods from 4.4.0 to 4.4.2
    • mime from 0.12 to 0.13
    • mlflow from 2.14.1 to 2.20.4
    • nlme from 3.1-165 to 3.1-164
    • openssl from 2.2.0 to 2.3.2
    • parallel from 4.4.0 to 4.4.2
    • parallelly from 1.38.0 to 1.42.0
    • pillar from 1.9.0 to 1.10.1
    • pkgbuild from 1.4.4 to 1.4.6
    • pkgdown from 2.1.0 to 2.1.1
    • processx from 3.8.4 to 3.8.6
    • profvis from 0.3.8 to 0.4.0
    • progressr from 0.14.0 to 0.15.1
    • promises from 1.3.0 to 1.3.2
    • ps from 1.7.7 to 1.9.0
    • purrr from 1.0.2 to 1.0.4
    • R6 from 2.5.1 to 2.6.1
    • ragg from 1.3.2 to 1.3.3
    • randomForest from 4.7-1.1 to 4.7-1.2
    • Rcpp from 1.0.13 to 1.0.14
    • RcppEigen from 0.3.4.0.0 to 0.3.4.0.2
    • reactR from 0.6.0 to 0.6.1
    • readxl from 1.4.3 to 1.4.5
    • recipes from 1.1.0 to 1.2.0
    • rlang from 1.1.4 to 1.1.5
    • rmarkdown from 2.27 to 2.29
    • RODBC from 1.3-23 to 1.3-26
    • Rserve from 1.8-13 to 1.8-15
    • RSQLite from 2.3.7 to 2.3.9
    • rstudioapi from 0.16.0 to 0.17.1
    • sessioninfo from 1.2.2 to 1.2.3
    • shiny from 1.9.1 to 1.10.0
    • sparklyr from 1.8.6 to 1.9.0
    • SparkR from 3.5.2 to 4.0.0
    • splines from 4.4.0 to 4.4.2
    • stats from 4.4.0 to 4.4.2
    • stats4 from 4.4.0 to 4.4.2
    • survival from 3.6-4 to 3.5-8
    • sys from 3.4.2 to 3.4.3
    • systemfonts from 1.1.0 to 1.2.1
    • tcltk from 4.4.0 to 4.4.2
    • testthat from 3.2.1.1 to 3.2.3
    • textshaping from 0.4.0 to 1.0.0
    • timeDate from 4032.109 to 4041.110
    • tinytex from 0.52 to 0.56
    • tools from 4.4.0 to 4.4.2
    • tzdb from 0.4.0 to 0.5.0
    • usethis from 3.0.0 to 3.1.0
    • utils from 4.4.0 to 4.4.2
    • V8 from 4.4.2 to 6.0.2
    • waldo from 0.5.2 to 0.6.1
    • withr from 3.0.1 to 3.0.2
    • xfun from 0.46 to 0.51
    • xml2 from 1.3.6 to 1.3.8
    • zip from 2.3.1 to 2.3.2
  • Upgraded Java libraries:

    • com.clearspring.analytics.stream from 2.9.6 to 2.9.8
    • com.esotericsoftware.kryo-shaded from 4.0.2 to 4.0.3
    • com.fasterxml.classmate from 1.3.4 to 1.5.1
    • com.fasterxml.jackson.core.jackson-annotations from 2.15.2 to 2.18.2
    • com.fasterxml.jackson.core.jackson-core from 2.15.2 to 2.18.2
    • com.fasterxml.jackson.core.jackson-databind from 2.15.2 to 2.18.2
    • com.fasterxml.jackson.dataformat.jackson-dataformat-cbor from 2.15.2 to 2.18.2
    • com.fasterxml.jackson.datatype.jackson-datatype-joda from 2.15.2 to 2.18.2
    • com.fasterxml.jackson.datatype.jackson-datatype-jsr310 from 2.16.0 to 2.18.2
    • com.fasterxml.jackson.module.jackson-module-paranamer from 2.15.2 to 2.18.2
    • com.github.luben.zstd-jni from 1.5.5-4 to 1.5.6-10
    • com.google.code.gson.gson from 2.10.1 to 2.11.0
    • com.google.crypto.tink.tink from 1.9.0 to 1.16.0
    • com.google.errorprone.error_prone_annotations from 2.10.0 to 2.36.0
    • com.google.flatbuffers.flatbuffers-java from 23.5.26 to 24.3.25
    • com.google.guava.guava from 15.0 to 33.4.0-jre
    • com.google.protobuf.protobuf-java from 3.25.1 to 3.25.5
    • com.microsoft.azure.azure-data-lake-store-sdk from 2.3.9 to 2.3.10
    • com.microsoft.sqlserver.mssql-jdbc from 11.2.3.jre8 to 12.8.0.jre8
    • commons-cli.commons-cli from 1.5.0 to 1.9.0
    • commons-codec.commons-codec from 1.16.0 to 1.17.2
    • commons-io.commons-io from 2.13.0 to 2.18.0
    • io.airlift.aircompressor from 0.27 to 2.0.2
    • io.dropwizard.metrics.metrics-annotation from 4.2.19 to 4.2.30
    • io.dropwizard.metrics.metrics-core from 4.2.19 to 4.2.30
    • io.dropwizard.metrics.metrics-graphite from 4.2.19 to 4.2.30
    • io.dropwizard.metrics.metrics-healthchecks from 4.2.19 to 4.2.30
    • io.dropwizard.metrics.metrics-jetty9 from 4.2.19 to 4.2.30
    • io.dropwizard.metrics.metrics-jmx from 4.2.19 to 4.2.30
    • io.dropwizard.metrics.metrics-json from 4.2.19 to 4.2.30
    • io.dropwizard.metrics.metrics-jvm from 4.2.19 to 4.2.30
    • io.dropwizard.metrics.metrics-servlets from 4.2.19 to 4.2.30
    • io.netty.netty-all from 4.1.108.Final to 4.1.118.Final
    • io.netty.netty-buffer from 4.1.108.Final to 4.1.118.Final
    • io.netty.netty-codec from 4.1.108.Final to 4.1.118.Final
    • io.netty.netty-codec-http from 4.1.108.Final to 4.1.118.Final
    • io.netty.netty-codec-http2 from 4.1.108.Final to 4.1.118.Final
    • io.netty.netty-codec-socks from 4.1.108.Final to 4.1.118.Final
    • io.netty.netty-common from 4.1.108.Final to 4.1.118.Final
    • io.netty.netty-handler from 4.1.108.Final to 4.1.118.Final
    • io.netty.netty-handler-proxy from 4.1.108.Final to 4.1.118.Final
    • io.netty.netty-resolver from 4.1.108.Final to 4.1.118.Final
    • io.netty.netty-tcnative-boringssl-static from 2.0.61.Final-db-r16-windows-x86_64 to 2.0.70.Final-db-r0-windows-x86_64
    • io.netty.netty-tcnative-classes from 2.0.61.Final to 2.0.70.Final
    • io.netty.netty-transport from 4.1.108.Final to 4.1.118.Final
    • io.netty.netty-transport-classes-epoll from 4.1.108.Final to 4.1.118.Final
    • io.netty.netty-transport-classes-kqueue from 4.1.108.Final to 4.1.118.Final
    • io.netty.netty-transport-native-epoll from 4.1.108.Final-linux-x86_64 to 4.1.118.Final-linux-x86_64
    • io.netty.netty-transport-native-kqueue from 4.1.108.Final-osx-x86_64 to 4.1.118.Final-osx-x86_64
    • io.netty.netty-transport-native-unix-common from 4.1.108.Final to 4.1.118.Final
    • io.prometheus.jmx.collector from 0.12.0 to 0.18.0
    • io.prometheus.simpleclient from 0.7.0 to 0.16.1-databricks
    • io.prometheus.simpleclient_common from 0.7.0 to 0.16.1-databricks
    • io.prometheus.simpleclient_dropwizard from 0.7.0 to 0.16.1-databricks
    • io.prometheus.simpleclient_pushgateway from 0.7.0 to 0.16.1-databricks
    • io.prometheus.simpleclient_servlet from 0.7.0 to 0.16.1-databricks
    • joda-time.joda-time from 2.12.1 to 2.13.0
    • net.razorvine.pickle from 1.3 to 1.5
    • org.antlr.antlr4-runtime from 4.9.3 to 4.13.1
    • org.apache.arrow.arrow-format from 15.0.0 to 18.2.0
    • org.apache.arrow.arrow-memory-core from 15.0.0 to 18.2.0
    • org.apache.arrow.arrow-memory-netty from 15.0.0 to 18.2.0
    • org.apache.arrow.arrow-vector from 15.0.0 to 18.2.0
    • org.apache.avro.avro from 1.11.3 to 1.12.0
    • org.apache.avro.avro-ipc from 1.11.3 to 1.12.0
    • org.apache.avro.avro-mapred from 1.11.3 to 1.12.0
    • org.apache.commons.commons-compress from 1.23.0 to 1.27.1
    • org.apache.commons.commons-lang3 from 3.12.0 to 3.17.0
    • org.apache.commons.commons-text from 1.10.0 to 1.13.0
    • org.apache.curator.curator-client from 2.13.0 to 5.7.1
    • org.apache.curator.curator-framework from 2.13.0 to 5.7.1
    • org.apache.curator.curator-recipes from 2.13.0 to 5.7.1
    • org.apache.datasketches.datasketches-java from 3.1.0 to 6.1.1
    • org.apache.datasketches.datasketches-memory from 2.0.0 to 3.0.2
    • org.apache.hadoop.hadoop-client-runtime from 3.3.6 to 3.4.1
    • org.apache.hive.hive-beeline from 2.3.9 to 2.3.10
    • org.apache.hive.hive-cli from 2.3.9 to 2.3.10
    • org.apache.hive.hive-jdbc from 2.3.9 to 2.3.10
    • org.apache.hive.hive-llap-client from 2.3.9 to 2.3.10
    • org.apache.hive.hive-llap-common from 2.3.9 to 2.3.10
    • org.apache.hive.hive-serde from 2.3.9 to 2.3.10
    • org.apache.hive.hive-shims from 2.3.9 to 2.3.10
    • org.apache.hive.shims.hive-shims-0.23 from 2.3.9 to 2.3.10
    • org.apache.hive.shims.hive-shims-common from 2.3.9 to 2.3.10
    • org.apache.hive.shims.hive-shims-scheduler from 2.3.9 to 2.3.10
    • org.apache.ivy.ivy from 2.5.2 to 2.5.3
    • org.apache.logging.log4j.log4j-1.2-api from 2.22.1 to 2.24.3
    • org.apache.logging.log4j.log4j-api from 2.22.1 to 2.24.3
    • org.apache.logging.log4j.log4j-core from 2.22.1 to 2.24.3
    • org.apache.logging.log4j.log4j-layout-template-json from 2.22.1 to 2.24.3
    • org.apache.logging.log4j.log4j-slf4j2-impl from 2.22.1 to 2.24.3
    • org.apache.orc.orc-core from 1.9.2-shaded-protobuf to 2.1.1-shaded-protobuf
    • org.apache.orc.orc-mapreduce from 1.9.2-shaded-protobuf to 2.1.1-shaded-protobuf
    • org.apache.orc.orc-shims from 1.9.2 to 2.1.1
    • org.apache.thrift.libthrift from 0.12.0 to 0.16.0
    • org.apache.ws.xmlschema.xmlschema-core from 2.3.0 to 2.3.1
    • org.apache.xbean.xbean-asm9-shaded from 4.23 to 4.26
    • org.apache.zookeeper.zookeeper from 3.9.2 to 3.9.3
    • org.apache.zookeeper.zookeeper-jute from 3.9.2 to 3.9.3
    • org.checkerframework.checker-qual from 3.31.0 to 3.43.0
    • org.eclipse.jetty.jetty-client from 9.4.52.v20230823 to 9.4.53.v20231009
    • org.eclipse.jetty.jetty-continuation from 9.4.52.v20230823 to 9.4.53.v20231009
    • org.eclipse.jetty.jetty-http from 9.4.52.v20230823 to 9.4.53.v20231009
    • org.eclipse.jetty.jetty-io from 9.4.52.v20230823 to 9.4.53.v20231009
    • org.eclipse.jetty.jetty-jndi from 9.4.52.v20230823 to 9.4.53.v20231009
    • org.eclipse.jetty.jetty-plus from 9.4.52.v20230823 to 9.4.53.v20231009
    • org.eclipse.jetty.jetty-proxy from 9.4.52.v20230823 to 9.4.53.v20231009
    • org.eclipse.jetty.jetty-security from 9.4.52.v20230823 to 9.4.53.v20231009
    • org.eclipse.jetty.jetty-server from 9.4.52.v20230823 to 9.4.53.v20231009
    • org.eclipse.jetty.jetty-servlet from 9.4.52.v20230823 to 9.4.53.v20231009
    • org.eclipse.jetty.jetty-servlets from 9.4.52.v20230823 to 9.4.53.v20231009
    • org.eclipse.jetty.jetty-util from 9.4.52.v20230823 to 9.4.53.v20231009
    • org.eclipse.jetty.jetty-util-ajax from 9.4.52.v20230823 to 9.4.53.v20231009
    • org.eclipse.jetty.jetty-webapp from 9.4.52.v20230823 to 9.4.53.v20231009
    • org.eclipse.jetty.jetty-xml from 9.4.52.v20230823 to 9.4.53.v20231009
    • org.eclipse.jetty.websocket.websocket-api from 9.4.52.v20230823 to 9.4.53.v20231009
    • org.eclipse.jetty.websocket.websocket-client from 9.4.52.v20230823 to 9.4.53.v20231009
    • org.eclipse.jetty.websocket.websocket-common from 9.4.52.v20230823 to 9.4.53.v20231009
    • org.eclipse.jetty.websocket.websocket-server from 9.4.52.v20230823 to 9.4.53.v20231009
    • org.eclipse.jetty.websocket.websocket-servlet from 9.4.52.v20230823 to 9.4.53.v20231009
    • org.glassfish.jersey.containers.jersey-container-servlet from 2.40 to 2.41
    • org.glassfish.jersey.containers.jersey-container-servlet-core from 2.40 to 2.41
    • org.glassfish.jersey.core.jersey-client from 2.40 to 2.41
    • org.glassfish.jersey.core.jersey-common from 2.40 to 2.41
    • org.glassfish.jersey.core.jersey-server from 2.40 to 2.41
    • org.glassfish.jersey.inject.jersey-hk2 from 2.40 to 2.41
    • org.hibernate.validator.hibernate-validator from 6.1.7.Final to 6.2.5.Final
    • org.jboss.logging.jboss-logging from 3.3.2.Final to 3.4.1.Final
    • org.objenesis.objenesis from 2.5.1 to 3.3
    • org.roaringbitmap.RoaringBitmap from 0.9.45-databricks to 1.2.1
    • org.rocksdb.rocksdbjni from 9.2.1 to 9.8.4
    • org.scalatest.scalatest-compatible from 3.2.16 to 3.2.19
    • org.slf4j.jcl-over-slf4j from 2.0.7 to 2.0.16
    • org.slf4j.jul-to-slf4j from 2.0.7 to 2.0.16
    • org.slf4j.slf4j-api from 2.0.7 to 2.0.16
    • org.threeten.threeten-extra from 1.7.1 to 1.8.0
    • org.tukaani.xz from 1.9 to 1.10

Apache Spark

Databricks Runtime 17.0 (Beta) is powered by Apache Spark 4.0.0. Many of its features were already available in Databricks Runtime 14.x, 15.x, and 16.x, and now they ship out of the box with Databricks Runtime 17.0.

Core and Spark SQL highlights

Spark Core

Spark SQL

Features

Functions

Query optimization

  • [SPARK-46946] Supporting broadcast of multiple filtering keys in DynamicPruning
  • [SPARK-48445] Don’t inline UDFs with expansive children
  • [SPARK-41413] Avoid shuffle in Storage-Partitioned Join when partition keys mismatch, but expressions are compatible
  • [SPARK-46941] Prevent insertion of window group limit node with SizeBasedWindowFunction
  • [SPARK-46707] Add throwable field to expressions to improve predicate pushdown
  • [SPARK-47511] Canonicalize WITH expressions by reassigning IDs
  • [SPARK-46502] Support timestamp types in UnwrapCastInBinaryComparison
  • [SPARK-46069] Support unwrap timestamp type to date type
  • [SPARK-46219] Unwrap cast in join predicates
  • [SPARK-45606] Release restrictions on multi-layer runtime filter
  • [SPARK-45909] Remove NumericType cast if it can safely up-cast in IsNotNull

Query execution

  • [SPARK-45592][SPARK-45282] Correctness issue in AQE with InMemoryTableScanExec
  • [SPARK-50258] Fix output column order changed issue after AQE
  • [SPARK-46693] Inject LocalLimitExec when matching OffsetAndLimit or LimitAndOffset
  • [SPARK-48873] Use UnsafeRow in JSON parser
  • [SPARK-41471] Reduce Spark shuffle when only one side of a join is KeyGroupedPartitioning
  • [SPARK-45452] Improve InMemoryFileIndex to use FileSystem.listFiles API
  • [SPARK-48649] Add ignoreInvalidPartitionPaths configs for skipping invalid partition paths
  • [SPARK-45882] BroadcastHashJoinExec propagate partitioning should respect CoalescedHashPartitioning

Spark Connectors

DS v2 framework support changes

Hive Catalog support changes

XML support changes

CSV support changes

  • [SPARK-46862] Disable CSV column pruning in multi-line mode
  • [SPARK-46890] Fix CSV parsing bug with default values and column pruning
  • [SPARK-50616] Add File Extension Option to CSV DataSource Writer
  • [SPARK-49125] Allow duplicated column names in CSV writing
  • [SPARK-49016] Restore behavior for queries from raw CSV files
  • [SPARK-48807] Binary support for CSV datasource
  • [SPARK-48602] Make csv generator support different output style via spark.sql.binaryOutputStyle

ORC support changes

Avro support changes

JDBC changes

Other notable changes

  • [SPARK-45905] Least common type between decimal types should retain integral digits first
  • [SPARK-45786] Fix inaccurate Decimal multiplication and division results
  • [SPARK-50705] Make QueryPlan lock-free
  • [SPARK-46743] Fix corner-case with COUNT + constant folding subquery
  • [SPARK-47509] Block subquery expressions in lambda/higher-order functions for correctness
  • [SPARK-48498] Always do char padding in predicates
  • [SPARK-45915] Treat decimal(x, 0) the same as IntegralType in PromoteStrings
  • [SPARK-46220] Restrict charsets in decode()
  • [SPARK-45816] Return NULL when overflowing during casting from timestamp to integers
  • [SPARK-45586] Reduce compiler latency for plans with large expression trees
  • [SPARK-45507] Correctness fix for nested correlated scalar subqueries with COUNT aggregates
  • [SPARK-44550] Enable correctness fixes for null IN (empty list) under ANSI
  • [SPARK-47911] Introduces a universal BinaryFormatter to make binary output consistent

PySpark

Below are the changes and improvements made to the PySpark libraries shipping in Databricks Runtime 17.0 (Beta).

Highlights

DataFrame APIs features

  • [SPARK-51079] Support large variable types in pandas UDF, createDataFrame and toPandas with Arrow
  • [SPARK-50718] Support addArtifact(s) for PySpark
  • [SPARK-50778] Add metadataColumn to PySpark DataFrame
  • [SPARK-50719] Support interruptOperation for PySpark
  • [SPARK-50790] Implement parse_json in PySpark
  • [SPARK-49306] Create SQL function aliases for zeroifnull and nullifzero
  • [SPARK-50132] Add DataFrame API for Lateral Joins
  • [SPARK-43295] Support string type columns for DataFrameGroupBy.sum
  • [SPARK-45575] Support time travel options for df.read API
  • [SPARK-45755] Improve Dataset.isEmpty() by applying global limit 1
    • Improves performance of isEmpty() by pushing down a global limit of 1.
  • [SPARK-48761] Introduce clusterBy DataFrameWriter API for Scala
  • [SPARK-45929] Support groupingSets operation in DataFrame API
    • Extends groupingSets(...) to DataFrame/DS-level APIs.
  • [SPARK-40178] Support coalesce hints with ease for PySpark and R

Pandas API on Spark features

Other notable PySpark changes

Spark Streaming

Below are the changes and improvements made to Spark Streaming in Databricks Runtime 17.0 (Beta).

Highlights

Other notable streaming changes

  • [SPARK-44865] Make StreamingRelationV2 support metadata column
  • [SPARK-45080] Explicitly call out support for columnar in DSv2 streaming data sources
  • [SPARK-45178] Fallback to execute a single batch for Trigger.AvailableNow with unsupported sources
  • [SPARK-45415] Allow selective disabling of "fallocate" in RocksDB statestore
  • [SPARK-45503] Add Conf to Set RocksDB Compression
  • [SPARK-45511] State Data Source - Reader
  • [SPARK-45558] Introduce a metadata file for streaming stateful operator
  • [SPARK-45794] Introduce state metadata source to query the streaming state metadata information
  • [SPARK-45815] Provide an interface for other Streaming sources to add _metadata columns
  • [SPARK-45845] Add number of evicted state rows to streaming UI
  • [SPARK-46641] Add maxBytesPerTrigger threshold
  • [SPARK-46816] Add base support for new arbitrary state management operator (multiple state variables/column families)
  • [SPARK-46865] Add Batch Support for TransformWithState Operator
  • [SPARK-46906] Add a check for stateful operator change for streaming
  • [SPARK-46961] Use ProcessorContext to store and retrieve handle
  • [SPARK-46962] Add interface for Python streaming data source & worker
  • [SPARK-47107] Partition reader for Python streaming data sources
  • [SPARK-47273] Python data stream writer interface
  • [SPARK-47553] Add Java support for transformWithState operator APIs
  • [SPARK-47653] Add support for negative numeric types and range scan key encoder
  • [SPARK-47733] Add custom metrics for transformWithState operator part of query progress
  • [SPARK-47960] Allow chaining other stateful operators after transformWithState
  • [SPARK-48447] Check StateStoreProvider class before constructor
  • [SPARK-48569] Handle edge cases in query.name for streaming queries
  • [SPARK-48589] Add snapshotStartBatchId / snapshotPartitionId for state data source (see SQL)
  • [SPARK-48589] Add snapshotStartBatchId / snapshotPartitionId options to state data source
  • [SPARK-48726] Create StateSchemaV3 file for TransformWithStateExec
  • [SPARK-48742] Virtual Column Family for RocksDB (arbitrary stateful API v2)
  • [SPARK-48755] transformWithState pyspark base implementation and ValueState support
  • [SPARK-48772] State Data Source Change Feed Reader Mode
  • [SPARK-48836] Integrate SQL schema with state schema/metadata for TWS operator
  • [SPARK-48849] Create OperatorStateMetadataV2 for TransformWithStateExec operator
  • [SPARK-48901][SPARK-48916] Introduce clusterBy DataStreamWriter API in Scala/PySpark
  • [SPARK-48931] Reduce Cloud Store List API cost for state-store maintenance
  • [SPARK-49021] Add support for reading transformWithState value state variables with state data source reader
  • [SPARK-49048] Add support for reading operator metadata at given batch id
  • [SPARK-49191] Read transformWithState map state with state data source
  • [SPARK-49259] Size-based partition creation during Kafka read
  • [SPARK-49411] Communicate State Store Checkpoint ID
  • [SPARK-49463] ListState support in TransformWithStateInPandas
  • [SPARK-49467] Add state data source reader for list state
  • [SPARK-49513] Add timer support in transformWithStateInPandas
  • [SPARK-49630] Add flatten option for collection types in state data source reader
  • [SPARK-49656] Support state variables with value state collection types
  • [SPARK-49676] Chaining of operators in transformWithStateInPandas
  • [SPARK-49699] Disable PruneFilters for streaming workloads
  • [SPARK-49744] TTL support for ListState in TransformWithStateInPandas
  • [SPARK-49745] Read registered timers in transformWithState
  • [SPARK-49802] Add support for read change feed for map/list types
  • [SPARK-49846] Add numUpdatedStateRows/numRemovedStateRows metrics
  • [SPARK-49883] State Store Checkpoint Structure V2 Integration with RocksDB and RocksDBFileManager
  • [SPARK-50017] Support Avro encoding for TransformWithState operator
  • [SPARK-50035] Explicit handleExpiredTimer function in the stateful processor
  • [SPARK-50128] Add handle APIs using implicit encoders
  • [SPARK-50152] Support handleInitialState with state data source reader
  • [SPARK-50194] Integration of New Timer API and Initial State API
  • [SPARK-50378] Add custom metric for time spent populating initial state
  • [SPARK-50428] Support TransformWithStateInPandas in batch queries
  • [SPARK-50573] Adding State Schema ID to State Rows for schema evolution
  • [SPARK-50714] Enable schema evolution for TransformWithState with Avro encoding

Spark ML

Spark UX

Other notable Spark UX changes

Spark Connect

Below are the changes and improvements made to Spark Connect in Databricks Runtime 17.0 (Beta).

Highlights

  • [SPARK-49248] Scala Client Parity with existing Dataset/DataFrame API
  • [SPARK-48918] Create a unified SQL Scala interface shared by regular SQL and Connect
  • [SPARK-50812] Support pyspark.ml on Connect
  • [SPARK-47908] Parent classes for Spark Connect and Spark Classic

Other Spark Connect changes and improvements

  • [SPARK-41065] Implement DataFrame.freqItems and DataFrame.stat.freqItems
  • [SPARK-41066] Implement DataFrame.sampleBy and DataFrame.stat.sampleBy
  • [SPARK-41067] Implement DataFrame.stat.cov
  • [SPARK-41068] Implement DataFrame.stat.corr
  • [SPARK-41069] Implement DataFrame.approxQuantile and DataFrame.stat.approxQuantile
  • [SPARK-41292][SPARK-41640][SPARK-41641] Implement Window functions
  • [SPARK-41333][SPARK-41737] Implement GroupedData.{min, max, avg, sum}
  • [SPARK-41364] Implement broadcast function
  • [SPARK-41383][SPARK-41692][SPARK-41693] Implement rollup, cube, and pivot
  • [SPARK-41434] Initial LambdaFunction implementation
  • [SPARK-41440] Implement DataFrame.randomSplit
  • [SPARK-41464] Implement DataFrame.to
  • [SPARK-41473] Implement format_number function
  • [SPARK-41503] Implement Partition Transformation Functions
  • [SPARK-41529] Implement SparkSession.stop
  • [SPARK-41534] Setup initial client module for Spark Connect
  • [SPARK-41629] Support for Protocol Extensions in Relation and Expression
  • [SPARK-41663] Implement the rest of Lambda functions
  • [SPARK-41673] Implement Column.astype
  • [SPARK-41690] Agnostic Encoders
  • [SPARK-41707] Implement Catalog API in Spark Connect
  • [SPARK-41710] Implement Column.between
  • [SPARK-41722] Implement 3 missing time window functions
  • [SPARK-41723] Implement sequence function
  • [SPARK-41724] Implement call_udf function
  • [SPARK-41728] Implement unwrap_udt function
  • [SPARK-41731] Implement the column accessor (getItem, getField, getitem, etc.)
  • [SPARK-41738] Mix ClientId in SparkSession cache
  • [SPARK-41740] Implement Column.name
  • [SPARK-41767] Implement Column.{withField, dropFields}
  • [SPARK-41785] Implement GroupedData.mean
  • [SPARK-41803] Add missing function log(arg1, arg2)
  • [SPARK-41810] Infer names from a list of dictionaries in SparkSession.createDataFrame
  • [SPARK-41811] Implement SQLStringFormatter with WithRelations
  • [SPARK-42664] Support bloomFilter function for DataFrameStatFunctions
  • [SPARK-43662] Support merge_asof in Spark Connect
  • [SPARK-43704] Support MultiIndex for to_series() in Spark Connect
  • [SPARK-44625] SparkConnectExecutionManager to track all executions
  • [SPARK-44731] Make TimestampNTZ work with literals in Python Spark Connect
  • [SPARK-44736] Add Dataset.explode to Spark Connect Scala Client
  • [SPARK-44740] Support specifying session_id in SPARK_REMOTE connection string
  • [SPARK-44747] Add missing SparkSession.Builder methods
  • [SPARK-44750] Apply configuration to SparkSession during creation
  • [SPARK-44761] Support DataStreamWriter.foreachBatch(VoidFunction2)
  • [SPARK-44788] Add from_xml and schema_of_xml to pyspark, Spark Connect, and SQL functions
  • [SPARK-44807] Add Dataset.metadataColumn to Scala Client
  • [SPARK-44877] Support python protobuf functions for Spark Connect
  • [SPARK-45000] Implement DataFrame.foreach
  • [SPARK-45001] Implement DataFrame.foreachPartition
  • [SPARK-45088] Make getitem work with duplicated columns
  • [SPARK-45090] DataFrame.{cube, rollup} support column ordinals
  • [SPARK-45091] Function floor/round/bround now accept Column type scale
  • [SPARK-45121] Support Series.empty for Spark Connect
  • [SPARK-45136] Enhance ClosureCleaner with Ammonite support
  • [SPARK-45137] Support map/array parameters in parameterized sql()
  • [SPARK-45143] Make PySpark compatible with PyArrow 13.0.0
  • [SPARK-45190][SPARK-48897] Make from_xml support StructType schema
  • [SPARK-45235] Support map and array parameters by sql()
  • [SPARK-45485] User agent improvements: Use SPARK_CONNECT_USER_AGENT env variable and include environment specific attributes
  • [SPARK-45506] Add ivy URI support to SparkConnect addArtifact
  • [SPARK-45509] Fix df column reference behavior for Spark Connect
  • [SPARK-45619] Apply the observed metrics to Observation object
  • [SPARK-45680] Release session
  • [SPARK-45733] Support multiple retry policies
  • [SPARK-45770] Introduce plan DataFrameDropColumns for Dataframe.drop
  • [SPARK-45851] Support multiple policies in scala client
  • [SPARK-46039] Upgrade grpcio* to 1.59.3 for Python 3.12
  • [SPARK-46048] Support DataFrame.groupingSets in Python Spark Connect
  • [SPARK-46085] Dataset.groupingSets in Scala Spark Connect client
  • [SPARK-46202] Expose new ArtifactManager APIs to support custom target directories
  • [SPARK-46229] Add applyInArrow to groupBy and cogroup in Spark Connect
  • [SPARK-46255] Support complex type -> string conversion
  • [SPARK-46620] Introduce a basic fallback mechanism for frame methods
  • [SPARK-46812] Make mapInPandas/mapInArrow support ResourceProfile
  • [SPARK-46919] Upgrade grpcio* and grpc-java to 1.62.x
  • [SPARK-47014] Implement methods dumpPerfProfile and dumpMemoryProfiles of SparkSession
  • [SPARK-47069] Introduce spark.profile.show/.dump for SparkSession-based profiling
  • [SPARK-47081] Support Query Execution Progress
  • [SPARK-47137] Add getAll to spark.conf for feature parity with Scala
  • [SPARK-47233] Client & Server logic for client-side streaming query listener
  • [SPARK-47276] Introduce spark.profile.clear for SparkSession-based profiling
  • [SPARK-47367] Support Python data sources with Spark Connect
  • [SPARK-47543] Infer dict as MapType from Pandas DataFrame (via new config)
  • [SPARK-47545] Dataset.observe for Scala Connect
  • [SPARK-47694] Make max message size configurable on the client side
  • [SPARK-47712] Allow connect plugins to create and process Datasets
  • [SPARK-47812] Support Serialization of SparkSession for ForEachBatch worker
  • [SPARK-47818] Introduce plan cache in SparkConnectPlanner to improve performance of Analyze requests
  • [SPARK-47828] Fix DataFrameWriterV2.overwrite failure due to invalid plan
  • [SPARK-47845] Support Column type in split function for Scala and Python
  • [SPARK-47909] Parent DataFrame class for Spark Connect and Spark Classic
  • [SPARK-48008] Support UDAFs in Spark Connect
  • [SPARK-48048] Added client side listener support for Scala
  • [SPARK-48058][SPARK-43727] UserDefinedFunction.returnType parse the DDL string
  • [SPARK-48112] Expose session in SparkConnectPlanner to plugins
  • [SPARK-48113] Allow Plugins to integrate with Spark Connect
  • [SPARK-48258] Checkpoint and localCheckpoint in Spark Connect
  • [SPARK-48278] Refine the string representation of Cast
  • [SPARK-48310] Cached properties must return copies
  • [SPARK-48336] Implement ps.sql in Spark Connect
  • [SPARK-48370] Checkpoint and localCheckpoint in Scala Spark Connect client
  • [SPARK-48510] Support UDAF toColumn API in Spark Connect
  • [SPARK-48555] Support using Columns as parameters for several functions (array_remove, array_position, etc.)
  • [SPARK-48569] Handle edge cases in query.name for streaming queries
  • [SPARK-48638] Add ExecutionInfo support for DataFrame
  • [SPARK-48639] Add Origin to RelationCommon
  • [SPARK-48648] Make SparkConnectClient.tags properly thread-local
  • [SPARK-48794] DataFrame.mergeInto support for Spark Connect (Scala & Python)
  • [SPARK-48831] Make default column name of cast compatible with Spark Classic
  • [SPARK-48960] Make spark-shell work with Spark Connect (--remote support)
  • [SPARK-49025] Make Column implementation agnostic
  • [SPARK-49027] Share Column API between Classic and Connect
  • [SPARK-49028] Create a shared SparkSession
  • [SPARK-49029] Create shared Dataset interface
  • [SPARK-49087] Distinguish UnresolvedFunction calling internal functions
  • [SPARK-49185] Reimplement kde plot with Spark SQL
  • [SPARK-49201] Reimplement hist plot with Spark SQL
  • [SPARK-49249][SPARK-49122] Add addArtifact API to the Spark SQL Core
  • [SPARK-49273] Origin support for Spark Connect Scala client
  • [SPARK-49282] Create a shared SparkSessionBuilder interface
  • [SPARK-49284] Create a shared Catalog interface
  • [SPARK-49413] Create a shared RuntimeConfig interface
  • [SPARK-49416] Add shared DataStreamReader interface
  • [SPARK-49417] Add shared StreamingQueryManager interface
  • [SPARK-49419] Create shared DataFrameStatFunctions
  • [SPARK-49429] Add shared DataStreamWriter interface
  • [SPARK-49526] Support Windows-style paths in ArtifactManager
  • [SPARK-49530] Support kde/density plots
  • [SPARK-49531] Support line plot with plotly backend
  • [SPARK-49595] Fix DataFrame.unpivot and DataFrame.melt in Spark Connect Scala Client
  • [SPARK-49626] Support horizontal/vertical bar plots
  • [SPARK-49907] Support spark.ml on Connect
  • [SPARK-49948] Add “precision” parameter to pandas on Spark box plot
  • [SPARK-50050] Make lit accept str/bool numpy ndarray
  • [SPARK-50054] Support histogram plots
  • [SPARK-50063] Add support for Variant in the Spark Connect Scala client
  • [SPARK-50075] DataFrame APIs for table-valued functions
  • [SPARK-50134][SPARK-50130] Support DataFrame API for SCALAR and EXISTS subqueries in Spark Connect
  • [SPARK-50134][SPARK-50132] Support DataFrame API for Lateral Join in Spark Connect
  • [SPARK-50227] Upgrade buf plugins to v28.3
  • [SPARK-50298] Implement verifySchema parameter of createDataFrame
  • [SPARK-50306] Support Python 3.13 in Spark Connect
  • [SPARK-50373] Prohibit Variant from set operations
  • [SPARK-50544] Implement StructType.toDDL
  • [SPARK-50710] Add support for optional client reconnection to sessions after release
  • [SPARK-50828] Deprecate pyspark.ml.connect
  • [SPARK-46465] Add Column.isNaN in PySpark
    • Adds the Column.isNaN function to PySpark Connect, matching Scala API parity.
  • [SPARK-41440] Implement DataFrame.randomSplit
    • Implements DataFrame.randomSplit for Spark Connect in Python.
  • [SPARK-41434] Initial LambdaFunction implementation
    • Adds basic support for LambdaFunction and an initial exists function in Spark Connect.
  • [SPARK-41464] Implement DataFrame.to
    • Implements DataFrame.to for Spark Connect in Python.
  • [SPARK-41364] Implement broadcast function
    • Implements the broadcast function in Spark Connect Python client.
  • [SPARK-41663] Implement the rest of Lambda functions
    • Completes Lambda function support in Spark Connect Python client (such as filter, map, etc.).
  • [SPARK-41673] Implement Column.astype
    • Adds Column.astype to Spark Connect Python for type casting.
  • [SPARK-41292][SPARK-41640][SPARK-41641] Implement Window functions
    • Adds support for window functions (Window.partitionBy, Window.orderBy, etc.) to Spark Connect.
  • [SPARK-41534] Setup initial client module for Spark Connect
    • Sets up the initial Scala/JVM client module for Spark Connect.
  • [SPARK-41503] Implement Partition Transformation Functions
    • Implements partition transformation functions for Spark Connect in Python.
  • [SPARK-41710] Implement Column.between
    • Adds Column.between method to Spark Connect in Python.
  • [SPARK-41707] Implement Catalog API in Spark Connect
    • Implements the catalog API for Spark Connect (such as listTables, listFunctions, etc.).
  • [SPARK-41690] Agnostic Encoders
    • Introduces “agnostic encoders” for mapping external types to Spark data types.
  • [SPARK-41722] Implement 3 missing time window functions
    • Implements window, window_time, and session_window in Spark Connect Python.
  • [SPARK-41723] Implement sequence function
    • Adds the sequence function for Spark Connect in Python.
  • [SPARK-41473] Implement format_number function
    • Implements format_number function in Spark Connect Python.
  • [SPARK-41724] Implement call_udf function
    • Allows users to call a UDF by name: call_udf("my_udf", col1, col2, ...).
  • [SPARK-41529] Implement SparkSession.stop
    • Implements SparkSession.stop to shut down a Spark Connect session server side.
  • [SPARK-41728] Implement unwrap_udt function
    • Adds the unwrap_udt function to Spark Connect in Python.
  • [SPARK-41731] Implement the column accessor (getItem, getField, getitem, etc.)
    • Allows indexing into arrays and structs in Spark Connect columns.
  • [SPARK-41740] Implement Column.name
    • Adds .name method for columns in Spark Connect Python.
  • [SPARK-41738] Mix ClientId in SparkSession cache
    • Fixes concurrency by mixing client ID into the SparkSession cache on the server.
  • [SPARK-41067] Implement DataFrame.stat.cov
    • Implements covariance calculation (df.stat.cov) for Spark Connect in Python.
  • [SPARK-41767] Implement Column.{withField, dropFields}
    • Adds support for adding/dropping struct fields in Spark Connect columns.
  • [SPARK-41292] Support Window in pyspark.sql.window namespace
    • Integrates Spark Connect’s window functionality into pyspark.sql.window.
  • [SPARK-41068] Implement DataFrame.stat.corr
    • Implements correlation calculation (df.stat.corr) for Spark Connect in Python.
  • [SPARK-41629] Support for Protocol Extensions in Relation and Expression
    • Adds plugin-based extension mechanism for custom Relation/Expression in Spark Connect.
  • [SPARK-41785] Implement GroupedData.mean
    • Adds the mean function to grouped data in Spark Connect.
  • [SPARK-41069] Implement DataFrame.approxQuantile and DataFrame.stat.approxQuantile
    • Adds approxQuantile for Spark Connect DataFrame/stat in Python.
  • [SPARK-41065] Implement DataFrame.freqItems and DataFrame.stat.freqItems
    • Adds freqItems to Spark Connect DataFrame in Python.
  • [SPARK-41066] Implement DataFrame.sampleBy and DataFrame.stat.sampleBy
    • Adds sampleBy to Spark Connect DataFrame in Python.
  • [SPARK-41810] Infer names from a list of dictionaries in SparkSession.createDataFrame
    • Improves column name inference when creating DataFrames from lists of dictionaries in Spark Connect.
  • [SPARK-41803] Add missing function log(arg1, arg2)
    • Implements two-argument log(base, expr) in Spark Connect Python.
  • [SPARK-41383][SPARK-41692][SPARK-41693] Implement rollup, cube, and pivot
    • Adds DataFrame.rollup, DataFrame.cube, and pivot to Spark Connect.
  • [SPARK-41333][SPARK-41737] Implement GroupedData.{min, max, avg, sum}
    • Implements the standard aggregate functions on grouped data for Spark Connect.
  • [SPARK-45680] Release session
    • Introduces ReleaseSession RPC to cancel all running jobs and remove the session server side.
  • [SPARK-45851] Support multiple policies in scala client
    • Adds multiple retry policies to the Scala Spark Connect client.
  • [SPARK-45990][SPARK-45987] Upgrade protobuf to 4.25.1 for Python 3.11 support
    • Updates protobuf library to fix issues under Python 3.11.
  • [SPARK-46202] Expose new ArtifactManager APIs to support custom target directories
    • Allows adding artifacts with a custom directory structure to remote Spark Connect sessions.
  • [SPARK-46284] Add session_user function to Python
    • Exposes the session_user function in PySpark for Connect, matching Scala parity.
  • [SPARK-46039] Upgrade grpcio* to 1.59.3 for Python 3.12
    • Updates gRPC libraries to support Python 3.12 and new grpc-inprocess.
  • [SPARK-46048] Support DataFrame.groupingSets in Python Spark Connect
    • Allows calling df.groupingSets(...) in Python Spark Connect for multi-dimensional grouping.
  • [SPARK-46085] Dataset.groupingSets in Scala Spark Connect client
    • Adds groupingSets(...) to Spark Connect in Scala.
  • [SPARK-46229] Add applyInArrow to groupBy and cogroup in Spark Connect
    • Implements applyInArrow in Spark Connect for grouped/cogrouped DataFrame operations.
  • [SPARK-46255] Support complex type -> string conversion
    • Allows string conversion of complex (list/struct) types in Spark Connect Python.
  • [SPARK-44753] XML: pyspark SQL XML reader/writer
  • [SPARK-48272] timestamp_diff function added
  • [SPARK-48369] timestamp_add function added
  • [SPARK-45509] Fix df column reference behavior for Spark Connect
    • Aligns column resolution in Spark Connect with classic Spark and provides better error messages.
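
As an illustration of the session_id support in the SPARK_REMOTE connection string (SPARK-44740), the following is a minimal sketch of how such a string might be assembled. It assumes the sc://host:port/;key=value connection string form used by Spark Connect clients and that session_id must be a UUID string. The helper name build_spark_remote, the host name, and the default port of 15002 are assumptions made for this example and are not part of these release notes.

```python
import uuid
from typing import Optional


def build_spark_remote(host: str, port: int = 15002,
                       session_id: Optional[str] = None) -> str:
    """Build a Spark Connect connection string (hypothetical helper).

    Assumes the sc://host:port/;key=value form. When session_id is given,
    it is validated as a UUID before being appended.
    """
    url = f"sc://{host}:{port}/"
    if session_id is not None:
        # uuid.UUID raises ValueError for malformed input and
        # normalizes the value to the canonical lowercase form.
        url += f";session_id={uuid.UUID(session_id)}"
    return url


if __name__ == "__main__":
    # Generate a fresh session ID and print the resulting connection string.
    print(build_spark_remote("localhost", session_id=str(uuid.uuid4())))
```

The resulting string could then be exported as the SPARK_REMOTE environment variable before starting a client. Validating the ID on the client side surfaces a malformed value before any connection attempt is made.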

System environment

  • Operating System: Ubuntu 24.04.2 LTS
  • Java: Zulu17.54+21-CA
  • Scala: 2.13.16
  • Python: 3.12.3
  • R: 4.4.2
  • Delta Lake: 3.3.1

Installed Python libraries

Library Version Library Version Library Version
annotated-types 0.7.0 anyio 4.6.2 argon2-cffi 21.3.0
argon2-cffi-bindings 21.2.0 arrow 1.3.0 asttokens 2.0.5
astunparse 1.6.3 async-lru 2.0.4 attrs 24.3.0
autocommand 2.2.2 azure-common 1.1.28 azure-core 1.34.0
azure-identity 1.20.0 azure-mgmt-core 1.5.0 azure-mgmt-web 8.0.0
azure-storage-blob 12.23.0 azure-storage-file-datalake 12.17.0 babel 2.16.0
backports.tarfile 1.2.0 beautifulsoup4 4.12.3 black 24.10.0
bleach 6.2.0 blinker 1.7.0 boto3 1.36.2
botocore 1.36.3 cachetools 5.5.1 certifi 2025.1.31
cffi 1.17.1 chardet 4.0.0 charset-normalizer 3.3.2
click 8.1.7 cloudpickle 3.0.0 comm 0.2.1
contourpy 1.3.1 cryptography 43.0.3 cycler 0.11.0
Cython 3.0.12 databricks-sdk 0.49.0 dbus-python 1.3.2
debugpy 1.8.11 decorator 5.1.1 defusedxml 0.7.1
Deprecated 1.2.13 distlib 0.3.9 docstring-to-markdown 0.11
executing 0.8.3 facets-overview 1.1.1 fastapi 0.115.12
fastjsonschema 2.21.1 filelock 3.18.0 fonttools 4.55.3
fqdn 1.5.1 fsspec 2023.5.0 gitdb 4.0.11
GitPython 3.1.43 google-api-core 2.20.0 google-auth 2.40.0
google-cloud-core 2.4.3 google-cloud-storage 3.1.0 google-crc32c 1.7.1
google-resumable-media 2.7.2 googleapis-common-protos 1.65.0 grpcio 1.67.0
grpcio-status 1.67.0 h11 0.14.0 httpcore 1.0.2
httplib2 0.20.4 httpx 0.27.0 idna 3.7
importlib-metadata 6.6.0 importlib_resources 6.4.0 inflect 7.3.1
iniconfig 1.1.1 ipyflow-core 0.0.209 ipykernel 6.29.5
ipython 8.30.0 ipython-genutils 0.2.0 ipywidgets 7.8.1
isodate 0.6.1 isoduration 20.11.0 jaraco.context 5.3.0
jaraco.functools 4.0.1 jaraco.text 3.12.1 jedi 0.19.2
Jinja2 3.1.5 jmespath 1.0.1 joblib 1.4.2
json5 0.9.25 jsonpointer 3.0.0 jsonschema 4.23.0
jsonschema-specifications 2023.7.1 jupyter-events 0.10.0 jupyter-lsp 2.2.0
jupyter_client 8.6.3 jupyter_core 5.7.2 jupyter_server 2.14.1
jupyter_server_terminals 0.4.4 jupyterlab 4.3.4 jupyterlab-pygments 0.1.2
jupyterlab-widgets 1.0.0 jupyterlab_server 2.27.3 kiwisolver 1.4.8
launchpadlib 1.11.0 lazr.restfulclient 0.14.6 lazr.uri 1.0.6
markdown-it-py 2.2.0 MarkupSafe 3.0.2 matplotlib 3.10.0
matplotlib-inline 0.1.7 mccabe 0.7.0 mdurl 0.1.0
mistune 2.0.4 mlflow-skinny 2.22.0 mmh3 5.1.0
more-itertools 10.3.0 msal 1.32.3 msal-extensions 1.3.1
mypy-extensions 1.0.0 nbclient 0.8.0 nbconvert 7.16.4
nbformat 5.10.4 nest-asyncio 1.6.0 nodeenv 1.9.1
notebook 7.3.2 notebook_shim 0.2.3 numpy 2.1.3
oauthlib 3.2.2 opentelemetry-api 1.32.1 opentelemetry-sdk 1.32.1
opentelemetry-semantic-conventions 0.53b1 overrides 7.4.0 packaging 24.1
pandas 2.2.3 pandocfilters 1.5.0 parso 0.8.4
pathspec 0.10.3 patsy 1.0.1 pexpect 4.8.0
pillow 11.1.0 pip 24.2 platformdirs 3.10.0
plotly 5.24.1 pluggy 1.5.0 prometheus_client 0.21.0
prompt-toolkit 3.0.43 proto-plus 1.26.1 protobuf 5.29.4
psutil 5.9.0 psycopg2 2.9.3 ptyprocess 0.7.0
pure-eval 0.2.2 pyarrow 19.0.1 pyasn1 0.4.8
pyasn1-modules 0.2.8 pyccolo 0.0.71 pycparser 2.21
pydantic 2.10.6 pydantic_core 2.27.2 pyflakes 3.2.0
Pygments 2.15.1 PyGObject 3.48.2 pyiceberg 0.9.0
PyJWT 2.10.1 pyodbc 5.2.0 pyparsing 3.2.0
pyright 1.1.394 pytest 8.3.5 python-dateutil 2.9.0.post0
python-json-logger 3.2.1 python-lsp-jsonrpc 1.1.2 python-lsp-server 1.12.0
pytoolconfig 1.2.6 pytz 2024.1 PyYAML 6.0.2
pyzmq 26.2.0 referencing 0.30.2 requests 2.32.3
rfc3339-validator 0.1.4 rfc3986-validator 0.1.1 rich 13.9.4
rope 1.12.0 rpds-py 0.22.3 rsa 4.9.1
s3transfer 0.11.3 scikit-learn 1.6.1 scipy 1.15.1
seaborn 0.13.2 Send2Trash 1.8.2 setuptools 74.0.0
six 1.16.0 smmap 5.0.0 sniffio 1.3.0
sortedcontainers 2.4.0 soupsieve 2.5 sqlparse 0.5.3
ssh-import-id 5.11 stack-data 0.2.0 starlette 0.46.2
statsmodels 0.14.4 strictyaml 1.7.3 tenacity 9.0.0
terminado 0.17.1 threadpoolctl 3.5.0 tinycss2 1.4.0
tokenize_rt 6.1.0 tomli 2.0.1 tornado 6.4.2
traitlets 5.14.3 typeguard 4.3.0 types-python-dateutil 2.9.0.20241206
typing_extensions 4.12.2 tzdata 2024.1 ujson 5.10.0
unattended-upgrades 0.1 uri-template 1.3.0 urllib3 2.3.0
uvicorn 0.34.2 virtualenv 20.29.3 wadllib 1.3.6
wcwidth 0.2.5 webcolors 24.11.1 webencodings 0.5.1
websocket-client 1.8.0 whatthepatch 1.0.2 wheel 0.45.1
widgetsnbextension 3.6.6 wrapt 1.17.0 yapf 0.40.2
zipp 3.21.0
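
Because package versions might change during the Beta, it can be useful to compare the versions listed above with what a cluster actually has installed. The following is a minimal sketch that uses only the Python standard library. The EXPECTED mapping copies three pins from the table above, and the helper names installed_version and find_mismatches are hypothetical, chosen for this example.

```python
from importlib import metadata
from typing import Callable, Dict, Optional, Tuple

# A few pins copied from the table above; extend as needed.
EXPECTED: Dict[str, str] = {
    "numpy": "2.1.3",
    "pandas": "2.2.3",
    "pyarrow": "19.0.1",
}


def installed_version(name: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(
    expected: Dict[str, str],
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Return {package: (expected, found)} for each package whose version differs."""
    mismatches = {}
    for name, want in expected.items():
        found = lookup(name)
        if found != want:
            mismatches[name] = (want, found)
    return mismatches


if __name__ == "__main__":
    # Report every pinned package that is missing or at a different version.
    for name, (want, found) in find_mismatches(EXPECTED).items():
        print(f"{name}: expected {want}, found {found}")
```

The lookup parameter exists only so that the comparison logic can be exercised without the packages being installed. In a notebook, the default lookup reads the live environment.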

Installed R libraries

R libraries are installed from the Posit Package Manager CRAN snapshot on 2025-03-20.

Library Version Library Version Library Version
arrow 19.0.1 askpass 1.2.1 assertthat 0.2.1
backports 1.5.0 base 4.4.2 base64enc 0.1-3
bigD 0.3.0 bit 4.6.0 bit64 4.6.0-1
bitops 1.0-9 blob 1.2.4 boot 1.3-30
brew 1.0-10 brio 1.1.5 broom 1.0.7
bslib 0.9.0 cachem 1.1.0 callr 3.7.6
caret 7.0-1 cellranger 1.1.0 chron 2.3-62
class 7.3-22 cli 3.6.4 clipr 0.8.0
clock 0.7.2 cluster 2.1.6 codetools 0.2-20
colorspace 2.1-1 commonmark 1.9.5 compiler 4.4.2
config 0.3.2 conflicted 1.2.0 cpp11 0.5.2
crayon 1.5.3 credentials 2.0.2 curl 6.2.1
data.table 1.17.0 datasets 4.4.2 DBI 1.2.3
dbplyr 2.5.0 desc 1.4.3 devtools 2.4.5
diagram 1.6.5 diffobj 0.3.5 digest 0.6.37
downlit 0.4.4 dplyr 1.1.4 dtplyr 1.3.1
e1071 1.7-16 ellipsis 0.3.2 evaluate 1.0.3
fansi 1.0.6 farver 2.1.2 fastmap 1.2.0
fontawesome 0.5.3 forcats 1.0.0 foreach 1.5.2
foreign 0.8-86 forge 0.2.0 fs 1.6.5
future 1.34.0 future.apply 1.11.3 gargle 1.5.2
generics 0.1.3 gert 2.1.4 ggplot2 3.5.1
gh 1.4.1 git2r 0.35.0 gitcreds 0.1.2
glmnet 4.1-8 globals 0.16.3 glue 1.8.0
googledrive 2.1.1 googlesheets4 1.1.1 gower 1.0.2
graphics 4.4.2 grDevices 4.4.2 grid 4.4.2
gridExtra 2.3 gsubfn 0.7 gt 0.11.1
gtable 0.3.6 hardhat 1.4.1 haven 2.5.4
highr 0.11 hms 1.1.3 htmltools 0.5.8.1
htmlwidgets 1.6.4 httpuv 1.6.15 httr 1.4.7
httr2 1.1.1 ids 1.0.1 ini 0.3.1
ipred 0.9-15 isoband 0.2.7 iterators 1.0.14
jquerylib 0.1.4 jsonlite 1.9.1 juicyjuice 0.1.0
KernSmooth 2.23-22 knitr 1.50 labeling 0.4.3
later 1.4.1 lattice 0.22-5 lava 1.8.1
lifecycle 1.0.4 listenv 0.9.1 lubridate 1.9.4
magrittr 2.0.3 markdown 1.13 MASS 7.3-60.0.1
Matrix 1.6-5 memoise 2.0.1 methods 4.4.2
mgcv 1.9-1 mime 0.13 miniUI 0.1.1.1
mlflow 2.20.4 ModelMetrics 1.2.2.2 modelr 0.1.11
munsell 0.5.1 nlme 3.1-164 nnet 7.3-19
numDeriv 2016.8-1.1 openssl 2.3.2 parallel 4.4.2
parallelly 1.42.0 pillar 1.10.1 pkgbuild 1.4.6
pkgconfig 2.0.3 pkgdown 2.1.1 pkgload 1.4.0
plogr 0.2.0 plyr 1.8.9 praise 1.0.0
prettyunits 1.2.0 pROC 1.18.5 processx 3.8.6
prodlim 2024.06.25 profvis 0.4.0 progress 1.2.3
progressr 0.15.1 promises 1.3.2 proto 1.0.0
proxy 0.4-27 ps 1.9.0 purrr 1.0.4
R6 2.6.1 ragg 1.3.3 randomForest 4.7-1.2
rappdirs 0.3.3 rcmdcheck 1.4.0 RColorBrewer 1.1-3
Rcpp 1.0.14 RcppEigen 0.3.4.0.2 reactable 0.4.4
reactR 0.6.1 readr 2.1.5 readxl 1.4.5
recipes 1.2.0 rematch 2.0.0 rematch2 2.1.2
remotes 2.5.0 reprex 2.1.1 reshape2 1.4.4
rlang 1.1.5 rmarkdown 2.29 RODBC 1.3-26
roxygen2 7.3.2 rpart 4.1.23 rprojroot 2.0.4
Rserve 1.8-15 RSQLite 2.3.9 rstudioapi 0.17.1
rversions 2.1.2 rvest 1.0.4 sass 0.4.9
scales 1.3.0 selectr 0.4-2 sessioninfo 1.2.3
shape 1.4.6.1 shiny 1.10.0 sourcetools 0.1.7-1
sparklyr 1.9.0 SparkR 4.0.0 sparsevctrs 0.3.1
spatial 7.3-17 splines 4.4.2 sqldf 0.4-11
SQUAREM 2021.1 stats 4.4.2 stats4 4.4.2
stringi 1.8.4 stringr 1.5.1 survival 3.5-8
swagger 5.17.14.1 sys 3.4.3 systemfonts 1.2.1
tcltk 4.4.2 testthat 3.2.3 textshaping 1.0.0
tibble 3.2.1 tidyr 1.3.1 tidyselect 1.2.1
tidyverse 2.0.0 timechange 0.3.0 timeDate 4041.110
tinytex 0.56 tools 4.4.2 tzdb 0.5.0
urlchecker 1.0.1 usethis 3.1.0 utf8 1.2.4
utils 4.4.2 uuid 1.2-1 V8 6.0.2
vctrs 0.6.5 viridisLite 0.4.2 vroom 1.6.5
waldo 0.6.1 whisker 0.4.1 withr 3.0.2
xfun 0.51 xml2 1.3.8 xopen 1.0.1
xtable 1.8-4 yaml 2.3.10 zeallot 0.1.0
zip 2.3.2

Installed Java and Scala libraries (Scala 2.13 cluster version)

Group ID Artifact ID Version
antlr antlr 2.7.7
com.amazonaws amazon-kinesis-client 1.12.0
com.amazonaws aws-java-sdk-autoscaling 1.12.638
com.amazonaws aws-java-sdk-cloudformation 1.12.638
com.amazonaws aws-java-sdk-cloudfront 1.12.638
com.amazonaws aws-java-sdk-cloudhsm 1.12.638
com.amazonaws aws-java-sdk-cloudsearch 1.12.638
com.amazonaws aws-java-sdk-cloudtrail 1.12.638
com.amazonaws aws-java-sdk-cloudwatch 1.12.638
com.amazonaws aws-java-sdk-cloudwatchmetrics 1.12.638
com.amazonaws aws-java-sdk-codedeploy 1.12.638
com.amazonaws aws-java-sdk-cognitoidentity 1.12.638
com.amazonaws aws-java-sdk-cognitosync 1.12.638
com.amazonaws aws-java-sdk-config 1.12.638
com.amazonaws aws-java-sdk-core 1.12.638
com.amazonaws aws-java-sdk-datapipeline 1.12.638
com.amazonaws aws-java-sdk-directconnect 1.12.638
com.amazonaws aws-java-sdk-directory 1.12.638
com.amazonaws aws-java-sdk-dynamodb 1.12.638
com.amazonaws aws-java-sdk-ec2 1.12.638
com.amazonaws aws-java-sdk-ecs 1.12.638
com.amazonaws aws-java-sdk-efs 1.12.638
com.amazonaws aws-java-sdk-elasticache 1.12.638
com.amazonaws aws-java-sdk-elasticbeanstalk 1.12.638
com.amazonaws aws-java-sdk-elasticloadbalancing 1.12.638
com.amazonaws aws-java-sdk-elastictranscoder 1.12.638
com.amazonaws aws-java-sdk-emr 1.12.638
com.amazonaws aws-java-sdk-glacier 1.12.638
com.amazonaws aws-java-sdk-glue 1.12.638
com.amazonaws aws-java-sdk-iam 1.12.638
com.amazonaws aws-java-sdk-importexport 1.12.638
com.amazonaws aws-java-sdk-kinesis 1.12.638
com.amazonaws aws-java-sdk-kms 1.12.638
com.amazonaws aws-java-sdk-lambda 1.12.638
com.amazonaws aws-java-sdk-logs 1.12.638
com.amazonaws aws-java-sdk-machinelearning 1.12.638
com.amazonaws aws-java-sdk-opsworks 1.12.638
com.amazonaws aws-java-sdk-rds 1.12.638
com.amazonaws aws-java-sdk-redshift 1.12.638
com.amazonaws aws-java-sdk-route53 1.12.638
com.amazonaws aws-java-sdk-s3 1.12.638
com.amazonaws aws-java-sdk-ses 1.12.638
com.amazonaws aws-java-sdk-simpledb 1.12.638
com.amazonaws aws-java-sdk-simpleworkflow 1.12.638
com.amazonaws aws-java-sdk-sns 1.12.638
com.amazonaws aws-java-sdk-sqs 1.12.638
com.amazonaws aws-java-sdk-ssm 1.12.638
com.amazonaws aws-java-sdk-storagegateway 1.12.638
com.amazonaws aws-java-sdk-sts 1.12.638
com.amazonaws aws-java-sdk-support 1.12.638
com.amazonaws aws-java-sdk-swf-libraries 1.11.22
com.amazonaws aws-java-sdk-workspaces 1.12.638
com.amazonaws jmespath-java 1.12.638
com.clearspring.analytics stream 2.9.8
com.databricks Rserve 1.8-3
com.databricks databricks-sdk-java 0.27.0
com.databricks jets3t 0.7.1-0
com.databricks.scalapb scalapb-runtime_2.13 0.4.15-11
com.esotericsoftware kryo-shaded 4.0.3
com.esotericsoftware minlog 1.3.0
com.fasterxml classmate 1.5.1
com.fasterxml.jackson.core jackson-annotations 2.18.2
com.fasterxml.jackson.core jackson-core 2.18.2
com.fasterxml.jackson.core jackson-databind 2.18.2
com.fasterxml.jackson.dataformat jackson-dataformat-cbor 2.18.2
com.fasterxml.jackson.dataformat jackson-dataformat-yaml 2.15.2
com.fasterxml.jackson.datatype jackson-datatype-joda 2.18.2
com.fasterxml.jackson.datatype jackson-datatype-jsr310 2.18.2
com.fasterxml.jackson.module jackson-module-paranamer 2.18.2
com.fasterxml.jackson.module jackson-module-scala_2.13 2.18.2
com.github.ben-manes.caffeine caffeine 2.9.3
com.github.blemale scaffeine_2.13 4.1.0
com.github.fommil jniloader 1.1
com.github.fommil.netlib native_ref-java 1.1
com.github.fommil.netlib native_ref-java 1.1-natives
com.github.fommil.netlib native_system-java 1.1
com.github.fommil.netlib native_system-java 1.1-natives
com.github.fommil.netlib netlib-native_ref-linux-x86_64 1.1-natives
com.github.fommil.netlib netlib-native_system-linux-x86_64 1.1-natives
com.github.luben zstd-jni 1.5.6-10
com.github.virtuald curvesapi 1.08
com.github.wendykierp JTransforms 3.1
com.google.api.grpc proto-google-common-protos 2.5.1
com.google.code.findbugs jsr305 3.0.0
com.google.code.gson gson 2.11.0
com.google.crypto.tink tink 1.16.0
com.google.errorprone error_prone_annotations 2.36.0
com.google.flatbuffers flatbuffers-java 24.3.25
com.google.guava failureaccess 1.0.2
com.google.guava guava 33.4.0-jre
com.google.guava listenablefuture 9999.0-empty-to-avoid-conflict-with-guava
com.google.j2objc j2objc-annotations 3.0.0
com.google.protobuf protobuf-java 3.25.5
com.google.protobuf protobuf-java-util 3.25.5
com.helger profiler 1.1.1
com.ibm.icu icu4j 75.1
com.jcraft jsch 0.1.55
com.lihaoyi sourcecode_2.13 0.1.9
com.microsoft.azure azure-data-lake-store-sdk 2.3.10
com.microsoft.sqlserver mssql-jdbc 12.8.0.jre11
com.microsoft.sqlserver mssql-jdbc 12.8.0.jre8
com.ning compress-lzf 1.1.2
com.sun.mail javax.mail 1.5.2
com.sun.xml.bind jaxb-core 2.2.11
com.sun.xml.bind jaxb-impl 2.2.11
com.tdunning json 1.8
com.thoughtworks.paranamer paranamer 2.8
com.trueaccord.lenses lenses_2.13 0.4.13
com.twitter chill-java 0.10.0
com.twitter chill_2.13 0.10.0
com.twitter util-app_2.13 19.8.1
com.twitter util-core_2.13 19.8.1
com.twitter util-function_2.13 19.8.1
com.twitter util-jvm_2.13 19.8.1
com.twitter util-lint_2.13 19.8.1
com.twitter util-registry_2.13 19.8.1
com.twitter util-stats_2.13 19.8.1
com.typesafe config 1.4.3
com.typesafe.scala-logging scala-logging_2.13 3.9.2
com.uber h3 3.7.3
com.univocity univocity-parsers 2.9.1
com.zaxxer HikariCP 4.0.3
com.zaxxer SparseBitSet 1.3
commons-cli commons-cli 1.9.0
commons-codec commons-codec 1.17.2
commons-collections commons-collections 3.2.2
commons-dbcp commons-dbcp 1.4
commons-fileupload commons-fileupload 1.5
commons-httpclient commons-httpclient 3.1
commons-io commons-io 2.18.0
commons-lang commons-lang 2.6
commons-logging commons-logging 1.1.3
commons-pool commons-pool 1.5.4
dev.ludovic.netlib arpack 3.0.3
dev.ludovic.netlib blas 3.0.3
dev.ludovic.netlib lapack 3.0.3
info.ganglia.gmetric4j gmetric4j 1.0.10
io.airlift aircompressor 2.0.2
io.delta delta-sharing-client_2.13 1.3.0
io.dropwizard.metrics metrics-annotation 4.2.30
io.dropwizard.metrics metrics-core 4.2.30
io.dropwizard.metrics metrics-graphite 4.2.30
io.dropwizard.metrics metrics-healthchecks 4.2.30
io.dropwizard.metrics metrics-jetty9 4.2.30
io.dropwizard.metrics metrics-jmx 4.2.30
io.dropwizard.metrics metrics-json 4.2.30
io.dropwizard.metrics metrics-jvm 4.2.30
io.dropwizard.metrics metrics-servlets 4.2.30
io.github.java-diff-utils java-diff-utils 4.15
io.netty netty-all 4.1.118.Final
io.netty netty-buffer 4.1.118.Final
io.netty netty-codec 4.1.118.Final
io.netty netty-codec-http 4.1.118.Final
io.netty netty-codec-http2 4.1.118.Final
io.netty netty-codec-socks 4.1.118.Final
io.netty netty-common 4.1.118.Final
io.netty netty-handler 4.1.118.Final
io.netty netty-handler-proxy 4.1.118.Final
io.netty netty-resolver 4.1.118.Final
io.netty netty-tcnative-boringssl-static 2.0.70.Final-db-r0-linux-aarch_64
io.netty netty-tcnative-boringssl-static 2.0.70.Final-db-r0-linux-x86_64
io.netty netty-tcnative-boringssl-static 2.0.70.Final-db-r0-osx-aarch_64
io.netty netty-tcnative-boringssl-static 2.0.70.Final-db-r0-osx-x86_64
io.netty netty-tcnative-boringssl-static 2.0.70.Final-db-r0-windows-x86_64
io.netty netty-tcnative-classes 2.0.70.Final
io.netty netty-transport 4.1.118.Final
io.netty netty-transport-classes-epoll 4.1.118.Final
io.netty netty-transport-classes-kqueue 4.1.118.Final
io.netty netty-transport-native-epoll 4.1.118.Final
io.netty netty-transport-native-epoll 4.1.118.Final-linux-aarch_64
io.netty netty-transport-native-epoll 4.1.118.Final-linux-riscv64
io.netty netty-transport-native-epoll 4.1.118.Final-linux-x86_64
io.netty netty-transport-native-kqueue 4.1.118.Final-osx-aarch_64
io.netty netty-transport-native-kqueue 4.1.118.Final-osx-x86_64
io.netty netty-transport-native-unix-common 4.1.118.Final
io.prometheus simpleclient 0.16.1-databricks
io.prometheus simpleclient_common 0.16.1-databricks
io.prometheus simpleclient_dropwizard 0.16.1-databricks
io.prometheus simpleclient_pushgateway 0.16.1-databricks
io.prometheus simpleclient_servlet 0.16.1-databricks
io.prometheus simpleclient_servlet_common 0.16.1-databricks
io.prometheus simpleclient_tracer_common 0.16.1-databricks
io.prometheus simpleclient_tracer_otel 0.16.1-databricks
io.prometheus simpleclient_tracer_otel_agent 0.16.1-databricks
io.prometheus.jmx collector 0.18.0
jakarta.annotation jakarta.annotation-api 1.3.5
jakarta.servlet jakarta.servlet-api 4.0.3
jakarta.validation jakarta.validation-api 2.0.2
jakarta.ws.rs jakarta.ws.rs-api 2.1.6
javax.activation activation 1.1.1
javax.annotation javax.annotation-api 1.3.2
javax.el javax.el-api 2.2.4
javax.jdo jdo-api 3.0.1
javax.transaction jta 1.1
javax.transaction transaction-api 1.1
javax.xml.bind jaxb-api 2.2.11
javolution javolution 5.5.1
jline jline 2.14.6
joda-time joda-time 2.13.0
net.java.dev.jna jna 5.8.0
net.razorvine pickle 1.5
net.sf.jpam jpam 1.1
net.sf.opencsv opencsv 2.3
net.sf.supercsv super-csv 2.2.0
net.snowflake snowflake-ingest-sdk 0.9.6
net.sourceforge.f2j arpack_combined_all 0.1
org.acplt.remotetea remotetea-oncrpc 1.1.2
org.antlr ST4 4.0.4
org.antlr antlr-runtime 3.5.2
org.antlr antlr4-runtime 4.13.1
org.antlr stringtemplate 3.2.1
org.apache.ant ant 1.10.11
org.apache.ant ant-jsch 1.10.11
org.apache.ant ant-launcher 1.10.11
org.apache.arrow arrow-format 18.2.0
org.apache.arrow arrow-memory-core 18.2.0
org.apache.arrow arrow-memory-netty 18.2.0
org.apache.arrow arrow-memory-netty-buffer-patch 18.2.0
org.apache.arrow arrow-vector 18.2.0
org.apache.avro avro 1.12.0
org.apache.avro avro-ipc 1.12.0
org.apache.avro avro-mapred 1.12.0
org.apache.commons commons-collections4 4.4
org.apache.commons commons-compress 1.27.1
org.apache.commons commons-crypto 1.1.0
org.apache.commons commons-lang3 3.17.0
org.apache.commons commons-math3 3.6.1
org.apache.commons commons-text 1.13.0
org.apache.curator curator-client 5.7.1
org.apache.curator curator-framework 5.7.1
org.apache.curator curator-recipes 5.7.1
org.apache.datasketches datasketches-java 6.1.1
org.apache.datasketches datasketches-memory 3.0.2
org.apache.derby derby 10.14.2.0
org.apache.hadoop hadoop-client-runtime 3.4.1
org.apache.hive hive-beeline 2.3.10
org.apache.hive hive-cli 2.3.10
org.apache.hive hive-jdbc 2.3.10
org.apache.hive hive-llap-client 2.3.10
org.apache.hive hive-llap-common 2.3.10
org.apache.hive hive-serde 2.3.10
org.apache.hive hive-shims 2.3.10
org.apache.hive hive-storage-api 2.8.1
org.apache.hive.shims hive-shims-0.23 2.3.10
org.apache.hive.shims hive-shims-common 2.3.10
org.apache.hive.shims hive-shims-scheduler 2.3.10
org.apache.httpcomponents httpclient 4.5.14
org.apache.httpcomponents httpcore 4.4.16
org.apache.ivy ivy 2.5.3
org.apache.logging.log4j log4j-1.2-api 2.24.3
org.apache.logging.log4j log4j-api 2.24.3
org.apache.logging.log4j log4j-core 2.24.3
org.apache.logging.log4j log4j-layout-template-json 2.24.3
org.apache.logging.log4j log4j-slf4j2-impl 2.24.3
org.apache.orc orc-core 2.1.1-shaded-protobuf
org.apache.orc orc-format 1.1.0-shaded-protobuf
org.apache.orc orc-mapreduce 2.1.1-shaded-protobuf
org.apache.orc orc-shims 2.1.1
org.apache.poi poi 5.4.1
org.apache.poi poi-ooxml 5.4.1
org.apache.poi poi-ooxml-full 5.4.1
org.apache.poi poi-ooxml-lite 5.4.1
org.apache.thrift libfb303 0.9.3
org.apache.thrift libthrift 0.16.0
org.apache.ws.xmlschema xmlschema-core 2.3.1
org.apache.xbean xbean-asm9-shaded 4.26
org.apache.xmlbeans xmlbeans 5.3.0
org.apache.yetus audience-annotations 0.13.0
org.apache.zookeeper zookeeper 3.9.3
org.apache.zookeeper zookeeper-jute 3.9.3
org.checkerframework checker-qual 3.43.0
org.codehaus.janino commons-compiler 3.0.16
org.codehaus.janino janino 3.0.16
org.datanucleus datanucleus-api-jdo 4.2.4
org.datanucleus datanucleus-core 4.1.17
org.datanucleus datanucleus-rdbms 4.1.19
org.datanucleus javax.jdo 3.2.0-m3
org.eclipse.jetty jetty-client 9.4.53.v20231009
org.eclipse.jetty jetty-continuation 9.4.53.v20231009
org.eclipse.jetty jetty-http 9.4.53.v20231009
org.eclipse.jetty jetty-io 9.4.53.v20231009
org.eclipse.jetty jetty-jndi 9.4.53.v20231009
org.eclipse.jetty jetty-plus 9.4.53.v20231009
org.eclipse.jetty jetty-proxy 9.4.53.v20231009
org.eclipse.jetty jetty-security 9.4.53.v20231009
org.eclipse.jetty jetty-server 9.4.53.v20231009
org.eclipse.jetty jetty-servlet 9.4.53.v20231009
org.eclipse.jetty jetty-servlets 9.4.53.v20231009
org.eclipse.jetty jetty-util 9.4.53.v20231009
org.eclipse.jetty jetty-util-ajax 9.4.53.v20231009
org.eclipse.jetty jetty-webapp 9.4.53.v20231009
org.eclipse.jetty jetty-xml 9.4.53.v20231009
org.eclipse.jetty.websocket websocket-api 9.4.53.v20231009
org.eclipse.jetty.websocket websocket-client 9.4.53.v20231009
org.eclipse.jetty.websocket websocket-common 9.4.53.v20231009
org.eclipse.jetty.websocket websocket-server 9.4.53.v20231009
org.eclipse.jetty.websocket websocket-servlet 9.4.53.v20231009
org.fusesource.leveldbjni leveldbjni-all 1.8
org.glassfish.hk2 hk2-api 2.6.1
org.glassfish.hk2 hk2-locator 2.6.1
org.glassfish.hk2 hk2-utils 2.6.1
org.glassfish.hk2 osgi-resource-locator 1.0.3
org.glassfish.hk2.external aopalliance-repackaged 2.6.1
org.glassfish.hk2.external jakarta.inject 2.6.1
org.glassfish.jersey.containers jersey-container-servlet 2.41
org.glassfish.jersey.containers jersey-container-servlet-core 2.41
org.glassfish.jersey.core jersey-client 2.41
org.glassfish.jersey.core jersey-common 2.41
org.glassfish.jersey.core jersey-server 2.41
org.glassfish.jersey.inject jersey-hk2 2.41
org.hibernate.validator hibernate-validator 6.2.5.Final
org.ini4j ini4j 0.5.4
org.javassist javassist 3.29.2-GA
org.jboss.logging jboss-logging 3.4.1.Final
org.jdbi jdbi 2.63.1
org.jetbrains annotations 17.0.0
org.jline jline 3.27.1-jdk8
org.joda joda-convert 1.7
org.jodd jodd-core 3.5.2
org.json4s json4s-ast_2.13 4.0.7
org.json4s json4s-core_2.13 4.0.7
org.json4s json4s-jackson-core_2.13 4.0.7
org.json4s json4s-jackson_2.13 4.0.7
org.json4s json4s-scalap_2.13 4.0.7
org.lz4 lz4-java 1.8.0-databricks-1
org.mlflow mlflow-spark_2.13 2.9.1
org.objenesis objenesis 3.3
org.postgresql postgresql 42.6.1
org.roaringbitmap RoaringBitmap 1.2.1
org.rocksdb rocksdbjni 9.8.4
org.rosuda.REngine REngine 2.1.0
org.scala-lang scala-compiler_2.13 2.13.16
org.scala-lang scala-library_2.13 2.13.16
org.scala-lang scala-reflect_2.13 2.13.16
org.scala-lang.modules scala-collection-compat_2.13 2.11.0
org.scala-lang.modules scala-java8-compat_2.13 0.9.1
org.scala-lang.modules scala-parallel-collections_2.13 1.2.0
org.scala-lang.modules scala-parser-combinators_2.13 2.4.0
org.scala-lang.modules scala-xml_2.13 2.3.0
org.scala-sbt test-interface 1.0
org.scalacheck scalacheck_2.13 1.18.0
org.scalactic scalactic_2.13 3.2.19
org.scalanlp breeze-macros_2.13 2.1.0
org.scalanlp breeze_2.13 2.1.0
org.scalatest scalatest-compatible 3.2.19
org.scalatest scalatest-core_2.13 3.2.19
org.scalatest scalatest-diagrams_2.13 3.2.19
org.scalatest scalatest-featurespec_2.13 3.2.19
org.scalatest scalatest-flatspec_2.13 3.2.19
org.scalatest scalatest-freespec_2.13 3.2.19
org.scalatest scalatest-funspec_2.13 3.2.19
org.scalatest scalatest-funsuite_2.13 3.2.19
org.scalatest scalatest-matchers-core_2.13 3.2.19
org.scalatest scalatest-mustmatchers_2.13 3.2.19
org.scalatest scalatest-propspec_2.13 3.2.19
org.scalatest scalatest-refspec_2.13 3.2.19
org.scalatest scalatest-shouldmatchers_2.13 3.2.19
org.scalatest scalatest-wordspec_2.13 3.2.19
org.scalatest scalatest_2.13 3.2.19
org.slf4j jcl-over-slf4j 2.0.16
org.slf4j jul-to-slf4j 2.0.16
org.slf4j slf4j-api 2.0.16
org.slf4j slf4j-simple 1.7.25
org.threeten threeten-extra 1.8.0
org.tukaani xz 1.10
org.typelevel algebra_2.13 2.8.0
org.typelevel cats-kernel_2.13 2.8.0
org.typelevel spire-macros_2.13 0.18.0
org.typelevel spire-platform_2.13 0.18.0
org.typelevel spire-util_2.13 0.18.0
org.typelevel spire_2.13 0.18.0
org.wildfly.openssl wildfly-openssl 1.1.3.Final
org.xerial sqlite-jdbc 3.42.0.0
org.xerial.snappy snappy-java 1.1.10.3
org.yaml snakeyaml 2.0
oro oro 2.0.8
pl.edu.icm JLargeArrays 1.5
software.amazon.cryptools AmazonCorrettoCryptoProvider 2.4.1-linux-x86_64
stax stax-api 1.0.1
