Starburst Delta Lake connector#

The Starburst Delta Lake connector is an extended version of the Delta Lake connector. Its configuration and usage are identical.

The following improvements are included:

Requirements#

To connect to Databricks Delta Lake, you need:

Extensions#

The connector includes all the functionality described in the Delta Lake connector as well as these features:

Configuration#

The Starburst Delta Lake connector is configured identically to the Delta Lake connector, with the addition of the following catalog session properties for the Parquet writer:

Parquet catalog session properties#

| Property name | Description |
| --- | --- |
| parquet_optimized_writer_enabled | Enables the experimental, native Parquet writer. |

SQL support#

The connector supports all of the SQL statements listed in the Delta Lake connector documentation.

The following improvements are included:

SQL security#

To use SQL security operation statements, set the delta.security property in your catalog properties file to sql-standard. See SQL standard based authorization for more information.
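As a minimal sketch, the corresponding entry in a catalog properties file could look like the following. The file name is a hypothetical example; only the delta.security line comes from the text above.

```properties
# Hypothetical catalog file, for example etc/catalog/example.properties
# Enables SQL security operation statements for this catalog.
delta.security=sql-standard
```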

Performance#

The connector includes a number of performance improvements, detailed in the following sections:

Dynamic row filtering#

Dynamic filtering, including dynamic row filtering, is enabled by default. Row filtering improves the effectiveness of dynamic filtering for a connector by using dynamic filters to remove unnecessary rows during a table scan. It is especially powerful for selective filters on columns that are not used for partitioning or bucketing, or whose values do not naturally appear in any clustered order.

As a result, the amount of data read from storage and transferred across the network is further reduced, which improves query performance and lowers cost.

You can use the following properties to configure dynamic row filtering:

Dynamic row filtering properties#

| Property name | Description |
| --- | --- |
| dynamic-row-filtering.enabled | Toggle dynamic row filtering. Defaults to true. Catalog session property name is dynamic_row_filtering_enabled. |
| dynamic-row-filtering.selectivity-threshold | Sets the maximum fraction of the table's rows that a dynamic filter may select. Above this threshold, dynamic row filters are not used. Defaults to 0.7. Catalog session property name is dynamic_row_filtering_selectivity_threshold. |
| dynamic-row-filtering.wait-timeout | Duration to wait for completion of dynamic row filtering. Defaults to 0. With the default, query processing proceeds without waiting for the dynamic row filter; the filter is collected asynchronously and used as soon as it becomes available. Catalog session property name is dynamic_row_filtering_wait_timeout. |
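As a minimal sketch, the properties listed above could be set in a catalog properties file as follows. The file name is a hypothetical example, the values shown are the documented defaults written out explicitly, and the unit-suffixed duration form is an assumption, since the text above gives the default only as 0.

```properties
# Hypothetical catalog file, for example etc/catalog/example.properties

# Dynamic row filtering is on by default; set to false to disable it.
dynamic-row-filtering.enabled=true

# Skip dynamic row filters when they would select more than 70% of the table's rows.
dynamic-row-filtering.selectivity-threshold=0.7

# Do not block the scan waiting for the filter; it is applied once available.
# Assumption: durations are written with a unit suffix ("0s" rather than "0").
dynamic-row-filtering.wait-timeout=0s
```

Lowering the selectivity threshold restricts row filtering to more selective filters, while a non-zero wait timeout trades some startup latency for a filter that is available from the start of the scan.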

Starburst Cached Views#

The connector supports table scan redirection to improve performance and reduce load on the data source.

Security#

The connector includes a number of security-related features, detailed in the following sections.

Authorization#

The connector supports standard Hive security for authorization under the delta.security configuration property. For more information, see the Delta Lake connector authorization configuration options.

Built-in access control#

If you have enabled built-in access control for SEP, you must add the following configuration to all Delta Lake catalogs:

delta.security=starburst