Release 429-e LTS (29 Nov 2023)#
The 429-e release includes all improvements from the following Trino releases:
Highlights since 423-e#
Added support for PyStarburst.
Added read-only public preview support for Unity Catalog as a metastore.
Added support for CREATE OR REPLACE TABLE statements in Delta Lake.
Added predicate pushdown support to the MongoDB connector.
Breaking changes since 423-e#
The SEP backend service has been updated to require PostgreSQL 12.0+ when using PostgreSQL as the underlying RDBMS.
TIMESTAMP type mapping between MySQL and Trino is no longer TIMESTAMP to TIMESTAMP. The new conversion is MySQL TIMESTAMP to Trino TIMESTAMP WITH TIME ZONE. Depending on the query, mapping from MySQL TIMESTAMP to Trino TIMESTAMP may result in an error message.
The deprecated.hive.metastore.glue-read-properties-based-column-statistics Hive Metastore configuration property and underlying functionality have been removed. You must remove this configuration property or the cluster fails to start.
The updated base Docker image for SEP no longer includes curl, vi, nano, sed, awk, grep, and other popular command line tools. Starburst recommends using an init container with a base image that includes your needed command line tools. Guidance on using init containers and selecting suitable base images can be found in our init container documentation.
SEP 427-e uses a base system image that does not contain a system-wide trust store. Trusted, self-signed certificates must now be added to the Java distribution CA certificates located under $JAVA_HOME/lib/security/cacerts.
The legacy parse-decimal-literals-as-double configuration property has been removed. Clusters that use this property must have it removed from configuration or the cluster does not start.
The following deprecated task writer configuration properties have been removed:
task.writer-count, replaced by task.min-writer-count.
task.partitioned-writer-count, replaced by task.max-writer-count.
task.scale-writers.max-writer-count, replaced by task.max-writer-count.
writer-min-size, replaced by writer-scaling-min-data-processed.
You must remove these properties from the cluster configuration and replace them with these replacement properties, or the cluster does not start.
The Snowflake distributed connector is now deprecated and is planned to be removed in a future SEP release, in favor of the improved Snowflake parallel connector. Existing catalogs that use the Snowflake distributed connector must be migrated to the Snowflake parallel connector.
The RPM package service daemon script is now deprecated and is planned to be removed in a future SEP release. Configurations that rely on this script must be updated to use the systemctl daemon script instead.
As of the 429-e release, table functions such as query are qualified with the system.builtin schema. This change results in Access Denied errors in circumstances where a role was granted a privilege to execute table functions not qualified under a schema. Permissions for these roles must now be updated accordingly to execute table functions when qualified with an appropriate schema.
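Several of the breaking changes above prevent the cluster from starting if a removed property is still present in the configuration. The following is a minimal sketch of a pre-upgrade check, not an official Starburst tool. It scans the text of a properties file for the removed property names listed above and reports the suggested replacement for each. The function name find_removed_properties and the mapping REMOVED_PROPERTIES are hypothetical, and the replacement names task.min-writer-count and task.max-writer-count are assumed to be the Trino property names the list refers to.

```python
import re

# Removed property -> replacement named in the release notes.
# None means the property must simply be deleted.
# The task.min-writer-count / task.max-writer-count names are assumptions.
REMOVED_PROPERTIES = {
    "deprecated.hive.metastore.glue-read-properties-based-column-statistics": None,
    "parse-decimal-literals-as-double": None,
    "task.writer-count": "task.min-writer-count",
    "task.partitioned-writer-count": "task.max-writer-count",
    "task.scale-writers.max-writer-count": "task.max-writer-count",
    "writer-min-size": "writer-scaling-min-data-processed",
}


def find_removed_properties(properties_text):
    """Return (line_number, name, replacement) for each removed property found."""
    findings = []
    for line_number, raw_line in enumerate(properties_text.splitlines(), start=1):
        line = raw_line.strip()
        # Skip blank lines and comment lines in .properties syntax.
        if not line or line.startswith(("#", "!")):
            continue
        # Treat everything before the first '=' or ':' as the key.
        key = re.split(r"[=:]", line, maxsplit=1)[0].strip()
        if key in REMOVED_PROPERTIES:
            findings.append((line_number, key, REMOVED_PROPERTIES[key]))
    return findings
```

Run the check against each configuration and catalog properties file before upgrading. An empty result means none of the listed properties were found in that file. The sketch only detects the properties; it does not rewrite the file, because two of the removed properties map to the same replacement and the right value needs a manual decision.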
429-e initial changes#
General#
Added support for publishing data products that contain decimal literals.
Updated usage metrics to upload data collected between the previous upload and the coordinator shutdown or restart.
Fixed issue that prevented the Run and troubleshoot option in the query editor from working when built-in access control is enabled.
Security#
Added session logout to OAuth 2.0 providers when logging out from the SEP web UI.
Fixed issue that prevented tables and columns inside information_schema from being displayed when built-in access control is used.
Fixed JavaScript policy evaluation in Privacera.
Hive connector#
Added support for flushing the filesystem cache for tables with the flush_filesystem_cache system procedure.
Delta Lake connector#
Added support for CREATE OR REPLACE TABLE statements.
MongoDB connector#
Added predicate pushdown support.
Snowflake connector#
Updated connectors to use fully parallel mode by default for more query shapes.
SQL Server connector#
Added the sqlserver.database-prefix-for-schema.enabled catalog configuration property that allows SQL Server catalogs to access multiple databases.
429-e.0 changes (29 Nov 2023)#
Improved support for concurrent updates of table statistics in Glue.
Added masking for additional sensitive values in log files.
Added casting of char fields, if necessary, to varchar type in Hive view translations.
Added support for RENAME SCHEMA and RENAME TABLE when the snowflake.database-prefix-for-schema.enabled property is set to true.
Remediated CVE-2023-41900.
Fixed incorrect results for queries involving an aggregation in a correlated subquery.
Fixed incorrect results for queries involving ORDER BY and window functions with ordered frames.
Fixed launcher start command not working with default directories.
Fixed possible JVM crash when reading short decimal columns in parquet files created by Impala. Applies to the Hive, Hudi, Delta, and Iceberg connectors.
Fixed incorrect results when a query contains several != or NOT IN predicates in MongoDB catalogs.
429-e.1 changes (21 Dec 2023)#
Improved query planning time on Hive tables without statistics generated.
Fixed long query planning times for queries with many local exchanges.
Fixed query failure when reading parquet column index for timestamped columns in Hive, Delta, Iceberg, and Hudi tables.
Fixed incorrect results for LIKE with some strings containing repeated substrings.
Fixed coordinator memory leak.
429-e.2 changes (18 Jan 2024)#
Fixed incorrect results on parquet files containing page indexes when the query has filters on multiple columns in Hive, Delta, and Hudi tables.
Fixed an issue with the Run and troubleshoot option of the Run button writing to empty directories without the option being selected.
429-e.3 changes (14 Feb 2024)#
Fixed Teradata custom dates format.
Fixed query failure when reading array columns.
Fixed a bug where an entire directory is skipped from schema discovery if at least one file matched the excludePatterns option.
Fixed out-of-bound (OOB) telemetry null pointer exception in parallel Snowflake connector.
Fixed complex expression pushdown in the Redshift connector.
Fixed a bug where query history displayed queries of another user.
429-e.4 changes (11 Mar 2024)#
Updated Kubernetes external secret operator.
Fixed UI authentication for large authentication tokens.
Fixed incorrect results for DATETIMEOFFSET values before the year 1400.
Fixed query failure when using char types with the reverse() function.
Fixed potential incorrect results when using the ST_Centroid() and ST_Buffer() functions for tiny geometries.
Fixed schema, table, and function visibility in BIAC filtering.
Fixed a bug where column statistics created in SEP would not be visible in Hive when using CDP 7.
429-e.5 changes (28 Mar 2024)#
Fixed an issue which caused the sync_partition_metadata operation to fail when partition paths had case changes.
Restored support for SymlinkTextInputFormat for text formats.
Fixed reading Delta Lake files with encoded characters on Azure.
Fixed failure when reading certain Avro data with UNION data types.
429-e.6 changes (17 Apr 2024)#
Enabled PyStarburst dataframe API by default.
Fixed possible worker crashes when running aggregation queries due to out-of-memory error.
Fixed incorrect results when querying a table being modified concurrently.
Fixed handling of union options in Hive and Avro to allow coercion to a single type.
Fixed a bug that caused the creation of materialized views to fail when using MySQL as the cache service backend database if materialized_view_definitions is longer than 64K characters.
429-e.7 changes (20 May 2024)#
Fixed potential query failure due to worker nodes running out of memory in concurrent scenarios.
Fixed incorrect result with deletion vector on Delta partitioned table.
Fixed correctness bug in constant literal distinct aggregation.
Fixed Prometheus whiteListObjectNames being overwritten when KEDA is enabled.
429-e.8 changes (14 Jun 2024)#
Fixed potential failure when reading ORC files larger than 2GB.
Fixed startup failure when fault-tolerant execution is enabled with Google Cloud Storage exchange.
Fixed potential loss of a query completion event when multiple queries fail at the same time.
Backported IMDSv2 service metadata access.
429-e.9 changes (28 Jun 2024)#
Fixed incorrect results when specifying a value for the cassandra.partition-size-for-batch-select configuration property.
Fixed failure when writing to tables with Iceberg VARBINARY values.
Fixed correctness issue on receivers refresh that could cause query hanging.
429-e.10 changes (11 Jul 2024)#
Added encoding to error code in OAuth2 callback handler.
Fixed reading empty files from S3 and GCS.
Fixed issue syncing partition metadata which could cause data deletion.
429-e.11 changes (29 Jul 2024)#
Fixed bug preventing use of Starburst security in Delta Lake connector.
429-e.12 changes (14 Aug 2024)#
Fixed optimizer timeout for certain queries involving aggregations and CASE expressions.
Fixed failure when adding new columns with a decimal type.
Fixed failure to read Hive tables migrated to Iceberg with Apache Spark.
Fixed issue that caused the error ‘Multiple masks on a single column are not supported’ to occur unintentionally.
429-e.13 changes (30 Aug 2024)#
Fixed query failure when file-based network topology is configured with the node-scheduler.network-topology.file configuration property.
429-e.14 changes (13 Sep 2024)#
Fixed a bug that caused cluster metrics to be created with incorrect intervals and subsequently led to loss of cluster metrics data.
Fixed Run and troubleshoot feature when insights.authorized-groups configuration property contains authorized groups.
Fixed numeric overflow during managed statistics computation for large tables in Teradata mode session.
429-e.15 was skipped.
429-e.16 changes (18 Oct 2024)#
Fixed OpenX JSON decoding a JSON array line that resulted in data being written to the wrong output column.
Fixed reading large Prometheus responses.
Fixed failures for count(*) queries with predicates containing non-ASCII strings. Applies to the Elasticsearch connector.
429-e.17 was skipped.
429-e.18 changes (4 Nov 2024)#
Updated the sync_partition_metadata procedure to use the value of the hive.metastore.partition-batch-size.max configuration property. The default batch size changed from 1000 to 100.
429-e.19 changes (14 Nov 2024)#
Fixed memory leak in InMemoryEventClient within cache service.