Tuning your cluster performance
Dell Data Analytics Engine, powered by Starburst Enterprise platform (SEP), is a more feature-rich version of Trino (formerly PrestoSQL) that provides enhanced query performance, security, connectivity, and ease of use.
Learn how to size your cluster and the machines in it for the best possible performance on your workload in this training video, presented by one of our founders, Dain Sundstrom. For your convenience, we've divided the video into topic sections and linked the relevant parts of our documentation below.
General tuning strategy & baseline advice
Running time: ~9 min.
Topics: |
---|
Starting big. |
Stabilizing, then tuning. |
Options to disable. |
Cluster sizing, and how SEP uses CPU and memory resources
Running time: ~19 min.
Topics: |
---|
How memory affects availability. |
How memory affects concurrency. |
Machine sizing and its impact
Running time: ~38 min.
Topics: |
---|
Memory and memory allocation. |
Shared join hash. |
Distributed join. |
Skew. |
Machine sizes and types. |
Spilling. |
Small clusters. |
Additional resources on resource management and spilling in SEP:
Tuning the workload
Running time: ~16 min.
Topics: |
---|
Precomputing. |
Connectors. |
Hive data organization
Running time: ~16 min.
Organize your data for the Hive connector. |
---|
Hive partitioning and bucketing. |
ORC and Parquet. |
File size. |
Bad Parquet files. |
Rewrite table with the ORC writer. |
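
As a rough illustration of the last topic, the sketch below builds a `CREATE TABLE ... AS SELECT` statement that rewrites an existing Hive table into ORC, optionally partitioned and bucketed. It is not taken from the video. The `format`, `partitioned_by`, `bucketed_by`, and `bucket_count` table properties belong to the Trino Hive connector; the helper `build_orc_rewrite` and all table and column names are hypothetical and only serve the example.

```python
def build_orc_rewrite(source, target, columns,
                      partitioned_by=(), bucketed_by=(), bucket_count=None):
    """Return a CTAS statement that copies `source` into a new ORC table.

    Hypothetical helper for illustration; it only assembles SQL text and
    does not connect to a cluster or validate identifiers.
    """
    # The Hive connector expects partition columns to be the last columns
    # of the table, so move them to the end of the SELECT list.
    ordered = [c for c in columns if c not in partitioned_by]
    ordered += list(partitioned_by)

    def sql_array(names):
        return "ARRAY[" + ", ".join("'" + n + "'" for n in names) + "]"

    props = ["format = 'ORC'"]
    if partitioned_by:
        props.append("partitioned_by = " + sql_array(partitioned_by))
    if bucketed_by:
        if bucket_count is None:
            raise ValueError("bucket_count is required when bucketed_by is set")
        props.append("bucketed_by = " + sql_array(bucketed_by))
        props.append("bucket_count = " + str(bucket_count))

    return (
        "CREATE TABLE " + target + "\n"
        "WITH (" + ", ".join(props) + ")\n"
        "AS SELECT " + ", ".join(ordered) + "\n"
        "FROM " + source
    )


if __name__ == "__main__":
    # Assumed catalog, schema, table, and column names.
    print(build_orc_rewrite(
        source="hive.sales.orders_raw",
        target="hive.sales.orders_orc",
        columns=["order_date", "order_id", "customer_id", "total"],
        partitioned_by=["order_date"],
        bucketed_by=["customer_id"],
        bucket_count=16,
    ))
```

The generated statement can be run from any SQL client connected to the cluster. Whether to partition or bucket at all, and on which columns, depends on your data and query patterns, which the video discusses.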
Making queries faster
Running time: ~13 min.
Topics: |
---|
What to look for in a query. |
Using more hardware. |
Underutilization. |
Hive caching. |
For more in-depth information on this topic, watch our query optimization training videos.