Installing Starburst Enterprise in a Kubernetes cluster#

When all requirements have been met, you are ready to proceed with the initial installation or upgrade of Dell Data Analytics Engine, powered by Starburst Enterprise platform (SEP).

Before you begin

This topic assumes that you understand the following key concepts and components from other, related topics:

Familiarizing yourself with the concepts in the preceding list simplifies management of credentials, installation, and subsequent updates.

Other relevant topics to review before proceeding are as follows:

After you have completed the initial installation or upgrade as described in this topic, you are ready to begin fine-tuning your SEP configuration.

Overview#

The workflow to install SEP with Helm charts on your cluster is as follows:

  • Establish access to the Helm chart repository and configure credentials to access the Docker registry in a registry access YAML file for use in all of your Starburst k8s deployments.

  • Create a YAML file specific to each chart and cluster, for example sep-prod-setup.yaml for your production cluster.

  • Ensure your Helm/kubectl configuration points at the correct cluster with kubectl cluster-info.

  • Run Helm to install the chart.

  • Access the cluster and check for success.
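The workflow above can be sketched as a single shell function. This is a minimal sketch, not an official script: the function name `install_sep`, the environment variables `HARBOR_USER` and `HARBOR_PASSWORD`, and the chart repository URL are assumptions for illustration. Use the repository details you received from Starburst Support.

```shell
# Assumed chart repository URL; replace it with the one provided to you.
HELM_REPO_URL="${HELM_REPO_URL:-https://harbor.starburstdata.net/chartrepo/starburstdata}"

# Hypothetical helper: install_sep <release> <chart-version> <setup-values-file>
install_sep() {
  local release="$1" chart_version="$2" setup_file="$3"

  # 1. Register the Helm chart repository under the alias "starburstdata".
  helm repo add --username "$HARBOR_USER" --password "$HARBOR_PASSWORD" \
    starburstdata "$HELM_REPO_URL"
  helm repo update

  # 2. Confirm that kubectl points at the intended cluster.
  kubectl config current-context
  kubectl cluster-info

  # 3. Install the chart, or upgrade it if the release already exists.
  helm upgrade "$release" starburstdata/starburst-enterprise \
    --install \
    --version "$chart_version" \
    --values ./registry-access.yaml \
    --values "$setup_file"

  # 4. Check that the coordinator and worker pods are starting.
  kubectl get pods
}
```

For example, `install_sep my-sep 413.18.0 ./sep-prod-setup.yaml` runs the steps in order for a release named my-sep.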

Each Helm chart includes a values.yaml file that sets reasonable defaults. These defaults do not include catalog definitions or other settings that your cluster needs.
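To see which defaults a chart ships with, you can write its values.yaml to a local file with `helm show values`. The following is a sketch that assumes the repository was added under the alias starburstdata; the function name and output file name are illustrative only.

```shell
# Hypothetical helper: dump_default_values <chart-version>
# Saves the chart's default values for reference while you write override files.
dump_default_values() {
  local chart_version="$1"
  helm show values starburstdata/starburst-enterprise \
    --version "$chart_version" > "default-values-$chart_version.yaml"
}
```

Running `dump_default_values 413.18.0` creates default-values-413.18.0.yaml in the current directory.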

You have to change your values file to add or update any configuration and run a Helm upgrade to apply the changes to the cluster.

Iterate on the configuration in the values YAML file with a minimal setup until you have a working system. Depending on your cluster, you may have to adjust the memory requirements for the worker and coordinator, as well as other settings. Inspect the messages reported by kubectl or Octant to determine details.
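When a pod does not start, the following kubectl commands are a common way to find the relevant messages. This is a sketch with hypothetical function names; the namespace and pod name are whatever your deployment uses.

```shell
# Hypothetical helper: inspect_namespace [namespace]
# Lists pods and recent events. Events often explain scheduling failures,
# such as a node that lacks the requested memory or CPU.
inspect_namespace() {
  local namespace="${1:-default}"
  kubectl get pods --namespace "$namespace"
  kubectl get events --namespace "$namespace" --sort-by=.lastTimestamp
}

# Hypothetical helper: inspect_pod <pod-name> [namespace]
# Shows the detailed status of one pod and the end of its log.
inspect_pod() {
  local pod="$1" namespace="${2:-default}"
  kubectl describe pod "$pod" --namespace "$namespace"
  kubectl logs "$pod" --namespace "$namespace" --tail=100
}
```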

After you have a running cluster, be sure to store the values YAML file as a reference, and then add more details as required for the specific cluster.

Install SEP#

The SEP installation is managed with the starburst-enterprise Helm chart. Installation and upgrades are done with the helm upgrade command with the following options at a minimum:

  • the --install flag used with the helm upgrade command allows you to consistently use the same command to both install and upgrade SEP

  • a YAML file with your registry access credentials, discussed later in this topic

  • a minimal YAML configuration file with the memory and CPU resource configurations for coordinator: and worker: that reflect your cluster’s resources, for example sep-prod-setup.yaml

  • a YAML file with one or more catalogs defined, for example sep-prod-catalogs.yaml

The following example command assumes the registry access file, the production cluster configuration file, and the catalog configuration file are located in the current directory:

$ helm upgrade my-sep starburstdata/starburst-enterprise \
    --install \
    --version 413.18.0 \
    --values ./registry-access.yaml \
    --values ./sep-prod-setup.yaml \
    --values ./sep-prod-catalogs.yaml

The version value is available from the Helm repository.
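One way to list the chart versions that the repository offers is `helm search repo` with the `--versions` flag. The sketch below assumes the repository alias is starburstdata; the function name is illustrative.

```shell
# Hypothetical helper: refreshes the local repository index, then lists
# every available version of the starburst-enterprise chart.
list_sep_chart_versions() {
  helm repo update
  helm search repo starburstdata/starburst-enterprise --versions
}
```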

The default values result in one coordinator and two worker nodes in the cluster, with very specific memory and CPU resources that likely will not match your cluster’s specifications. We strongly suggest that you initially install SEP with the minimal changes needed to reflect your cluster’s memory and CPU resources; then make small, focused customizations to suit your organization’s needs.
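As an illustration of a minimal override file, the following writes a small sep-prod-setup.yaml. The key layout under `coordinator:` and `worker:` and all of the sizes shown are assumptions, not values taken from this topic. Confirm the key names against the values.yaml of your chart version, and size the resources to your nodes.

```shell
# Write a minimal override file to the current directory.
# The keys and sizes below are placeholders; verify them against the
# chart's values.yaml before use.
cat > sep-prod-setup.yaml <<'EOF'
coordinator:
  resources:
    memory: "16Gi"
    requests:
      cpu: 4
    limits:
      cpu: 4
worker:
  replicas: 2
  resources:
    memory: "16Gi"
    requests:
      cpu: 4
    limits:
      cpu: 4
EOF
```

Keeping this file limited to sizing makes it easier to see which change caused a problem when you start adding further customizations.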

The following sections describe the initial installation and how to begin customizing SEP.

Create this one file before you begin#

No matter what other configurations you need for your deployments, create your registry-access.yaml file for reuse across all clusters and charts. This helps to ensure a smooth install and deployment experience.

In your registry-access.yaml file, add the following to configure access for Starburst’s Harbor registry:

registryCredentials:
  enabled: true
  registry: harbor.starburstdata.net/starburstdata
  username: <yourusername>
  password: <yourpassword>

The contents of this file override the default, empty values and ensure that the Helm charts can download the required Docker containers.

If you have multiple clusters, this same file is used for all of them. You can also use the same file for the optional Ranger and Hive Metastore Service charts. Other configurations should be managed in separate files as per best practices.

As an alternative to using a username and password directly, you can use a Kubernetes secret by creating a secret containing the access token for your registry and configuring your pod to use the secret.
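A registry secret of this kind can be created with `kubectl create secret docker-registry`. The sketch below uses a hypothetical function name and assumes the credentials are held in the `HARBOR_USER` and `HARBOR_PASSWORD` environment variables; the registry host is taken from the registry value shown earlier.

```shell
# Hypothetical helper: create_registry_secret <secret-name> [registry-host]
# Stores registry credentials in a Kubernetes secret in the current namespace.
create_registry_secret() {
  local secret_name="$1" registry_host="${2:-harbor.starburstdata.net}"
  kubectl create secret docker-registry "$secret_name" \
    --docker-server="$registry_host" \
    --docker-username="$HARBOR_USER" \
    --docker-password="$HARBOR_PASSWORD"
}
```

The secret must then be referenced from your values file. The exact key for that is not shown here; check the values.yaml of the chart you are installing.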

Using private registries and repositories instead of Starburst Harbor#

Typically, you must use your username and password for accessing the Helm chart repositories and the Docker registry on the Starburst Harbor instance. You can instead use private Docker and Helm chart repositories.

Initial installation checklist#

The following checklist describes the initial installation process:

  1. Gather repository credentials for the Helm chart repository and the Docker registry provided by Starburst Support.

  2. Create the registry-access.yaml file to override the default, empty values.

  3. Create your correctly-sized Kubernetes cluster.

  4. Ensure your Helm/kubectl configuration points at the correct cluster with kubectl cluster-info.

  5. Add your license file.

Note

We strongly suggest using a shared secret to add the license file.

  6. Create a minimal YAML configuration file with the memory and CPU resource configurations for coordinator: and worker: that reflect your cluster’s available resources, for example sep-prod-setup.yaml.

Warning

Do not skip this step. The default values for memory and CPU resources likely vary significantly from your cluster’s available resources. If you attempt to run SEP with the defaults, SEP may not start.

  7. Run Helm to install the default chart, as well as any override YAML files using the --values argument, as shown in the following example:

$ helm upgrade sep-prod-cluster starburstdata/starburst-enterprise \
    --install \
    --version 413.18.0 \
    --values ./registry-access.yaml \
    --values ./sep-prod-setup.yaml

  8. Determine the IP address or the DNS hostname of the coordinator by running the kubectl get pods command.

  9. Use the IP address or hostname to verify the coordinator is running by accessing the Starburst Enterprise web UI. You can use the same information to connect with the CLI or the JDBC driver.
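Two of the checklist steps map directly to kubectl commands: storing the license file as a secret, and looking up the coordinator's address. The following sketch uses hypothetical function names, and the secret name and license file path are whatever you choose. How the chart references the secret is not covered here.

```shell
# Hypothetical helper: add_license_secret <secret-name> <license-file>
# Stores the license file in a generic Kubernetes secret.
add_license_secret() {
  local secret_name="$1" license_file="$2"
  kubectl create secret generic "$secret_name" --from-file="$license_file"
}

# Hypothetical helper: lists pods with their IP addresses and nodes, then the
# services, which show any externally reachable address for the coordinator.
show_coordinator_endpoints() {
  kubectl get pods --output wide
  kubectl get services
}
```

For example, `add_license_secret sep-license ./starburstdata.license` followed later by `show_coordinator_endpoints`, where both the secret name and the file name are illustrative.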

Update to a new release#

If you have created focused, well-managed override files following our best practices guide, the upgrade process is a straightforward Helm-based process. As with any enterprise-scale application, we recommend that you test upgrades from one release to another using a test cluster. This allows you to catch any configuration changes and update Helm charts before deploying into production:

  1. Review the Helm charts release notes for any relevant changes that affect your override files. For example, there may be new configuration options to add, or deprecated properties to remove.

  2. Review the SEP release notes for new capabilities and breaking changes.

  3. Create a backup of your query logger database. Consult the documentation for the particular database that you use.

  4. Run Helm with the updated Helm chart version and with the updated YAML configuration files, as in the following example:

$ helm upgrade my-sep-staging-cluster starburstdata/starburst-enterprise \
    --install \
    --version 413.18.0 \
    --values ./registry-access.yaml \
    --values ./sep-stage-setup.yaml
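Around an upgrade it can help to record the current release state, preview the change, and know how to roll back. The sketch below relies only on standard Helm commands (`helm get values`, `helm history`, `helm upgrade --dry-run`, `helm rollback`); the function names and the backup file name are hypothetical.

```shell
# Hypothetical helper: snapshot_release <release>
# Saves the user-supplied values of the running release and lists its revisions.
snapshot_release() {
  local release="$1"
  helm get values "$release" --output yaml > "$release-values-before-upgrade.yaml"
  helm history "$release"
}

# Hypothetical helper: preview_upgrade <release> <chart-version> <setup-values-file>
# Renders the upgrade without applying it to the cluster.
preview_upgrade() {
  local release="$1" chart_version="$2" setup_file="$3"
  helm upgrade "$release" starburstdata/starburst-enterprise \
    --install \
    --dry-run \
    --version "$chart_version" \
    --values ./registry-access.yaml \
    --values "$setup_file"
}

# Hypothetical helper: rollback_release <release> <revision>
# Returns the release to a revision number listed by `helm history`.
rollback_release() {
  helm rollback "$1" "$2"
}
```

A rollback restores the Helm release only. It does not restore external state such as the query logger database, which is why the backup step above still applies.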

Next steps#

  • Customize SEP.

  • Define and configure catalogs.

  • Install and configure a metastore if you use object storage, the Starburst caching service, or data products.

  • [Optional] Install and configure the Starburst cache service.