Starburst Atlas plugin#
The Starburst Atlas plugin for Dell Data Analytics Engine, powered by Starburst Enterprise platform (SEP), allows changes in a SEP cluster’s catalog, schema, table, or column configuration to be automatically pushed to an Apache Atlas server by means of an Apache Kafka message bus. The plugin works by publishing change information about SEP objects to the Kafka topic ATLAS_HOOK. Any change in the configuration of Atlas entity names for SEP objects is published to the topic ATLAS_ENTITIES.
Note
The plugin requires a valid Starburst Enterprise license.
Configuration#
The Starburst Atlas plugin is implemented as an event listener. To enable the event listener, create a configuration file on the coordinator with any name, such as etc/atlas-listener.properties. In this file, the event-listener.name property must be set to starburst-atlas. The configuration includes the Kafka broker details and Atlas service URL, as well as the username and password for accessing the Atlas server.
If your SEP cluster has more than one event listener, identify all listener configuration files in a comma-separated list in the event-listener.config-files property of the coordinator’s config.properties file. For example:
event-listener.config-files=etc/atlas-listener.properties,etc/http-event-listener.properties
The following is an example of a simple Atlas plugin configuration file:
event-listener.name=starburst-atlas
atlas.cluster.name=fastqueries
atlas.kafka.bootstrap.servers=kafka.example.com:9092
atlas.server.url=https://atlas.example.com:21000
atlas.username=admin
atlas.password=s3cr3t1v3
The Atlas plugin configuration properties table in the Reference section shows the options for different circumstances.
TLS/HTTPS settings#
All network traffic between the Atlas plugin and the Atlas server uses TLS. If your Atlas server uses a globally trusted certificate and does not require client certificates, then to connect you only need to specify the server’s https:// URL with atlas.username and atlas.password.
If your Atlas server uses a site-specific certificate, or requires client certificates, then configure those settings in an XML settings file. Identify the location of this file with the following property in the event listener configuration file on the coordinator:
atlas.ssl-config-file=etc/atlas-tls-settings.xml
The following shows the template for the TLS settings XML file. As is standard for TLS, if you provide a globally trusted certificate in the keystore setting, there is no need to provide a truststore path, because the globally trusted certificate relies on the Certificate Authorities listed in the standard Java cacerts file.
If a Hadoop credential file is required by your Atlas server, specify the path to a JCEKS keystore. This keystore format is like a JKS file, but secured with stronger Triple DES encryption. An example command for creating such a file follows the template.
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>hadoop.ssl.require.client.cert</name>
<value>true|false</value>
</property>
<property>
<name>ssl.client.keystore.location</name>
<value>Path to KeyStore location</value>
</property>
<property>
<name>ssl.client.truststore.location</name>
<value>Path to TrustStore location</value>
</property>
<property>
<name>hadoop.security.credential.provider.path</name>
<value>jceks://file/Path to Credential File</value>
</property>
</configuration>
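To create the JCEKS credential file referenced by hadoop.security.credential.provider.path, you can use the Hadoop CLI. The following is an illustrative sketch: the file path /etc/atlas-credentials.jceks is an assumption to adapt to your environment, and ssl.client.truststore.password is the standard Hadoop alias for the client truststore password.
hadoop credential create ssl.client.truststore.password -provider jceks://file/etc/atlas-credentials.jceks
The command prompts for the secret value and writes it, encrypted, into the JCEKS file, so the password never appears in plain text in your configuration.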
Kerberos settings#
If your Atlas server uses Kerberos authentication, specify the following required configuration properties:
atlas.authentication-type=KERBEROS
atlas.kerberos.principal=admin/atlas.cluster@EXAMPLE.COM
atlas.kerberos.keytab=/etc/krb5.keytab
atlas.kerberos.config=/etc/krb5.conf
Pass configuration to Kafka#
Additional properties related to Kafka security, as described in the Kafka documentation, can be passed in a properties file. Specify the path to this file with the atlas.config.resource property. For example:
atlas.config.resource=etc/kafka.properties
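The contents of this file are standard Kafka client properties. As a minimal sketch, assuming a broker secured with SASL authentication over TLS, the file might contain the following; the mechanism, credentials, and truststore path are placeholders for your own values:
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="kafka-user" password="kafka-secret";
ssl.truststore.location=/etc/kafka/truststore.jks
ssl.truststore.password=changeit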
For example, there are two Kafka properties that can be used to change the default names of the Kafka topics used by this Atlas plugin, as shown in the following table:
Property | Description
---|---
 | Name of the Kafka topic to which the Starburst Atlas plugin publishes SEP change information. Default is ATLAS_HOOK.
 | Name of the Kafka topic to which the Starburst Atlas plugin publishes any changes in Atlas entity names for SEP objects. Default is ATLAS_ENTITIES.
Pre-built Atlas hooks#
Certain data systems have an integrated Atlas hook at the metastore level. These include the following systems that also have a SEP connector: Hive and Kafka. For such catalogs, SEP only needs to push lineage details instead of pushing each change. The mapping for these cases is provided by the following configuration property in the event listener configuration file:
atlas.catalog-cluster-mapping=catalogName,AtlasNamespace
The catalogName refers to one catalog on your SEP cluster. The AtlasNamespace refers to a unique qualified name created by Atlas to categorize entities and types from the same source.
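For illustration, assuming a hypothetical SEP catalog named hive_sales whose Hive metastore is registered in Atlas under the namespace production_hive, the mapping is:
atlas.catalog-cluster-mapping=hive_sales,production_hive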
Reference#
The Starburst Atlas plugin configuration uses the following properties:
Property | Description | Required
---|---|---
event-listener.name | Must be starburst-atlas. | yes
atlas.cluster.name | Arbitrary name for this SEP cluster. | yes
atlas.kafka.bootstrap.servers | Network name or IP address and port of the Kafka server. | yes
atlas.server.url | URL with port of the Atlas server. | yes
atlas.username | Atlas username. | yes
atlas.password | Atlas password, if required. | no
atlas.ssl-config-file | Path to an optional XML configuration file with custom TLS settings. | no
atlas.authentication-type | Set to KERBEROS to use Kerberos authentication. | no
atlas.kerberos.principal | Principal name on the Atlas server in standard Kerberos format. | no
atlas.kerberos.keytab | Path to a Kerberos key table file. | no
atlas.kerberos.config | Path to a Kerberos config file; typically /etc/krb5.conf. | no
atlas.config.resource | Path to a properties file to be passed to the associated Kafka server. | no
atlas.catalog-cluster-mapping | Comma-separated SEP catalog name and Atlas namespace with Atlas hook. | no
 | Comma-separated list of clientTags values. Atlas events are not generated for queries by connecting clients that have any of these clientTags. | no
The Kafka network name and Atlas URL must be valid and accessible from the SEP coordinator.
Note that the Atlas server cannot receive SEP events until the SEP types are uploaded to Atlas using the Atlas CLI. Be sure to complete the Atlas setup steps.