Spark Manager (v1)

About

APIs to manage your Spark jobs and clusters

Uploads

The combined size of an uploaded set of text files, binary files, or secrets cannot exceed 10MB, and each individual file or secret cannot exceed 1MB.
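
As a rough client-side sketch (not part of the API), you can pre-check these limits before calling the upload endpoints below. Whether the limits apply before or after Base64 encoding is not specified here, so this assumes raw bytes:

MAX_ITEM_BYTES = 1 * 1024 * 1024    # 1MB per file or secret
MAX_TOTAL_BYTES = 10 * 1024 * 1024  # 10MB for the combined set

def validate_upload(items: dict[str, bytes]) -> None:
    # Raise if any single item, or the combined set, exceeds the documented limits.
    total = 0
    for name, data in items.items():
        if len(data) > MAX_ITEM_BYTES:
            raise ValueError(f"{name!r} exceeds the 1MB per-item limit")
        total += len(data)
    if total > MAX_TOTAL_BYTES:
        raise ValueError("combined upload exceeds the 10MB limit")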

Manager Service

Spark management

connectSpark

Start a new Spark Connect driver

Authorizations:
accessToken
Request Body schema: application/json
required
applicationName
string

The application name. If not provided, a name will be generated.

sparkProperties
object

Any Spark configuration/properties to set

jars
Array of strings

Any jars to pass in the --jars Spark argument

pythonFiles
Array of strings

Any Python files to pass in the --py-files Spark argument

files
Array of strings

Any files to pass in the --files Spark argument

archives
Array of strings

Any archives to pass in the --archives Spark argument

environmentVariables
object

Any environment variables to set

resourcePool
string

Optional - the resource pool to use (you must have permission to use it)

secretUploads
Array of strings

Optional - secret uploads

Secrets will be set as environment variables in the Spark driver and executors.

fileUploads
Array of objects

Optional - file uploads (read only)

inlineFileUploads
Array of objects

Optional - inline file uploads. See Uploads for more details and limits. (read only)

options
Array of strings
Items Value: "EncryptCommunication"

Details:

  • "EncryptCommunication": Enable encryption for communication between the driver and executors

Responses

Request samples

Content type
application/json
{
  "applicationName": "string",
  "sparkProperties": {},
  "jars": [],
  "pythonFiles": [],
  "files": [],
  "archives": [],
  "environmentVariables": {},
  "resourcePool": "string",
  "secretUploads": [],
  "fileUploads": [],
  "inlineFileUploads": [],
  "options": []
}

Response samples

Content type
application/json
{
  "sparkId": "string",
  "serverSparkVersion": "string"
}
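
A minimal request sketch in Python. The base URL and route are placeholders (this reference does not list concrete paths), so substitute your deployment's endpoint for connectSpark; the body fields match the schema above.

import requests

# Hypothetical endpoint; this reference does not list concrete routes.
BASE_URL = "https://spark-manager.example.com/v1"
HEADERS = {"Authorization": "Bearer <accessToken>"}

body = {
    "applicationName": "my-connect-session",
    "sparkProperties": {"spark.executor.instances": "2"},
    "environmentVariables": {"LOG_LEVEL": "INFO"},
}
resp = requests.post(f"{BASE_URL}/spark/connect", json=body, headers=HEADERS)  # assumed route
resp.raise_for_status()
driver = resp.json()
print(driver["sparkId"], driver["serverSparkVersion"])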

listEvent

List system events

Authorizations:
accessToken

Responses

Response samples

Content type
application/json
[
  {}
]

batchSpark

Submit and run a batch job

Authorizations:
accessToken
Request Body schema: application/json
required
applicationResource
required
string

The application resource to run - must be on an accessible object store

mainClass
string

The main class of the batch job

cronSchedule
string

Optional CRON schedule. If provided, runs the job on the given schedule.

See Wikipedia's CRON article for details on CRON schedules.

ttlSecondsAfterFinished
integer <int32>

Optional.

Normally, the Spark driver remains after the job completes. ttlSecondsAfterFinished specifies the number of seconds after completion before the driver becomes eligible for deletion/cleanup.

applicationArguments
Array of strings

Any application arguments

applicationName
string

The application name. If not provided, a name will be generated.

sparkProperties
object

Any Spark configuration/properties to set

jars
Array of strings

Any jars to pass in the --jars Spark argument

pythonFiles
Array of strings

Any Python files to pass in the --py-files Spark argument

files
Array of strings

Any files to pass in the --files Spark argument

archives
Array of strings

Any archives to pass in the --archives Spark argument

environmentVariables
object

Any environment variables to set

resourcePool
string

Optional - the resource pool to use (you must have permission to use it)

secretUploads
Array of strings

Optional - secret uploads

Secrets will be set as environment variables in the Spark driver and executors.

fileUploads
Array of objects

Optional - file uploads (read only)

inlineFileUploads
Array of objects

Optional - inline file uploads. See Uploads for more details and limits. (read only)

options
Array of strings
Items Value: "EncryptCommunication"

Details:

  • "EncryptCommunication": Enable encryption for communication between the driver and executors

Responses

Request samples

Content type
application/json
{
  "applicationResource": "string",
  "mainClass": "string",
  "cronSchedule": "string",
  "ttlSecondsAfterFinished": 0,
  "applicationArguments": [],
  "applicationName": "string",
  "sparkProperties": {},
  "jars": [],
  "pythonFiles": [],
  "files": [],
  "archives": [],
  "environmentVariables": {},
  "resourcePool": "string",
  "secretUploads": [],
  "fileUploads": [],
  "inlineFileUploads": [],
  "options": []
}

Response samples

Content type
application/json
{
  "sparkId": "string",
  "serverSparkVersion": "string"
}
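
A companion sketch for batch submission, again with an assumed route. applicationResource must point at an accessible object store, and cronSchedule turns the one-off job into a scheduled one.

import requests

BASE_URL = "https://spark-manager.example.com/v1"  # hypothetical
HEADERS = {"Authorization": "Bearer <accessToken>"}

body = {
    "applicationResource": "s3a://my-bucket/jobs/etl.jar",  # must be on an accessible object store
    "mainClass": "com.example.Etl",
    "applicationArguments": ["--date", "2024-01-01"],
    "cronSchedule": "0 2 * * *",      # optional: run daily at 02:00
    "ttlSecondsAfterFinished": 3600,  # driver becomes cleanable an hour after completion
}
resp = requests.post(f"{BASE_URL}/spark/batch", json=body, headers=HEADERS)  # assumed route
resp.raise_for_status()
print(resp.json()["sparkId"])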

getSparkLogs

Get the log output of a batch job or a connect driver

Authorizations:
accessToken
path Parameters
sparkId
required
string

A Spark instance (batch job, connect driver, or cluster)

Responses

Response samples

Content type
application/json
{
  "fieldViolations": []
}

createPresignedUrl

Create pre-signed URLs for the given bucket, key, and credentials

Authorizations:
accessToken
Request Body schema: application/json
required
accessKey
required
string

The proxy AccessKey provided by your administrator

secretKey
required
string

The proxy SecretKey provided by your administrator

region
required
string

The S3 region of the bucket

bucket
required
string

The bucket for creating the pre-signed URLs

key
required
string

The key for creating the pre-signed URLs

Responses

Request samples

Content type
application/json
{
  "accessKey": "string",
  "secretKey": "string",
  "region": "string",
  "bucket": "string",
  "key": "string"
}

Response samples

Content type
application/json
{
  "presignedUrls": {}
}
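
One plausible flow, sketched with a placeholder route and credentials: request the URL map, then use the verb-keyed entry directly against the object store.

import requests

BASE_URL = "https://spark-manager.example.com/v1"  # hypothetical
HEADERS = {"Authorization": "Bearer <accessToken>"}

req = {
    "accessKey": "<proxy-access-key>",
    "secretKey": "<proxy-secret-key>",
    "region": "us-east-1",
    "bucket": "my-bucket",
    "key": "jobs/etl.jar",
}
resp = requests.post(f"{BASE_URL}/presigned-url", json=req, headers=HEADERS)  # assumed route
resp.raise_for_status()
urls = resp.json()["presignedUrls"]  # keyed by HTTP verb (GET, PUT, POST, DELETE)

# Upload a local artifact straight to the object store via the PUT URL.
with open("etl.jar", "rb") as f:
    requests.put(urls["PUT"], data=f).raise_for_status()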

getSparkLogsWithIndex

Get the log output of a batch job or a connect driver. logsId is the log index to return: some cluster types have multiple nodes/workers. Pass 0 to get the main logs, then increase the index to get the other logs.

Authorizations:
accessToken
path Parameters
sparkId
required
string

A Spark instance (batch job, connect driver, or cluster)

logsId
required
string

Logs from a Spark instance

Responses

Response samples

Content type
application/json
{
  "fieldViolations": []
}
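
Since the number of log indexes is not reported directly, one simple approach is to walk logsId upward from 0 until the request fails. A sketch with an assumed route shape:

import requests

BASE_URL = "https://spark-manager.example.com/v1"  # hypothetical
HEADERS = {"Authorization": "Bearer <accessToken>"}
spark_id = "<sparkId>"

index = 0
while True:
    # Assumed route shape: sparkId plus the log index as logsId.
    resp = requests.get(f"{BASE_URL}/spark/{spark_id}/logs/{index}", headers=HEADERS)
    if not resp.ok:  # no log at this index; stop
        break
    print(f"--- logs[{index}] ---")
    print(resp.text)
    index += 1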

getSparkStatus

Get the status of a batch job or a connect driver

Authorizations:
accessToken
path Parameters
sparkId
required
string

A Spark instance (batch job, connect driver, or cluster)

Responses

Response samples

Content type
application/json
{
  "status": {}
}

deleteSpark

Shut down and remove a Spark instance

Authorizations:
accessToken
path Parameters
sparkId
required
string

A Spark instance (batch job, connect driver, or cluster)

Responses

Response samples

Content type
application/json
{
  "fieldViolations": []
}

getLogs

Get the system logs

Authorizations:
accessToken
path Parameters
logsId
required
string

Logs from a Spark instance

Responses

Response samples

Content type
application/json
{
  "fieldViolations": []
}

updateUserSpark

Add the given user to the list of users allowed to access the given instance. You must be an admin or owner of the instance to perform this operation.

Authorizations:
accessToken
path Parameters
sparkId
required
string

A Spark instance (batch job, connect driver, or cluster)

userId
required
string

A user

Responses

Response samples

Content type
application/json
{
  "resourceName": "string",
  "description": "string"
}

deleteUserSpark

Remove the given user from the list of users allowed to access the given instance. You must be an admin or owner of the instance to perform this operation.

Authorizations:
accessToken
path Parameters
sparkId
required
string

A Spark instance (batch job, connect driver, or cluster)

userId
required
string

A user

Responses

Response samples

Content type
application/json
{
  "fieldViolations": []
}

allSparkLogs

Get all the logs of a batch job or a connect driver (driver and any executors) as a single Zip file

Authorizations:
accessToken
path Parameters
sparkId
required
string

A Spark instance (batch job, connect driver, or cluster)

Responses

Response samples

Content type
application/json
{
  "fieldViolations": []
}

listSpark

List all known instances

Authorizations:
accessToken

Responses

Response samples

Content type
application/json
[
  {}
]

Resources Service

Resource pool management

getResourcePool

Return the current resource pool set, including the total available memory and cores

Authorizations:
accessToken

Responses

Response samples

Content type
application/json
{
  "totalMemory": "string",
  "totalCores": 0,
  "resourcePools": []
}

updateResourcePool

Update the set of available resource pools

Authorizations:
accessToken
Request Body schema: application/json
required
resourcePools
required
Array of objects (ResourcePools)

The set of resource pools

Responses

Request samples

Content type
application/json
{
  "resourcePools": []
}

Response samples

Content type
application/json
{
  "resourceName": "string",
  "description": "string"
}
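
A sketch of replacing the pool set, with an assumed route and made-up pool values. The memory fields are Spark quantity strings (e.g. "4g", "512m").

import requests

BASE_URL = "https://spark-manager.example.com/v1"  # hypothetical
HEADERS = {"Authorization": "Bearer <accessToken>"}

pool = {
    "resourcePoolId": "analytics",
    "priority": 10,
    "minMemory": "8g",
    "minCores": 4,
    "maxMemory": "64g",
    "maxCores": 32,
    "defaultMemoryPerJob": "4g",
    "defaultCoresPerJob": 2,
    "defaultExecutorsPerJob": 2,
    "maxMemoryPerJob": "16g",
    "maxCoresPerJob": 8,
    "maxExecutorsPerJob": 8,
}
resp = requests.post(f"{BASE_URL}/resource-pools",  # assumed route and method
                     json={"resourcePools": [pool]}, headers=HEADERS)
resp.raise_for_status()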

listResourcePoolAssignment

List of resource pool user assignments

Authorizations:
accessToken

Responses

Response samples

Content type
application/json
[
  {}
]

updateResourcePoolAssignment

Replace the resource pool user assignments

Authorizations:
accessToken
Request Body schema: application/json
required
Array
resourcePoolId
required
string

The resource pool name

userIds
required
Array of strings

Set of users assigned to this resource pool

Responses

Request samples

Content type
application/json
[
  {}
]

Response samples

Content type
application/json
{
  "resourceName": "string",
  "description": "string"
}

Admin Service

Administration

listRoleAssignment

List user to role assignments

Authorizations:
accessToken

Responses

Response samples

Content type
application/json
[
  {}
]

updateRoleAssignment

Replace the entire set of user to role assignments

Authorizations:
accessToken
Request Body schema: application/json
required
Array
roleId
required
string

Role name

userIds
required
Array of strings

Set of users with this role

Responses

Request samples

Content type
application/json
[
  {}
]

Response samples

Content type
application/json
{
  "resourceName": "string",
  "description": "string"
}
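
Because this replaces the entire assignment set, include every role you want to keep. A hedged sketch (route and HTTP method assumed):

import requests

BASE_URL = "https://spark-manager.example.com/v1"  # hypothetical
HEADERS = {"Authorization": "Bearer <accessToken>"}

# The request body is a JSON array; a role omitted here would lose
# its existing assignments, since the whole set is replaced.
assignments = [
    {"roleId": "admin", "userIds": ["alice"]},
    {"roleId": "user", "userIds": ["bob", "carol"]},
]
resp = requests.post(f"{BASE_URL}/roles/assignments", json=assignments, headers=HEADERS)
resp.raise_for_status()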

updateRoleUser

Assign a user to a role

Authorizations:
accessToken
path Parameters
userId
required
string

A user

roleId
required
string

A role

Responses

Response samples

Content type
application/json
{
  "resourceName": "string",
  "description": "string"
}

deleteRoleUser

Remove a user from a role

Authorizations:
accessToken
path Parameters
userId
required
string

A user

roleId
required
string

A role

Responses

Response samples

Content type
application/json
{
  "fieldViolations": []
}

Uploads Service

File and secret upload management

listSecretUploads

List secret uploads

Authorizations:
accessToken

Responses

Response samples

Content type
application/json
[
  {}
]

createSecretUpload

Create a new secret upload

Authorizations:
accessToken
Request Body schema: application/json
required
comment
required
string

Comment or description. Used only for your own reference purposes.

secrets
required
object

Map of secret names to binary data. Data must be Base64 encoded.

When the uploaded secret is used in a Spark Connect session, batch job, etc., this map of secrets/values is set as environment variables. Each secret name must therefore be a valid environment variable identifier.

See Uploads for more details and limits.

Responses

Request samples

Content type
application/json
{
  "comment": "string",
  "secrets": {}
}

Response samples

Content type
application/json
{
  "uploadId": "string",
  "comment": "string",
  "secretNames": []
}

listFileUploads

List file uploads

Authorizations:
accessToken

Responses

Response samples

Content type
application/json
[
  {}
]

createFileUpload

Create a new file upload

Authorizations:
accessToken
Request Body schema: application/json
required
comment
required
string

Comment or description. Used only for your own reference purposes.

textData
required
object

Map of file names to text data

binaryData
required
object

Map of file names to binary data. Data must be Base64 encoded.

Responses

Request samples

Content type
application/json
{
  "comment": "string",
  "textData": {},
  "binaryData": {}
}

Response samples

Content type
application/json
{
  "uploadId": "string",
  "comment": "string",
  "textNames": [],
  "binaryNames": []
}

getSecretUpload

Get a secret upload

Authorizations:
accessToken
path Parameters
uploadId
required
string

A file or secret upload. See Uploads for more details and limits.

Responses

Response samples

Content type
application/json
{
  "comment": "string",
  "secretNames": []
}

updateSecretUpload

Update a secret upload

Authorizations:
accessToken
path Parameters
uploadId
required
string

A file or secret upload. See Uploads for more details and limits.

Request Body schema: application/json
required
comment
required
string

Comment or description. Used only for your own reference purposes.

secrets
required
object

Map of secret names to binary data. Data must be Base64 encoded.

When the uploaded secret is used in a Spark Connect session, batch job, etc., this map of secrets/values is set as environment variables. Each secret name must therefore be a valid environment variable identifier.

See Uploads for more details and limits.

Responses

Request samples

Content type
application/json
{
  "comment": "string",
  "secrets": {}
}

Response samples

Content type
application/json
{
  "resourceName": "string",
  "description": "string"
}

deleteSecretUpload

Delete a secret upload

Authorizations:
accessToken
path Parameters
uploadId
required
string

A file or secret upload. See Uploads for more details and limits.

Responses

Response samples

Content type
application/json
{
  "fieldViolations": []
}

getFileUpload

Get a file upload

Authorizations:
accessToken
path Parameters
uploadId
required
string

A file or secret upload. See Uploads for more details and limits.

Responses

Response samples

Content type
application/json
{
  "comment": "string",
  "textData": {},
  "binaryData": {}
}

updateFileUpload

Update a file upload

Authorizations:
accessToken
path Parameters
uploadId
required
string

A file or secret upload. See Uploads for more details and limits.

Request Body schema: application/json
required
comment
required
string

Comment or description. Used only for your own reference purposes.

textData
required
object

Map of file names to text data

binaryData
required
object

Map of file names to binary data. Data must be Base64 encoded.

Responses

Request samples

Content type
application/json
{
  "comment": "string",
  "textData": {},
  "binaryData": {}
}

Response samples

Content type
application/json
{
  "resourceName": "string",
  "description": "string"
}

deleteFileUpload

Delete a file upload

Authorizations:
accessToken
path Parameters
uploadId
required
string

A file or secret upload. See Uploads for more details and limits.

Responses

Response samples

Content type
application/json
{
  "fieldViolations": []
}

Model Definitions

RedactedSecretUpload

comment
required
string

Comment or description. Used only for your own reference purposes.

secretNames
required
Array of strings

Secret names

{
  "comment": "string",
  "secretNames": []
}

SparkBatchJob

applicationResource
required
string

The application resource to run - must be on an accessible object store

mainClass
string

The main class of the batch job

cronSchedule
string

Optional CRON schedule. If provided, runs the job on the given schedule.

See Wikipedia's CRON article for details on CRON schedules.

ttlSecondsAfterFinished
integer <int32>

Optional.

Normally, the Spark driver remains after the job completes. ttlSecondsAfterFinished specifies the number of seconds after completion before the driver becomes eligible for deletion/cleanup.

applicationArguments
Array of strings

Any application arguments

applicationName
string

The application name. If not provided, a name will be generated.

sparkProperties
object

Any Spark configuration/properties to set

jars
Array of strings

Any jars to pass in the --jars Spark argument

pythonFiles
Array of strings

Any Python files to pass in the --py-files Spark argument

files
Array of strings

Any files to pass in the --files Spark argument

archives
Array of strings

Any archives to pass in the --archives Spark argument

environmentVariables
object

Any environment variables to set

resourcePool
string

Optional - the resource pool to use (you must have permission to use it)

secretUploads
Array of strings

Optional - secret uploads

Secrets will be set as environment variables in the Spark driver and executors.

fileUploads
Array of objects

Optional - file uploads (read only)

inlineFileUploads
Array of objects

Optional - inline file uploads. See Uploads for more details and limits. (read only)

options
Array of strings
Items Value: "EncryptCommunication"

Details:

  • "EncryptCommunication": Enable encryption for communication between the driver and executors
{
  "applicationResource": "string",
  "mainClass": "string",
  "cronSchedule": "string",
  "ttlSecondsAfterFinished": 0,
  "applicationArguments": [],
  "applicationName": "string",
  "sparkProperties": {},
  "jars": [],
  "pythonFiles": [],
  "files": [],
  "archives": [],
  "environmentVariables": {},
  "resourcePool": "string",
  "secretUploads": [],
  "fileUploads": [],
  "inlineFileUploads": [],
  "options": []
}

PresignedUrl

presignedUrls
required
object

The pre-signed URLs. The key is an HTTP verb (GET, PUT, POST, DELETE); the value is the pre-signed URL.

{
  "presignedUrls": {}
}

SecretUpload

comment
required
string

Comment or description. Used only for your own reference purposes.

secrets
required
object

Map of secret names to binary data. Data must be Base64 encoded.

When the uploaded secret is used in a Spark Connect session, batch job, etc., this map of secrets/values is set as environment variables. Each secret name must therefore be a valid environment variable identifier.

See Uploads for more details and limits.

{
  "comment": "string",
  "secrets": {}
}

ResourcePoolSet

resourcePools
required
Array of objects (ResourcePools)

The set of resource pools

{
  "resourcePools": []
}

Upload

comment
required
string

Comment or description. Used only for your own reference purposes.

textData
required
object

Map of file names to text data

binaryData
required
object

Map of file names to binary data. Data must be Base64 encoded.

{
  "comment": "string",
  "textData": {},
  "binaryData": {}
}

SparkConnectDriver

applicationName
string

The application name. If not provided, a name will be generated.

sparkProperties
object

Any Spark configuration/properties to set

jars
Array of strings

Any jars to pass in the --jars Spark argument

pythonFiles
Array of strings

Any Python files to pass in the --py-files Spark argument

files
Array of strings

Any files to pass in the --files Spark argument

archives
Array of strings

Any archives to pass in the --archives Spark argument

environmentVariables
object

Any environment variables to set

resourcePool
string

Optional - the resource pool to use (you must have permission to use it)

secretUploads
Array of strings

Optional - secret uploads

Secrets will be set as environment variables in the Spark driver and executors.

fileUploads
Array of objects

Optional - file uploads (read only)

inlineFileUploads
Array of objects

Optional - inline file uploads. See Uploads for more details and limits. (read only)

options
Array of strings
Items Value: "EncryptCommunication"

Details:

  • "EncryptCommunication": Enable encryption for communication between the driver and executors
{
  "applicationName": "string",
  "sparkProperties": {},
  "jars": [],
  "pythonFiles": [],
  "files": [],
  "archives": [],
  "environmentVariables": {},
  "resourcePool": "string",
  "secretUploads": [],
  "fileUploads": [],
  "inlineFileUploads": [],
  "options": []
}

PresignedUrlRequest

accessKey
required
string

The proxy AccessKey provided by your administrator

secretKey
required
string

The proxy SecretKey provided by your administrator

region
required
string

The S3 region of the bucket

bucket
required
string

The bucket for creating the pre-signed URLs

key
required
string

The key for creating the pre-signed URLs

{
  "accessKey": "string",
  "secretKey": "string",
  "region": "string",
  "bucket": "string",
  "key": "string"
}

RoleAssignment

roleId
required
string

Role name

userIds
required
Array of strings

Set of users with this role

{
  "roleId": "string",
  "userIds": []
}

SparkBatchInstance

sparkId
required
string

The instance ID

serverSparkVersion
required
string

The Spark version used

{
  "sparkId": "string",
  "serverSparkVersion": "string"
}

ResourcePools

resourcePoolId
required
string

The name of this resource pool (must be unique)

priority
integer <int32>

The priority of this pool. Pools with higher priority numbers take precedence over pools with lower priority numbers. If not specified, the priority is 0.

maxApplications
integer <int32>

Maximum active applications for this pool

minMemory
required
string

Minimum memory (as a Spark quantity string)

minCores
required
integer <int32>

Minimum virtual cores

maxMemory
required
string

Maximum memory (as a Spark quantity string)

maxCores
required
integer <int32>

Maximum virtual cores

defaultMemoryPerJob
required
string

Default memory (as a Spark quantity string) per job submitted to the resource pool

defaultCoresPerJob
required
integer <int32>

Default virtual cores per job submitted to the resource pool

defaultExecutorsPerJob
required
integer <int32>

Default executors per job submitted to the resource pool

maxMemoryPerJob
required
string

Maximum memory (as a Spark quantity string) per job submitted to the resource pool

maxCoresPerJob
required
integer <int32>

Maximum virtual cores per job submitted to the resource pool

maxExecutorsPerJob
required
integer <int32>

Maximum executors per job submitted to the resource pool

{
  "resourcePoolId": "string",
  "priority": 0,
  "maxApplications": 0,
  "minMemory": "string",
  "minCores": 0,
  "maxMemory": "string",
  "maxCores": 0,
  "defaultMemoryPerJob": "string",
  "defaultCoresPerJob": 0,
  "defaultExecutorsPerJob": 0,
  "maxMemoryPerJob": "string",
  "maxCoresPerJob": 0,
  "maxExecutorsPerJob": 0
}

UploadInfo

uploadId
required
string

The ID of this upload

comment
required
string

Comment or description. Used only for your own reference purposes.

textNames
required
Array of strings

Names of text data in the upload

binaryNames
required
Array of strings

Names of binary data in the upload

{
  "uploadId": "string",
  "comment": "string",
  "textNames": [],
  "binaryNames": []
}

SparkInstanceInfo

sparkId
required
string

The instance ID

type
required
string

The instance type

createdBy
required
string

User that created the instance

details
required
object

Any additional details about the instance

{
  "sparkId": "string",
  "type": "string",
  "createdBy": "string",
  "details": {}
}

SecretUploadInfo

uploadId
required
string

The ID of this upload

comment
required
string

Comment or description. Used only for your own reference purposes.

secretNames
required
Array of strings

Secret names

{
  "uploadId": "string",
  "comment": "string",
  "secretNames": []
}

Status

status
required
object

The status

{
  "status": {}
}

Event

time
required
string

Time of the event

type
required
string

Event type

reason
required
string

Event reason

name
required
string

Event name

action
required
string

Event action

{
  "time": "string",
  "type": "string",
  "reason": "string",
  "name": "string",
  "action": "string"
}

ResourcePoolAssignment

resourcePoolId
required
string

The resource pool name

userIds
required
Array of strings

Set of users assigned to this resource pool

{
  "resourcePoolId": "string",
  "userIds": []
}

SparkInstance

sparkId
required
string

The instance ID

serverSparkVersion
required
string

The Spark version used

{
  "sparkId": "string",
  "serverSparkVersion": "string"
}

ResourcePoolsInfo

totalMemory
required
string

Total available memory (as a Spark quantity string)

totalCores
required
integer <int32>

Total available virtual cores

resourcePools
required
Array of objects

The set of resource pools

{
  "totalMemory": "string",
  "totalCores": 0,
  "resourcePools": []
}

Responses

Errorinfo

reason
required
string

Error reason/detail (read only)

metadata
required
object

Any additional details (read only)

{
  "reason": "string",
  "metadata": {}
}

Badrequest

fieldViolations
required
Array of objects (FieldViolations)

Field violations (read only)

{
  "fieldViolations": []
}

Resourceinfo

resourceName
required
string

Name of the resource (read only)

description
required
string

Violation description (read only)

{
  "resourceName": "string",
  "description": "string"
}

Quotafailure

fieldViolations
required
Array of objects (FieldViolations)

Field violations (read only)

{
  "fieldViolations": []
}

FieldViolations

field
required
string

Field name

description
required
string

Description of the violation

{
  "field": "string",
  "description": "string"
}