Advanced Usage

This document covers some of the SDK's more advanced features.

Authenticate using an Azure AD token

>>> from azure_databricks_sdk_python import Client, AuthMethods
>>> client = Client(auth_method=AuthMethods.AZURE_AD_USER,
                    databricks_instance="<instance>", access_token="<ad_token>")
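Rather than hard-coding the Azure AD token, it can be read from the environment; a minimal sketch (the AZURE_AD_TOKEN variable name is an assumption for illustration, not part of the SDK):

```python
import os

# Read the Azure AD token from an environment variable instead of
# hard-coding it (AZURE_AD_TOKEN is a hypothetical variable name).
ad_token = os.environ.get("AZURE_AD_TOKEN", "<ad_token>")

# client = Client(auth_method=AuthMethods.AZURE_AD_USER,
#                 databricks_instance="<instance>", access_token=ad_token)
```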

Authenticate using a service principal

>>> from azure_databricks_sdk_python import Client, AuthMethods
>>> Client(auth_method=AuthMethods.AZURE_AD_SERVICE_PRINCIPAL,
           databricks_instance="<instance>", access_token="<access_token>",
           management_token="<management_token>", resource_id="<resource_id>")

Note

You can generate a management token using curl, for example:

curl -X GET -H 'Content-Type: application/x-www-form-urlencoded' \
  -d 'grant_type=client_credentials&client_id=<client-id>&resource=https://management.core.windows.net/&client_secret=<client-secret>' \
  https://login.microsoftonline.com/<tenant-id>/oauth2/token

Note

You can generate an Azure AD access token using curl, for example:

curl -X GET -H 'Content-Type: application/x-www-form-urlencoded' \
  -d 'grant_type=client_credentials&client_id=<client-id>&resource=2ff814a6-3304-4ab8-85cb-cd0e6f879c1d&client_secret=<client-secret>' \
  https://login.microsoftonline.com/<tenant-id>/oauth2/token

Available operations on clusters

>>> client.clusters.list()
list(self)

Return information about all pinned clusters, active clusters, up to 70 of the most recently terminated all-purpose clusters in the past 30 days, and up to 30 of the most recently terminated job clusters in the past 30 days.

Returns:
[ClusterInfo]: A list of clusters.

>>> client.clusters.list_node_types()
list_node_types(self)

Return a list of supported Spark node types. These node types can be used to launch a cluster.

Returns:
[NodeType]: The list of available Spark node types.

>>> client.clusters.spark_versions()
spark_versions(self)

Return the list of available runtime versions. These versions can be used to launch a cluster.

Returns:
[SparkVersion]: All the available runtime versions.

>>> client.clusters.get(...)
get(self, cluster_id)

Retrieve the information for a cluster given its identifier. Clusters can be described while they are running or up to 30 days after they are terminated.

Args:
cluster_id (str): The cluster about which to retrieve information. This field is required.
Returns:
ClusterInfo: Metadata about a cluster.

>>> client.clusters.events(...)
events(self, req: azure_databricks_sdk_python.types.clusters.ClusterEventRequest, force: bool = False)

Retrieve a list of events about the activity of a cluster.

Args:
req (ClusterEventRequest): Cluster event request structure. This field is required.
force (bool, optional): If True, skips type validation; req is expected to be a dict and is passed as-is. Defaults to False.
Returns:
ClusterEventResponse: Cluster event response structure.
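A sketch of what an events request could look like in dict form (the field names follow the Databricks Cluster Events API; the cluster ID is made up):

```python
# Sketch: an events request expressed as a plain dict.
event_req = {
    "cluster_id": "0923-164208-abcd1234",  # hypothetical cluster ID
    "order": "DESC",                        # newest events first
    "limit": 25,                            # page size
}

# events = client.clusters.events(event_req, force=True)
```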

>>> client.clusters.pin(...)
pin(self, cluster_id)

Ensure that an all-purpose cluster configuration is retained even after a cluster has been terminated for more than 30 days. Pinning ensures that the cluster is always returned by the List API. Pinning a cluster that is already pinned has no effect.

Args:
cluster_id (str): The cluster to pin. This field is required.
Returns:
ClusterId: The cluster ID on success; an exception is raised on failure.

>>> client.clusters.unpin(...)
unpin(self, cluster_id)

Allows the cluster to eventually be removed from the list returned by the List API. Unpinning a cluster that is not pinned has no effect.

Args:
cluster_id (str): The cluster to unpin. This field is required.
Returns:
ClusterId: The cluster ID on success; an exception is raised on failure.

>>> client.clusters.delete(...)
delete(self, cluster_id)

Terminate a cluster given its ID.

Args:
cluster_id (str): The cluster to be terminated. This field is required.
Returns:
ClusterId: The cluster ID on success; an exception is raised on failure.

>>> client.clusters.permanent_delete(...)
permanent_delete(self, cluster_id)

Permanently delete a cluster.

Args:
cluster_id (str): The cluster to be permanently deleted. This field is required.
Returns:
ClusterId: The cluster ID on success; an exception is raised on failure.

>>> client.clusters.resize(...)
resize(self, req: azure_databricks_sdk_python.types.clusters.ClusterResizeRequest, force: bool = False)

Resize a cluster to have a desired number of workers. The cluster must be in the RUNNING state.

Args:
req (ClusterResizeRequest): Cluster resize request structure. This field is required.
force (bool, optional): If True, skips type validation; req is expected to be a dict and is passed as-is. Defaults to False.
Returns:
ClusterId: The cluster ID on success; an exception is raised on failure.
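A resize request in dict form might look like this (a sketch; the field names follow the Databricks Resize API, and the cluster ID is made up):

```python
# Sketch: resize a RUNNING cluster to 8 workers (dict form, hypothetical ID).
resize_req = {
    "cluster_id": "0923-164208-abcd1234",
    "num_workers": 8,
}

# client.clusters.resize(resize_req, force=True)
```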

>>> client.clusters.restart(...)
restart(self, cluster_id)

Restart a cluster given its ID. The cluster must be in the RUNNING state.

Args:
cluster_id (str): The cluster to be restarted. This field is required.
Returns:
ClusterId: The cluster ID on success; an exception is raised on failure.

>>> client.clusters.start(...)
start(self, cluster_id)

Start a terminated cluster given its ID.

Args:
cluster_id (str): The cluster to be started. This field is required.
Returns:
ClusterId: The cluster ID on success; an exception is raised on failure.

>>> client.clusters.create(...)
create(self, req: azure_databricks_sdk_python.types.clusters.ClusterAttributes, force: bool = False)

Create a new Apache Spark cluster. This method acquires new instances from the cloud provider if necessary.

Args:
req (ClusterAttributes): Common set of attributes set during cluster creation. This field is required.
force (bool, optional): If True, skips type validation; req is expected to be a dict and is passed as-is. Defaults to False.
Returns:
ClusterId: The cluster ID on success; an exception is raised on failure.
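A minimal cluster spec in dict form might look like this (a sketch; the field names follow the Databricks Clusters API, and the node type and runtime version are example values that should come from list_node_types() and spark_versions()):

```python
# Sketch: a minimal cluster creation payload (dict form, example values).
cluster_spec = {
    "cluster_name": "sdk-example",
    "spark_version": "7.3.x-scala2.12",   # pick from spark_versions()
    "node_type_id": "Standard_DS3_v2",    # pick from list_node_types()
    "num_workers": 2,
}

# cluster_id = client.clusters.create(cluster_spec, force=True)
```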

>>> client.clusters.edit(...)
edit(self, req: azure_databricks_sdk_python.types.clusters.ClusterAttributes, force: bool = False)

Edit the configuration of a cluster to match the provided attributes and size.

Args:
req (ClusterAttributes): Common set of attributes set during cluster creation. This field is required.
force (bool, optional): If True, skips type validation; req is expected to be a dict and is passed as-is. Defaults to False.
Returns:
ClusterId: The cluster ID on success; an exception is raised on failure.

Available operations on tokens

list(self)

List all the valid tokens for a user-workspace pair.

Returns:
[PublicTokenInfo]: A list of token information for a user-workspace pair.

create(self, comment: str = None, lifetime_seconds: int = 7776000)

Create and return a token.

Args:
comment (str, optional): Optional description to attach to the token. Defaults to None.
lifetime_seconds (int, optional): The lifetime of the token, in seconds. If no lifetime is specified, the token remains valid indefinitely. Defaults to 7776000 (90 days).
Returns:
dict: Contains token_value (str) and token_info (PublicTokenInfo).
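The default lifetime of 7776000 seconds corresponds to 90 days; computing it explicitly makes the intent clearer:

```python
# 90 days expressed in seconds -- matches the default of 7776000.
ninety_days = 90 * 24 * 60 * 60

# res = client.tokens.create(comment="ci token", lifetime_seconds=ninety_days)
```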

delete(self, token_id: str)

Revoke an access token.

Args:
token_id (str): The ID of the token to be revoked.
Returns:
TokenId: The token ID on success; an exception is raised on failure.

Available operations on secrets

list(self, scope: str)

List the secret keys that are stored at this scope. This is a metadata-only operation; you cannot retrieve secret data using this API. You must have READ permission to make this call.

Args:
scope (str): The name of the scope whose secrets you want to list. This field is required.
Returns:
List[SecretMetadata]: Metadata information of all secrets contained within the given scope.

put(self, scope: str, key: str, string_value: str = None, bytes_value: bytes = None)

Create or modify a secret from a Databricks-backed scope.

Args:
scope (str): The name of the scope with which the secret will be associated. This field is required.
key (str): A unique name to identify the secret. This field is required.
string_value (str, optional): If specified, the value will be stored in UTF-8 (MB4) form. Defaults to None.
bytes_value (bytes, optional): If specified, the value will be stored as bytes. Defaults to None.
Returns:
bool: True
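string_value and bytes_value are two representations of the same payload; a sketch showing the UTF-8 relationship between them (the scope and key names are made up):

```python
# The same secret as text and as bytes; string_value is stored UTF-8 encoded.
secret_text = "s3cr3t-value"
secret_bytes = secret_text.encode("utf-8")

# client.secrets.put(scope="my-scope", key="db-password", string_value=secret_text)
# client.secrets.put(scope="my-scope", key="db-password", bytes_value=secret_bytes)
```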

delete(self, scope: str, key: str)

Delete a secret from a Databricks-backed scope.

Args:
scope (str): The name of the scope that contains the secret to delete. This field is required.
key (str): Name of the secret to delete. This field is required.
Returns:
bool: True

Available operations on secret ACLs

list(self, scope: str)

List the ACLs set on the given scope.

Args:
scope (str): The name of the scope to fetch ACL information from. This field is required.
Returns:
List[AclItem]: The ACL rules applied to principals in the given scope.

get(self, scope: str, principal: str)

Describe the details about the given ACL, such as the group and permission.

Args:
scope (str): The name of the scope to fetch ACL information from. This field is required.
principal (str): The principal to fetch ACL information for. This field is required.
Returns:
AclItem: An item representing an ACL rule.

put(self, scope: str, principal: str, permission: azure_databricks_sdk_python.types.secrets.AclPermission)

Create or overwrite the ACL associated with the given principal (user or group) on the specified scope point.

Args:
scope (str): The name of the scope to apply permissions to. This field is required.
principal (str): The principal to which the permission is applied. This field is required.
permission (AclPermission): The permission level applied to the principal. This field is required.
Returns:
bool: True

delete(self, scope: str, principal: str)

Delete the given ACL on the given scope.

Args:
scope (str): The name of the scope to remove permissions from. This field is required.
principal (str): The principal to remove an existing ACL from. This field is required.
Returns:
bool: True

Available operations on secret scopes

list(self)

List all secret scopes available in the workspace.

Returns:
[SecretScope]: The available secret scopes.

create(self, scope: str, initial_manage_principal: str = None)

Create a Databricks-backed secret scope in which secrets are stored in Databricks-managed storage and encrypted with a cloud-based specific encryption key.

Args:
scope (str): Scope name requested by the user. Scope names are unique. This field is required.
initial_manage_principal (str, optional): The principal that is initially granted MANAGE permission to the created scope. Defaults to None.
Returns:
bool: True

delete(self, scope: str)

Delete a secret scope.

Args:
scope (str): Name of the scope to delete. This field is required.
Returns:
bool: True