DatabricksCompute Class

Manages a Databricks compute target in Azure Machine Learning.

Azure Databricks is an Apache Spark-based environment in the Azure cloud. It can be used as a compute target with an Azure Machine Learning pipeline. For more information, see What are compute targets in Azure Machine Learning?

Class DatabricksCompute constructor.

Retrieve a cloud representation of a Compute object associated with the provided workspace. Returns an instance of a child class corresponding to the specific type of the retrieved Compute object.

Inheritance
ComputeTarget
DatabricksCompute

Constructor

DatabricksCompute(workspace, name)

Parameters

Name Description
workspace
Required
Workspace

The workspace object containing the DatabricksCompute object to retrieve.

name
Required
str

The name of the DatabricksCompute object to retrieve.

Remarks

The following example shows how to attach Azure Databricks as a compute target.


   # Replace with your account info before running.
   import os

   from azureml.core.compute import ComputeTarget, DatabricksCompute
   from azureml.exceptions import ComputeTargetException

   # ws is an existing azureml.core.Workspace object, e.g. ws = Workspace.from_config()

   db_compute_name=os.getenv("DATABRICKS_COMPUTE_NAME", "<my-databricks-compute-name>") # Databricks compute name
   db_resource_group=os.getenv("DATABRICKS_RESOURCE_GROUP", "<my-db-resource-group>") # Databricks resource group
   db_workspace_name=os.getenv("DATABRICKS_WORKSPACE_NAME", "<my-db-workspace-name>") # Databricks workspace name
   db_access_token=os.getenv("DATABRICKS_ACCESS_TOKEN", "<my-access-token>") # Databricks access token

   try:
       databricks_compute = DatabricksCompute(workspace=ws, name=db_compute_name)
       print('Compute target {} already exists'.format(db_compute_name))
   except ComputeTargetException:
       print('Compute not found, will use below parameters to attach new one')
       print('db_compute_name {}'.format(db_compute_name))
       print('db_resource_group {}'.format(db_resource_group))
       print('db_workspace_name {}'.format(db_workspace_name))
       print('db_access_token {}'.format(db_access_token))

       config = DatabricksCompute.attach_configuration(
           resource_group = db_resource_group,
           workspace_name = db_workspace_name,
           access_token= db_access_token)
       databricks_compute=ComputeTarget.attach(ws, db_compute_name, config)
       databricks_compute.wait_for_completion(True)

Full sample is available from https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-use-databricks-as-compute-target.ipynb
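
Once attached, the compute target can be referenced from an Azure Machine Learning pipeline. The following is a minimal, hedged sketch of that usage modeled on the linked sample notebook; the compute name, notebook path, experiment name, and cluster sizing are placeholder assumptions, not values from this page.

   # Minimal sketch: run a Databricks notebook as a pipeline step on the
   # attached compute target. Placeholder values are marked in angle brackets.
   from azureml.core import Experiment, Workspace
   from azureml.core.compute import DatabricksCompute
   from azureml.pipeline.core import Pipeline
   from azureml.pipeline.steps import DatabricksStep

   ws = Workspace.from_config()
   databricks_compute = DatabricksCompute(workspace=ws, name="<my-databricks-compute-name>")

   dbx_step = DatabricksStep(
       name="run_databricks_notebook",
       notebook_path="/Users/<someone>/<my-notebook>",  # placeholder Databricks notebook path
       run_name="databricks_notebook_run",
       compute_target=databricks_compute,
       num_workers=1,                                   # small new cluster created for the run
       allow_reuse=False)

   pipeline = Pipeline(workspace=ws, steps=[dbx_step])
   run = Experiment(ws, "databricks-compute-demo").submit(pipeline)
   run.wait_for_completion(show_output=True)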

Methods

attach

DEPRECATED. Use the attach_configuration method instead.

Associate an existing Databricks compute resource with the provided workspace.

attach_configuration

Create a configuration object for attaching a Databricks compute target.

delete

Delete is not supported for a DatabricksCompute object. Use detach instead.

deserialize

Convert a JSON object into a DatabricksCompute object.

detach

Detaches the Databricks object from its associated workspace.

Underlying cloud objects are not deleted, only the association is removed.

get_credentials

Retrieve the credentials for the Databricks target.

refresh_state

Perform an in-place update of the properties of the object.

This method updates the properties based on the current state of the corresponding cloud object. This is primarily used for manual polling of compute state.

serialize

Convert this DatabricksCompute object into a JSON serialized dictionary.

attach

DEPRECATED. Use the attach_configuration method instead.

Associate an existing Databricks compute resource with the provided workspace.

static attach(workspace, name, resource_id, access_token)

Parameters

Name Description
workspace
Required

The workspace object to associate the compute resource with.

name
Required
str

The name to associate with the compute resource inside the provided workspace. Does not have to match the name of the compute resource to be attached.

resource_id
Required
str

The Azure resource ID for the compute resource being attached.

access_token
Required
str

The access token for the resource being attached.

Returns

Type Description
DatabricksCompute

A DatabricksCompute object representation of the compute object.

Exceptions

Type Description

attach_configuration

Create a configuration object for attaching a Databricks compute target.

static attach_configuration(resource_group=None, workspace_name=None, resource_id=None, access_token='')

Parameters

Name Description
resource_group
str

The name of the resource group in which the Databricks workspace is located.

Default value: None
workspace_name
str

The Databricks workspace name.

Default value: None
resource_id
str

The Azure resource ID for the compute resource being attached.

Default value: None
access_token
Required
str

The access token for the compute resource being attached.

Returns

Type Description

A configuration object to be used when attaching a Compute object.

Exceptions

Type Description
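
The Databricks workspace to attach can be identified either by its resource group and workspace name, or by its full Azure resource ID. The following is a minimal sketch of both call forms; all names, the resource ID, and the access token are placeholders, and ws is assumed to be an existing Workspace object.

   # Minimal sketch: two ways to build the attach configuration.
   from azureml.core.compute import ComputeTarget, DatabricksCompute

   # Option 1: identify the Databricks workspace by resource group and name.
   config = DatabricksCompute.attach_configuration(
       resource_group="<my-db-resource-group>",
       workspace_name="<my-db-workspace-name>",
       access_token="<my-access-token>")

   # Option 2: identify it by its full Azure resource ID instead.
   config = DatabricksCompute.attach_configuration(
       resource_id="/subscriptions/<subscription-id>/resourceGroups/<my-db-resource-group>"
                   "/providers/Microsoft.Databricks/workspaces/<my-db-workspace-name>",
       access_token="<my-access-token>")

   databricks_compute = ComputeTarget.attach(ws, "<my-databricks-compute-name>", config)
   databricks_compute.wait_for_completion(show_output=True)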

delete

Delete is not supported for a DatabricksCompute object. Use detach instead.

delete()

Exceptions

Type Description

deserialize

Convert a JSON object into a DatabricksCompute object.

static deserialize(workspace, object_dict)

Parameters

Name Description
workspace
Required

The workspace object the DatabricksCompute object is associated with.

object_dict
Required

A JSON object to convert to a DatabricksCompute object.

Returns

Type Description
DatabricksCompute

The DatabricksCompute representation of the provided JSON object.

Exceptions

Type Description
ComputeTargetException

Remarks

Raises a ComputeTargetException if the provided workspace is not the workspace the Compute is associated with.

detach

Detaches the Databricks object from its associated workspace.

Underlying cloud objects are not deleted, only the association is removed.

detach()

Exceptions

Type Description
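
A minimal sketch of detaching a previously attached Databricks compute target; the compute name is a placeholder. Only the association with the Azure Machine Learning workspace is removed, the Azure Databricks workspace itself is left in place.

   # Minimal sketch: detach a previously attached Databricks compute target.
   from azureml.core import Workspace
   from azureml.core.compute import DatabricksCompute

   ws = Workspace.from_config()
   databricks_compute = DatabricksCompute(workspace=ws, name="<my-databricks-compute-name>")
   databricks_compute.detach()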

get_credentials

Retrieve the credentials for the Databricks target.

get_credentials()

Returns

Type Description
dict

The credentials for the Databricks target.

Exceptions

Type Description
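
A minimal sketch of inspecting the stored credentials; it assumes the databricks_compute object from the Remarks example. Since the exact shape of the return value is not spelled out here, the sketch only prints its type and top-level keys rather than assuming specific fields.

   # Minimal sketch: retrieve and inspect the stored credentials without
   # assuming any particular key names.
   creds = databricks_compute.get_credentials()
   print(type(creds))
   if isinstance(creds, dict):
       print(list(creds.keys()))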

refresh_state

Perform an in-place update of the properties of the object.

This method updates the properties based on the current state of the corresponding cloud object. This is primarily used for manual polling of compute state.

refresh_state()

Exceptions

Type Description
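
A minimal sketch of manual polling with refresh_state; it assumes the databricks_compute object from the Remarks example and the provisioning_state attribute inherited from ComputeTarget, and the terminal state names shown are assumptions.

   import time

   # Minimal sketch of manual polling: refresh the cached properties and check
   # the provisioning state until a terminal state is reached.
   for _ in range(20):
       databricks_compute.refresh_state()
       print(databricks_compute.provisioning_state)
       if databricks_compute.provisioning_state in ("Succeeded", "Failed"):
           break
       time.sleep(15)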

serialize

Convert this DatabricksCompute object into a JSON serialized dictionary.

serialize()

Returns

Type Description
dict

The JSON representation of this DatabricksCompute object.

Exceptions

Type Description
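
A minimal sketch of a serialize/deserialize round trip; the file name is a placeholder, and ws is assumed to be the same Workspace the compute target is associated with (per the deserialize Remarks, a different workspace raises ComputeTargetException).

   import json

   # Minimal sketch: serialize the compute target to a dictionary, persist it
   # as JSON, and rebuild an equivalent object later with deserialize().
   state = databricks_compute.serialize()
   with open("databricks_compute.json", "w") as f:
       json.dump(state, f)

   with open("databricks_compute.json") as f:
       restored = DatabricksCompute.deserialize(ws, json.load(f))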