Databricks Asset Bundles resources
Databricks Asset Bundles lets you declare the Azure Databricks resources that a bundle uses in the resources mapping of the bundle configuration. See resources mapping.
This article outlines supported resource types for bundles and provides details and examples for each supported type.
Supported resources
The following table lists supported resource types for bundles. Some resources can be created by defining them in a bundle and deploying the bundle, and some resources only support referencing an existing resource to include in the bundle.
Resources are defined using the corresponding Databricks REST API object’s create operation request payload, where the object’s supported fields, expressed as YAML, are the resource’s supported properties. Links to documentation for each resource’s corresponding payloads are listed in the table.
Tip
The databricks bundle validate command returns warnings if unknown resource properties are found in bundle configuration files.
Resource | Create support | Resource properties |
---|---|---|
cluster | ✓ | Cluster properties: POST /api/2.1/clusters/create |
dashboard | | Dashboard properties: POST /api/2.0/lakeview/dashboards |
experiment | ✓ | Experiment properties: POST /api/2.0/mlflow/experiments/create |
job | ✓ | Job properties: POST /api/2.1/jobs/create |
model (legacy) | ✓ | Model properties: POST /api/2.0/mlflow/registered-models/create |
model_serving_endpoint | ✓ | Model serving endpoint properties: POST /api/2.0/serving-endpoints |
pipeline | ✓ | Pipeline properties: POST /api/2.0/pipelines |
quality_monitor | ✓ | Quality monitor properties: POST /api/2.1/unity-catalog/tables/{table_name}/monitor |
registered_model (Unity Catalog) | ✓ | Unity Catalog model properties: POST /api/2.1/unity-catalog/models |
schema (Unity Catalog) | ✓ | Unity Catalog schema properties: POST /api/2.1/unity-catalog/schemas |
volume (Unity Catalog) | ✓ | Unity Catalog volume properties: POST /api/2.1/unity-catalog/volumes |
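To illustrate how a REST API payload maps to a bundle resource, the following sketch shows how a few fields from the Jobs create request (POST /api/2.1/jobs/create) might appear as YAML under the jobs mapping. The resource key my_sample_job, the job name, the tag value, and the notebook path are hypothetical. The fields name, max_concurrent_runs, tags, and tasks belong to the Jobs create payload.

```yaml
# Hypothetical job used only to illustrate the payload-to-YAML mapping.
resources:
  jobs:
    my_sample_job:              # resource key (hypothetical), used to refer to this job within the bundle
      name: my-sample-job       # "name" in the create payload
      max_concurrent_runs: 1    # "max_concurrent_runs" in the create payload
      tags:                     # "tags" in the create payload
        team: data-engineering
      tasks:                    # "tasks" in the create payload
        - task_key: sample_task
          notebook_task:
            notebook_path: ./src/sample_notebook.py
```

Each YAML key under the resource key corresponds to a field with the same name in the JSON request body.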
cluster
The cluster resource allows you to create all-purpose clusters. The following example creates a cluster named my_cluster and sets that as the cluster to use to run the notebook in my_job:
bundle:
  name: clusters
resources:
  clusters:
    my_cluster:
      num_workers: 2
      node_type_id: "i3.xlarge"
      autoscale:
        min_workers: 2
        max_workers: 7
      spark_version: "13.3.x-scala2.12"
      spark_conf:
        "spark.executor.memory": "2g"
  jobs:
    my_job:
      tasks:
        - task_key: test_task
          # Run the task on the cluster defined above
          existing_cluster_id: ${resources.clusters.my_cluster.id}
          notebook_task:
            notebook_path: "./src/my_notebook.py"
dashboard
The dashboard resource allows you to manage AI/BI dashboards in a bundle. For information about AI/BI dashboards, see Dashboards.
The following example includes and deploys the sample NYC Taxi Trip Analysis dashboard to the Databricks workspace.
resources:
  dashboards:
    nyc_taxi_trip_analysis:
      display_name: "NYC Taxi Trip Analysis"
      file_path: ../src/nyc_taxi_trip_analysis.lvdash.json
      warehouse_id: ${var.warehouse_id}
If you use the UI to modify the dashboard, the changes are not applied to the dashboard JSON file in the local bundle unless you explicitly update it using bundle generate. You can use the --watch option to continuously poll and retrieve changes to the dashboard. See Generate a bundle configuration file.
In addition, if you attempt to deploy a bundle that contains a dashboard JSON file that is different from the one in the remote workspace, an error occurs. To force the deployment and overwrite the dashboard in the remote workspace with the local one, use the --force option. See Deploy a bundle.
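The dashboard example above references ${var.warehouse_id} without showing where that value comes from. A minimal sketch, assuming the variable is declared in the top-level variables mapping of the same bundle; the description text and the default value are placeholders, not real values:

```yaml
variables:
  warehouse_id:
    description: ID of the SQL warehouse that runs the dashboard queries
    default: "<your-warehouse-id>"   # placeholder; replace with a real warehouse ID
```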
experiment
The experiment resource allows you to define MLflow experiments in a bundle. For information about MLflow experiments, see MLflow experiments.
The following example defines an experiment that all users can view:
resources:
  experiments:
    experiment:
      name: my_ml_experiment
      permissions:
        - level: CAN_READ
          group_name: users
      description: MLflow experiment used to track runs
job
The job resource allows you to define jobs and their corresponding tasks in your bundle. For information about jobs, see Schedule and orchestrate workflows. For a tutorial that uses a Databricks Asset Bundles template to create a job, see Develop a job on Azure Databricks using Databricks Asset Bundles.
The following example defines a job with the resource key hello-job with one notebook task:
resources:
  jobs:
    hello-job:
      name: hello-job
      tasks:
        - task_key: hello-task
          notebook_task:
            notebook_path: ./hello.py
For information about defining job tasks and overriding job settings, see Add tasks to jobs in Databricks Asset Bundles, Override job tasks settings in Databricks Asset Bundles, and Override cluster settings in Databricks Asset Bundles.
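As a sketch of what overriding job settings can look like, the following configuration assumes a hypothetical deployment target named dev and overrides the name of hello-job for that target only. Settings under targets are merged with the top-level definition that has the same resource key. The target name and the overridden value are illustrative assumptions.

```yaml
resources:
  jobs:
    hello-job:
      name: hello-job
      tasks:
        - task_key: hello-task
          notebook_task:
            notebook_path: ./hello.py

targets:
  dev:                          # hypothetical target name
    resources:
      jobs:
        hello-job:              # must match the top-level resource key
          name: hello-job-dev   # replaces the name for the dev target only
```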
model_serving_endpoint
The model_serving_endpoint resource allows you to define model serving endpoints. See Manage model serving endpoints.
The following example defines a Unity Catalog model serving endpoint:
resources:
  model_serving_endpoints:
    uc_model_serving_endpoint:
      name: "uc-model-endpoint"
      config:
        served_entities:
          - entity_name: "myCatalog.mySchema.my-ads-model"
            entity_version: "10"
            workload_size: "Small"
            scale_to_zero_enabled: "true"
        traffic_config:
          routes:
            - served_model_name: "my-ads-model-10"
              traffic_percentage: "100"
      tags:
        - key: "team"
          value: "data science"
quality_monitor (Unity Catalog)
The quality_monitor resource allows you to define a Unity Catalog table monitor. For information about monitors, see Monitor model quality and endpoint health.
The following example defines a quality monitor:
resources:
  quality_monitors:
    my_quality_monitor:
      table_name: dev.mlops_schema.predictions
      output_schema_name: ${bundle.target}.mlops_schema
      assets_dir: /Users/${workspace.current_user.userName}/databricks_lakehouse_monitoring
      inference_log:
        granularities: [1 day]
        model_id_col: model_id
        prediction_col: prediction
        label_col: price
        problem_type: PROBLEM_TYPE_REGRESSION
        timestamp_col: timestamp
      schedule:
        quartz_cron_expression: 0 0 8 * * ? # Run every day at 8 AM
        timezone_id: UTC
registered_model (Unity Catalog)
The registered_model resource allows you to define models in Unity Catalog. For information about Unity Catalog registered models, see Manage model lifecycle in Unity Catalog.
The following example defines a registered model in Unity Catalog:
resources:
  registered_models:
    model:
      name: my_model
      catalog_name: ${bundle.target}
      schema_name: mlops_schema
      comment: Registered model in Unity Catalog for ${bundle.target} deployment target
      grants:
        - privileges:
            - EXECUTE
          principal: account users
pipeline
The pipeline resource allows you to create Delta Live Tables pipelines. For information about pipelines, see What is Delta Live Tables?. For a tutorial that uses the Databricks Asset Bundles template to create a pipeline, see Develop Delta Live Tables pipelines with Databricks Asset Bundles.
The following example defines a pipeline with the resource key hello-pipeline:
resources:
  pipelines:
    hello-pipeline:
      name: hello-pipeline
      clusters:
        - label: default
          num_workers: 1
      development: true
      continuous: false
      channel: CURRENT
      edition: CORE
      photon: false
      libraries:
        - notebook:
            path: ./pipeline.py
schema (Unity Catalog)
The schema resource type allows you to define Unity Catalog schemas for tables and other assets in your workflows and pipelines created as part of a bundle. Unlike other resource types, a schema has the following limitations:
- The owner of a schema resource is always the deployment user and cannot be changed. If run_as is specified in the bundle, it is ignored by operations on the schema.
- Only fields supported by the corresponding Schemas object create API are available for the schema resource. For example, enable_predictive_optimization is not supported because it is only available on the update API.
The following example defines a pipeline with the resource key my_pipeline that creates a Unity Catalog schema with the key my_schema as the target:
resources:
  pipelines:
    my_pipeline:
      name: test-pipeline-{{.unique_id}}
      libraries:
        - notebook:
            path: ./nb.sql
      development: true
      catalog: main
      target: ${resources.schemas.my_schema.id}
  schemas:
    my_schema:
      name: test-schema-{{.unique_id}}
      catalog_name: main
      comment: This schema was created by DABs.
A top-level grants mapping is not supported by Databricks Asset Bundles, so if you want to set grants for a schema, define the grants for the schema within the schemas mapping. For more information about grants, see Show, grant, and revoke privileges.
The following example defines a Unity Catalog schema with grants:
resources:
  schemas:
    my_schema:
      name: test-schema
      grants:
        - principal: users
          privileges:
            - CAN_MANAGE
        - principal: my_team
          privileges:
            - CAN_READ
      catalog_name: main
volume (Unity Catalog)
The volume resource type allows you to define and create Unity Catalog volumes as part of a bundle. When deploying a bundle with a volume defined, note that:
- A volume cannot be referenced in the artifact_path for the bundle until it exists in the workspace. Therefore, if you want to use Databricks Asset Bundles to create the volume, you must first define the volume in the bundle, deploy it to create the volume, and then reference it in the artifact_path in subsequent deployments.
- Volumes in the bundle are not prepended with the dev_${workspace.current_user.short_name} prefix when the deployment target has mode: development configured. However, you can manually configure this prefix. See Custom presets.
The following example creates a Unity Catalog volume with the key my_volume:
resources:
  volumes:
    my_volume:
      catalog_name: main
      name: my_volume
      schema_name: my_schema
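The two-step deployment described earlier in this section can be sketched as follows, assuming the volume main.my_schema.my_volume and a hypothetical artifacts subdirectory. In the first deployment, the bundle only defines the volume. After that deployment succeeds, the workspace mapping (shown here commented out) is added so that later deployments can use the volume for artifacts:

```yaml
# Step 1: deploy with only the volume defined.
resources:
  volumes:
    my_volume:
      catalog_name: main
      name: my_volume
      schema_name: my_schema

# Step 2: after the volume exists, uncomment this mapping and deploy again.
# The "artifacts" subdirectory is a hypothetical choice.
# workspace:
#   artifact_path: /Volumes/main/my_schema/my_volume/artifacts
```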