Configure a serverless DLT pipeline

03/20/2025

This article describes configurations for serverless DLT pipelines.

Databricks recommends developing new pipelines using serverless. Some workloads might require configuring classic compute or working with the legacy Hive metastore. See Configure compute for a DLT pipeline and Use DLT pipelines with legacy Hive metastore.

Note

Serverless pipelines always use Unity Catalog. Unity Catalog for DLT is in Public Preview and has some limitations. See Use Unity Catalog with your DLT pipelines.
For serverless compute limitations, see Serverless compute limitations.
You cannot manually add compute settings in a clusters object in the JSON configuration for a serverless pipeline. Attempting to do so results in an error.
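
As a rough illustration of the last point, the following sketch checks a settings dictionary locally before it is submitted. The `serverless` and `clusters` keys mirror the pipeline JSON settings. The `validate_serverless_settings` helper and the example pipeline name are hypothetical and are not part of any Databricks library; the actual error is raised by the service, and its wording will differ.

```python
def validate_serverless_settings(settings: dict) -> None:
    """Raise ValueError if a serverless pipeline spec carries compute settings.

    Hypothetical local pre-check; the real validation happens in the service.
    """
    if settings.get("serverless") and "clusters" in settings:
        raise ValueError(
            "Serverless pipelines do not accept a 'clusters' object; "
            "remove it or set 'serverless' to False."
        )


# A spec that this check rejects: serverless plus manual compute settings.
bad_settings = {
    "name": "example-pipeline",  # hypothetical name
    "serverless": True,
    "clusters": [{"label": "default", "num_workers": 2}],
}

try:
    validate_serverless_settings(bad_settings)
    rejected = False
except ValueError:
    rejected = True
```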

If you need to use an Azure Private Link connection with your serverless DLT pipelines, contact your Databricks representative.

Requirements

Your workspace must have Unity Catalog enabled to use serverless pipelines.

Your workspace must be in a serverless-enabled region.

Recommended configuration for serverless pipelines

Important

Cluster creation permission is not required to configure serverless pipelines. By default, all workspace users can use serverless pipelines.

Serverless pipelines remove most configuration options, as Azure Databricks manages all infrastructure. To configure a serverless pipeline, do the following:

Click DLT in the sidebar.
Click Create Pipeline.
Provide a unique Pipeline name.
Check the box next to Serverless.
(Optional) Use the file picker to configure notebooks and workspace files as Source code.
- If you don’t add any source code, a new notebook is created for the pipeline in a new directory in your user directory. After you create the pipeline, a link to this notebook appears in the Source code field in the Pipeline details panel.
- Use the Add source code button to add additional source code assets.
Select a Catalog to publish data.
Select a Schema in the catalog. All streaming tables and materialized views defined in the pipeline are created in this schema.
Click Create.

These steps create a new pipeline configured to run in Triggered mode on the Current channel. This configuration is recommended for many use cases, including development and testing, and is well suited to production workloads that should run on a schedule. For details on scheduling pipelines, see DLT pipeline task for jobs.
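
For readers who manage pipelines as JSON rather than through the UI, the steps above can be sketched as a settings dictionary. This is a minimal sketch, not an authoritative schema: the field names (`name`, `serverless`, `catalog`, `schema`, `continuous`, `channel`, `libraries`) are assumptions based on the pipeline JSON settings, and some API versions may call the schema field `target`. The helper function, pipeline name, and notebook path are hypothetical.

```python
import json


def build_serverless_pipeline_settings(name, catalog, schema, notebook_paths=()):
    """Assemble a settings dict that mirrors the UI steps.

    Field names are assumptions based on the pipeline JSON settings;
    verify them against your workspace before use.
    """
    settings = {
        "name": name,          # unique pipeline name
        "serverless": True,    # the Serverless checkbox
        "catalog": catalog,    # catalog to publish data to
        "schema": schema,      # schema inside that catalog
        "continuous": False,   # Triggered mode
        "channel": "CURRENT",  # Current channel
    }
    if notebook_paths:
        # Source code is optional; each notebook becomes one library entry.
        settings["libraries"] = [{"notebook": {"path": p}} for p in notebook_paths]
    return settings


settings = build_serverless_pipeline_settings(
    name="sales-ingest",  # hypothetical name
    catalog="main",
    schema="sales",
    notebook_paths=["/Users/someone@example.com/sales_ingest"],  # hypothetical path
)
print(json.dumps(settings, indent=2))
```

Note that the dictionary contains no `clusters` object, since compute settings cannot be added manually for a serverless pipeline.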

You can also convert existing pipelines configured with Unity Catalog to use serverless. See Convert an existing pipeline to use serverless.

Other configuration considerations

The following configuration options are also available for serverless pipelines:

You might choose to use the Continuous pipeline mode when running pipelines in production. See Triggered vs. continuous pipeline mode.
Add Notifications for email updates based on success or failure conditions. See Add email notifications for pipeline events.
Use the Configuration field to set key-value pairs for the pipeline. These configurations serve two purposes:
- Set arbitrary parameters you can reference in your source code. See Use parameters with DLT pipelines.
- Configure pipeline settings and Spark configurations. See DLT properties reference.
Use the Preview channel to test your pipeline against pending DLT runtime changes and trial new features.
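
The options in this list can be sketched as additions to a pipeline settings dictionary. Treat the field names (`continuous`, `channel`, `notifications`, `configuration`) and the alert names as assumptions to check against the DLT properties reference; the `apply_optional_settings` helper, the email address, and the `mypipeline.start_date` key are hypothetical.

```python
def apply_optional_settings(settings, *, continuous=False, preview=False,
                            email_recipients=(), parameters=None):
    """Return a copy of `settings` with the optional fields filled in."""
    updated = dict(settings)
    updated["continuous"] = continuous
    updated["channel"] = "PREVIEW" if preview else "CURRENT"
    if email_recipients:
        # One notification entry covering both success and failure alerts.
        updated["notifications"] = [{
            "email_recipients": list(email_recipients),
            "alerts": ["on-update-failure", "on-update-success"],
        }]
    if parameters:
        # Configuration values are key-value pairs; store values as strings.
        updated["configuration"] = {k: str(v) for k, v in parameters.items()}
    return updated


base = {"name": "sales-ingest", "serverless": True}  # hypothetical pipeline
tuned = apply_optional_settings(
    base,
    continuous=True,
    preview=True,
    email_recipients=["data-team@example.com"],
    parameters={"mypipeline.start_date": "2025-01-01"},
)
```

In pipeline source code, a parameter set this way is typically read with `spark.conf.get("mypipeline.start_date")`.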

Serverless budget policy

Important

This feature is in Public Preview.

Serverless budget policies allow your organization to apply custom tags to serverless usage for granular billing attribution. After you select the Serverless checkbox, the Budget policy setting appears, and you can select the policy to apply to the pipeline. The tags are inherited from the serverless budget policy and can be edited only by workspace admins.

Note

After you’ve been assigned a serverless budget policy, your existing pipelines are not automatically tagged with your policy. You must manually update existing pipelines if you want to attach a policy to them.

For more on serverless budget policies, see Attribute usage with serverless budget policies.

Serverless pipeline features

In addition to simplifying configuration, serverless pipelines have the following features:

Incremental refresh for materialized views: Updates for materialized views are refreshed incrementally whenever possible. Incremental refresh produces the same results as full recomputation. The update uses a full refresh if results cannot be computed incrementally. See Incremental refresh for materialized views.

Stream pipelining: To improve utilization, throughput, and latency for streaming data workloads such as data ingestion, microbatches are pipelined. In other words, instead of running microbatches sequentially like standard Spark Structured Streaming, serverless DLT pipelines run microbatches concurrently, improving compute resource utilization. Stream pipelining is enabled by default in serverless DLT pipelines.

Vertical autoscaling: Serverless DLT pipelines add to the horizontal autoscaling provided by Databricks enhanced autoscaling by automatically allocating the most cost-efficient instance types that can run your DLT pipeline without failing because of out-of-memory errors. See What is vertical autoscaling?

What is vertical autoscaling?

Vertical autoscaling in serverless DLT pipelines automatically allocates the most cost-efficient available instance types to run your DLT pipeline updates without failing because of out-of-memory errors. It scales up when larger instance types are required to run a pipeline update and scales down when it determines that the update can run on smaller instance types. Vertical autoscaling determines whether driver nodes, worker nodes, or both should be scaled up or down.

Vertical autoscaling is used for all serverless DLT pipelines, including pipelines used by Databricks SQL materialized views and streaming tables.

Vertical autoscaling works by detecting pipeline updates that have failed because of out-of-memory errors. When it detects such a failure, it allocates larger instance types based on the out-of-memory data collected from the failed update. In production mode, a new update that uses the new compute resources is started automatically. In development mode, the new compute resources are used when you manually start a new update.

If vertical autoscaling detects that the memory of the allocated instances is consistently underutilized, it will scale down the instance types to use in the next pipeline update.
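
The scale-up and scale-down behavior described above can be pictured with a toy model. This is purely illustrative and is not the algorithm Databricks uses: the size ladder, the `next_size` function, the one-step-at-a-time rule, and the 0.4 utilization threshold are all invented for the sketch.

```python
# Hypothetical size ladder, smallest to largest; real instance types differ.
SIZES = ["small", "medium", "large", "xlarge"]


def next_size(current, *, oom_failure, recent_memory_utilization, low_water_mark=0.4):
    """Pick the instance size for the next update in this toy model.

    - Scale up one step after an out-of-memory failure.
    - Scale down one step when every recent update stayed under the threshold.
    - Otherwise keep the current size.
    """
    i = SIZES.index(current)
    if oom_failure:
        return SIZES[min(i + 1, len(SIZES) - 1)]
    consistently_low = bool(recent_memory_utilization) and all(
        u < low_water_mark for u in recent_memory_utilization
    )
    if consistently_low:
        return SIZES[max(i - 1, 0)]
    return current
```

For example, in this model `next_size("medium", oom_failure=True, recent_memory_utilization=[0.99])` moves up to the next size, while a run of low utilization readings with no failure moves down one size for the next update.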

Convert an existing pipeline to use serverless

You can convert existing pipelines configured with Unity Catalog to serverless pipelines. Complete the following steps:

Click DLT in the sidebar.
Click the name of the desired pipeline in the list.
Click Settings.
Check the box next to Serverless.
Click Save and start.

Important

When you enable serverless, any compute settings you have configured for a pipeline are removed. If you switch a pipeline back to non-serverless updates, you must add the desired compute settings back to the pipeline configuration.
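
Because compute settings are removed on conversion, it can help to keep a copy of them in case you later switch back. The following sketch models the conversion on a settings dictionary. The `convert_to_serverless` helper and the example values are hypothetical, and the `serverless` and `clusters` field names are assumed to match the pipeline JSON settings.

```python
import copy


def convert_to_serverless(settings):
    """Return (converted settings, removed compute settings).

    The input dict is left untouched so the original spec can be restored.
    """
    converted = copy.deepcopy(settings)
    removed = converted.pop("clusters", None)  # compute settings are dropped
    converted["serverless"] = True
    return converted, removed


classic = {
    "name": "sales-ingest",  # hypothetical pipeline
    "catalog": "main",
    "schema": "sales",
    "clusters": [
        {"label": "default", "autoscale": {"min_workers": 1, "max_workers": 4}}
    ],
}
serverless_settings, saved_clusters = convert_to_serverless(classic)
```

Storing `saved_clusters` alongside the pipeline definition gives you something to restore if the pipeline returns to classic compute.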

How can I find the DBU usage of a serverless pipeline?

You can find the DBU usage of serverless DLT pipelines by querying the billable usage table, which is part of the Azure Databricks system tables. See What is the DBU consumption of a serverless DLT pipeline?
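
As a hedged sketch of such a query, the helper below builds a SQL string that sums usage per day for one pipeline. It assumes the billable usage table is `system.billing.usage` and that it exposes `usage_date`, `sku_name`, `usage_quantity`, and `usage_metadata.dlt_pipeline_id`; confirm these names against the system tables reference for your workspace. The `dbu_usage_query` function itself is hypothetical.

```python
def dbu_usage_query(pipeline_id: str) -> str:
    """Build a SQL query that sums usage per day and SKU for one pipeline."""
    # Accept only plain IDs so the value is safe to place in the SQL string.
    if not pipeline_id.replace("-", "").isalnum():
        raise ValueError("pipeline_id must contain only letters, digits, and hyphens")
    return (
        "SELECT usage_date, sku_name, SUM(usage_quantity) AS dbus\n"
        "FROM system.billing.usage\n"
        f"WHERE usage_metadata.dlt_pipeline_id = '{pipeline_id}'\n"
        "GROUP BY usage_date, sku_name\n"
        "ORDER BY usage_date"
    )


query = dbu_usage_query("0123abcd-4567-89ef-0123-456789abcdef")  # hypothetical ID
print(query)
```

In a notebook, the resulting string could be passed to `spark.sql(query)`, assuming you have access to the system tables.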
