Connect to serverless compute
This article describes the serverless compute offerings available on Azure Databricks. Serverless compute lets you quickly connect to on-demand computing resources.
The articles in this section focus on serverless compute for notebooks, jobs, and Delta Live Tables. For information on serverless SQL warehouses, see What are Serverless SQL warehouses? For information on Model Serving, see Deploy models using Mosaic AI Model Serving.
For information on serverless compute plane architecture, see Serverless compute plane.
What is serverless compute?
Serverless compute allows you to run workloads without provisioning a cluster. Instead, Databricks automatically allocates and manages the necessary compute resources. This enables you to focus on writing code and analyzing data, without worrying about cluster management or resource utilization.
Serverless compute offers the following benefits:
- Cloud resources are managed by Azure Databricks, reducing management overhead and providing instant compute to enhance user productivity.
- Rapid start-up and scaling times for serverless compute resources minimize idle time and help ensure that you pay only for the compute you use.
- Because capacity handling, security, patching, and upgrades are managed automatically, you can worry less about reliability, security policies, and capacity shortages.
What types of serverless compute are available on Azure Databricks?
Azure Databricks currently offers the following types of serverless compute:
- Serverless compute for notebooks: On-demand, scalable compute used to execute SQL and Python code in notebooks.
- Serverless compute for jobs: On-demand, scalable compute used to run your Databricks jobs without configuring and deploying infrastructure.
- Serverless SQL warehouses: On-demand elastic compute used to run SQL commands on data objects in the SQL editor or interactive notebooks. You can create SQL warehouses using the UI, CLI, or REST API.
- Serverless DLT pipelines: Optimized and scalable compute for your Delta Live Tables pipeline updates.
- Mosaic AI Model Serving: Highly available and low-latency service for deploying AI models.
- Mosaic AI Model Training - forecasting: Use AutoML to choose the best forecasting algorithm and hyperparameters based on a user-provided dataset.
Enable serverless compute
To access serverless compute for notebooks, jobs, and Delta Live Tables, an account admin might need to enable the feature. See Enable serverless compute.
To access serverless SQL warehouses, see Enable serverless SQL warehouses.
Serverless compute limitations
For a list of limitations, see Serverless compute limitations.
Frequently asked questions (FAQ)
- How are releases rolled out?
- How do I determine which serverless version I am running?
- How do I estimate costs for serverless?
- How do I analyze DBU usage for a specific workload?
- Is there a delay between when you run a job or query and the appearance of charges on the billable usage system table?
- I haven’t enabled serverless compute for jobs and notebooks. Why do I see billing records for serverless jobs?
- Does serverless compute support private repos?
- How do I install libraries for my job tasks?
- Can I connect to custom data sources?
- How does the serverless compute plane networking work?
- Can I configure serverless compute for jobs with Databricks Asset Bundles?
How are releases rolled out?
Serverless compute is a versionless product, which means that Databricks automatically upgrades the serverless compute runtime to support enhancements and upgrades to the platform. All users get the same updates, rolled out over a short period of time.
How do I determine which serverless version I am running?
Serverless workloads always run on the latest runtime version. See Release notes for the most recent version.
How do I estimate costs for serverless?
Databricks recommends running and benchmarking a representative or specific workload and then analyzing the billing system table. See Billable usage system table reference.
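The benchmark-then-extrapolate approach above can be sketched as simple arithmetic. This is a minimal illustration, not an official pricing formula: the function name `estimate_monthly_cost`, the sample DBU figure, and the price per DBU are all hypothetical placeholders. Take the DBU figure from your own benchmark run in the billing system table and the price from your actual rate.

```python
def estimate_monthly_cost(
    benchmark_dbus: float, runs_per_month: int, price_per_dbu: float
) -> float:
    """Extrapolate a monthly cost from one benchmarked run.

    benchmark_dbus: DBUs consumed by a single representative run
                    (read this from the billing system table).
    runs_per_month: how many times the workload is expected to run.
    price_per_dbu:  your rate per DBU (hypothetical value below).
    """
    if benchmark_dbus < 0 or runs_per_month < 0 or price_per_dbu < 0:
        raise ValueError("all inputs must be non-negative")
    return benchmark_dbus * runs_per_month * price_per_dbu


# Hypothetical numbers, for illustration only.
estimate = estimate_monthly_cost(
    benchmark_dbus=2.5, runs_per_month=30, price_per_dbu=0.5
)
print(f"Estimated monthly cost: {estimate:.2f}")
```

The estimate is only as good as the benchmark: it assumes every run consumes roughly the same number of DBUs as the measured one.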
How do I analyze DBU usage for a specific workload?
To see the cost for a specific workload, query the system.billing.usage system table. See Monitor the cost of serverless compute for sample queries and to download our cost observability dashboard.
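As a rough sketch of such a query, the helper below builds a SQL statement that sums daily usage for a single job. The helper name `build_job_usage_query` and the sample job ID are hypothetical. The column names (`usage_date`, `sku_name`, `usage_unit`, `usage_quantity`, `usage_metadata.job_id`) reflect the billable usage table schema as understood at the time of writing, so verify them against the system table reference before relying on the results.

```python
from datetime import date


def build_job_usage_query(job_id: str, start: date, end: date) -> str:
    """Return a SQL string that sums daily usage for one job."""
    # The job ID is interpolated into the SQL text, so accept digits only.
    if not job_id.isdigit():
        raise ValueError("job_id must contain digits only")
    if start > end:
        raise ValueError("start must not be after end")
    return f"""
SELECT usage_date, sku_name, usage_unit, SUM(usage_quantity) AS total_usage
FROM system.billing.usage
WHERE usage_metadata.job_id = '{job_id}'
  AND usage_date BETWEEN DATE'{start.isoformat()}' AND DATE'{end.isoformat()}'
GROUP BY usage_date, sku_name, usage_unit
ORDER BY usage_date
""".strip()


# Hypothetical job ID and date range, for illustration only.
query = build_job_usage_query("1234567890", date(2024, 6, 1), date(2024, 6, 30))
print(query)
```

In a notebook, the resulting string could be passed to `spark.sql(query)` to get a DataFrame of daily totals.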
Is there a delay between when you run a job or query and the appearance of charges on the billable usage system table?
Yes, there could be up to a 24-hour delay between when you run a workload and its usage being reflected in the billable usage system table.
I haven’t enabled serverless compute for jobs and notebooks. Why do I see billing records for serverless jobs?
Lakehouse Monitoring and predictive optimization are also billed under the serverless jobs SKU.
Serverless compute does not have to be enabled to use these two features.
Does serverless compute support private repos?
Yes. Repositories can be private or require authentication. For security reasons, a pre-signed URL is required when accessing authenticated repositories.
How do I install libraries for my job tasks?
Databricks recommends using environments to install and manage libraries for your jobs. See Configure environments and dependencies for non-notebook tasks.
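To make the idea of an environment concrete, the sketch below assembles job settings in which a Python script task refers to a named environment that lists its library dependencies, with no cluster configuration. The helper name, file path, and package pin are hypothetical. The field names (`environments`, `environment_key`, `spec`, `client`, `dependencies`) follow the Jobs API shape as understood at the time of writing, so treat them as assumptions and confirm them in the linked article.

```python
import json


def serverless_job_settings(name: str, python_file: str, dependencies: list[str]) -> dict:
    """Build job settings for a Python script task on serverless compute.

    No cluster is specified; the task points at an environment
    that declares the libraries to install.
    """
    env_key = "default"  # hypothetical key linking the task to its environment
    return {
        "name": name,
        "tasks": [
            {
                "task_key": "main",
                "spark_python_task": {"python_file": python_file},
                "environment_key": env_key,
            }
        ],
        "environments": [
            {
                "environment_key": env_key,
                # Assumed spec shape: a client version plus pip-style requirements.
                "spec": {"client": "1", "dependencies": list(dependencies)},
            }
        ],
    }


# Hypothetical job name, script path, and dependency.
settings = serverless_job_settings(
    "nightly-report", "/Workspace/Shared/report.py", ["pandas==2.2.0"]
)
print(json.dumps(settings, indent=2))
```

Keeping dependencies in the environment rather than in the script means every run of the task resolves the same pinned libraries.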
Can I connect to custom data sources?
No, only sources that use Lakehouse Federation are supported. See Supported data sources.
How does the serverless compute plane networking work?
Serverless compute resources run in the serverless compute plane, which is managed by Azure Databricks. For more details on the network and architecture, see Serverless compute plane networking.
Can I configure serverless compute for jobs with Databricks Asset Bundles?
Yes, Databricks Asset Bundles can be used to configure jobs that use serverless compute. See Configure a job that uses serverless compute.