Monitor the cost of serverless compute

This article explains how to use the billable usage system table to monitor the cost of your serverless compute usage.

You can monitor the usage of serverless compute for notebooks and jobs by querying the billable usage system table (system.billing.usage), which includes user and workload attributes related to serverless compute costs. The applicable fields include:

  • The identity_metadata column includes the run_as field, which shows the user or service principal whose credentials were used to run the workload.

  • The usage_metadata column has fields that describe the workload: job_run_id, job_name, notebook_id, and notebook_path.

  • The custom_tags column, which will include any tags inherited from budget policies. See Attribute serverless usage with budget policies.

Considerations for serverless usage records

When analyzing your serverless usage, consider the following:

  • You may see multiple records associated with a given serverless compute workload in a given hour. For example, you may see multiple records with the same job_id, job_run_id, or job_name` but with different DBU consumption values for each. The sum of these DBUs collectively represents the hourly DBU consumption for a given job run.

Note

For an update on null workload attribution, see What’s coming?.

Use budgets to monitor spending

Account admins can set up budgets to group costs and set up alerts. See Use budgets to monitor account spending.

Import a usage dashboard

Account admins can import cost management dashboards to any Unity Catalog-enabled workspace in their account. See Import a usage dashboard.

Find a job or notebook in the UI

To find a job or notebook in the UI based on a billing record, copy the usage_metadata.job_id or usage_metadata.notebook_id value from the usage record. These IDs are immutable and can be used even if the job name or notebook path changes.

To find a job in the UI based on its job_id:

  1. Copy the job_id from the usage record. For this example, assume the ID is 700809544510906.
  2. Navigate to the Workflows UI in the same Azure Databricks workspace as the job.
  3. Ensure the Only jobs owned by me filter is unchecked.
  4. Paste the ID (700809544510906) into the Filter jobs search bar.

To find a notebook in the UI based on its notebook_id, use the following instructions:

  1. Copy the notebook_id from the usage record. For this example, assume the ID is 700809544510906.
  2. Navigate to the Workspaces UI in the same Azure Databricks workspace as the notebook.
  3. Click any notebook in the list.
  4. After you’ve opened the notebook, examine the URL in the browser address bar. It should look like https://<account-console-url>/?o=<workspace ID>#notebook/<notebook ID>/command/<command ID>.
  5. In the browser address bar, replace the notebook ID with the ID you copied in the first step, then delete everything after the notebook ID. It should look like https://<account-console-url>/?o=<workspace ID>#notebook/700809544510906.
  6. After you open the notebook, you can click the Share button to view the notebook owner.

Use alerts to track serverless spend

Alerts are a powerful way to stay informed about your serverless spend. With alerts, you can receive notifications when certain conditions are met in your query results. To learn how to create alerts, see Create an alert.

You can add alerts to the following queries to monitor budgets. In each query, replace {budget} with your chosen budget.

Alert when any workspace spend exceeds a threshold in the last 30 days

You can set an alert to trigger whenever this query returns a row. Replace {budget} with your chosen budget.

SELECT
   t1.workspace_id,
   SUM(t1.usage_quantity * list_prices.pricing.default) as list_cost
FROM system.billing.usage t1
INNER JOIN system.billing.list_prices on
   t1.cloud = list_prices.cloud and
   t1.sku_name = list_prices.sku_name and
   t1.usage_start_time >= list_prices.price_start_time and
   (t1.usage_end_time <= list_prices.price_end_time or list_prices.price_end_time is null)
WHERE
   t1.sku_name LIKE '%SERVERLESS%'
   AND billing_origin_product IN ("JOBS", "INTERACTIVE")
   AND t1.usage_date >= CURRENT_DATE() - INTERVAL 30 DAYS
GROUP BY
   t1.workspace_id
HAVING
   list_cost > {budget}

Alert when a user exceeds the threshold in the last 30 days

You can set an alert to trigger whenever this query returns a row. Replace {budget} with your chosen budget.

SELECT
   t1.identity_metadata.run_as,
   SUM(t1.usage_quantity * list_prices.pricing.default) as list_cost
FROM system.billing.usage t1
INNER JOIN system.billing.list_prices on
   t1.cloud = list_prices.cloud and
   t1.sku_name = list_prices.sku_name and
   t1.usage_start_time >= list_prices.price_start_time and
   (t1.usage_end_time <= list_prices.price_end_time or list_prices.price_end_time is null)
WHERE
   t1.sku_name LIKE '%SERVERLESS%'
   AND billing_origin_product IN ("JOBS", "INTERACTIVE")
   AND t1.usage_date >= CURRENT_DATE() - INTERVAL 30 DAYS
GROUP BY
   t1.identity_metadata.run_as
HAVING
   list_cost > {budget}

Alert when a job exceeds the threshold in the last 30 days

You can set an alert to trigger whenever this query returns a row. Replace {budget} with your chosen budget.

SELECT
   t1.workspace_id,
   t1.usage_metadata.job_id,
   SUM(t1.usage_quantity * list_prices.pricing.default) as list_cost
FROM system.billing.usage t1
INNER JOIN system.billing.list_prices on
   t1.cloud = list_prices.cloud and
   t1.sku_name = list_prices.sku_name and
   t1.usage_start_time >= list_prices.price_start_time and
   (t1.usage_end_time <= list_prices.price_end_time or list_prices.price_end_time is null)
WHERE
   t1.sku_name LIKE '%SERVERLESS%'
   AND billing_origin_product IN ("JOBS")
   AND t1.usage_date >= CURRENT_DATE() - INTERVAL 30 DAYS
GROUP BY
   t1.workspace_id, t1.usage_metadata.job_id
HAVING
   list_cost > {budget}

Sample queries

Use the following queries to gain insights into serverless usage in your account:

Identify expensive serverless compute notebooks

This query returns a list of notebooks and how many DBUs each notebook consumed, in descending order by DBU consumption:

SELECT
  usage_metadata.notebook_id,
  usage_metadata. notebook_path,
  SUM(usage_quantity) as total_dbu
FROM
  system.billing.usage
WHERE
  usage_metadata.notebook_id is not null
  and billing_origin_product = 'INTERACTIVE'
  and product_features.is_serverless
  and usage_unit = 'DBU'
  and usage_date >= DATEADD(day, -30, current_date)
GROUP BY
  1,2
ORDER BY
  total_dbu DESC

Identify expensive serverless compute jobs

This query returns a list of jobs and how many DBUs each job consumed, in descending order by DBU consumption:

SELECT
  usage_metadata.job_id,
  usage_metadata.job_name,
  SUM(usage_quantity) as total_dbu
FROM
  system.billing.usage
WHERE
  usage_metadata.job_id is not null
  and usage_unit = 'DBU'
  and usage_date >= DATEADD(day, -30, current_date)
  and sku_name like '%JOBS_SERVERLESS_COMPUTE%'
GROUP BY
  1,2
ORDER BY
  total_dbu DESC

Report on DBUs consumed by a particular user

This query returns a list of notebooks and jobs that use serverless compute run by a particular user or service principal, and the number of DBUs consumed by each workload:

SELECT
  usage_metadata.job_id,
  usage_metadata.job_name,
  usage_metadata.notebook_id,
  usage_metadata. notebook_path,
  SUM(usage_quantity) as total_dbu
FROM
  system.billing.usage
WHERE
  identity_metadata.run_as = '<emailaddress@domain.com>'
  and billing_origin_product in ('JOBS','INTERACTIVE')
  and product_features.is_serverless
  and usage_unit = 'DBU'
  and usage_date >= DATEADD(day, -30, current_date)
GROUP BY
  1,2,3,4
ORDER BY
  total_dbu DESC

Report on serverless compute DBUs consumed by workloads that share a custom tag

This query returns a list of jobs that use serverless compute that share the same custom tag, and the number of DBUs consumed by each workload:

SELECT
  usage_metadata.job_id,
  usage_metadata.job_name,
  usage_metadata.notebook_id,
  usage_metadata. notebook_path,
  SUM(usage_quantity) as total_dbu
FROM
  system.billing.usage
WHERE
  custom_tags.<key> = '<value>'
  and billing_origin_product in ('JOBS','INTERACTIVE')
  and product_features.is_serverless
  and usage_unit = 'DBU'
  and usage_date >= DATEADD(day, -30, current_date)
GROUP BY
  1,2,3,4
ORDER BY
  total_dbu DESC