Billable usage system table reference
This article provides an overview of the billable usage system table, including the schema and example queries. With system tables, your account’s billable usage data is centralized and routed to all regions, so you can view your account’s global usage from whichever region your workspace is in.
For information on using this table to monitor job costs, see Monitor job costs with system tables.
For strategies on analyzing serverless usage, see Monitor the cost of serverless compute.
Table path: This system table is located at system.billing.usage
.
Billable usage table schema
The billable usage system table uses the following schema:
Column name | Data type | Description | Example |
---|---|---|---|
record_id |
string | Unique ID for this record | 11e22ba4-87b9-4cc2 -9770-d10b894b7118 |
account_id |
string | ID of the account this report was generated for | 23e22ba4-87b9-4cc2 -9770-d10b894b7118 |
workspace_id |
string | ID of the Workspace this usage was associated with | 1234567890123456 |
sku_name |
string | Name of the SKU | STANDARD_ALL_PURPOSE_COMPUTE |
cloud |
string | Cloud this usage is relevant for. Possible values are AWS , AZURE , and GCP . |
AWS , AZURE , or GCP |
usage_start_time |
timestamp | The start time relevant to this usage record. Timezone information is recorded at the end of the value with +00:00 representing UTC timezone. |
2023-01-09 10:00:00.000+00:00 |
usage_end_time |
timestamp | The end time relevant to this usage record. Timezone information is recorded at the end of the value with +00:00 representing UTC timezone. |
2023-01-09 11:00:00.000+00:00 |
usage_date |
date | Date of the usage record, this field can be used for faster aggregation by date | 2023-01-01 |
custom_tags |
map | Tags applied to this usage. Includes compute resource tags, jobs tags, workspace custom tags, and budget policy tags. | { “env”: “production” } |
usage_unit |
string | Unit this usage is measured in. Possible values include DBUs. | DBU |
usage_quantity |
decimal | Number of units consumed for this record. | 259.2958 |
usage_metadata |
struct | System-provided metadata about the usage, including IDs for compute resources and jobs (if applicable). See Analyze usage metadata. | {cluster_id: null; instance_pool_id: null; notebook_id: null; job_id: null; node_type: null} |
identity_metadata |
struct | System-provided metadata about the identities involved in the usage. See Analyze identity metadata. | {run_as: example@email.com} |
record_type |
string | Whether the record is original, a retraction, or a restatement. The value is ORIGINAL unless the record is related to a correction. See Analyze correction records. |
ORIGINAL |
ingestion_date |
date | Date the record was ingested into the usage table. |
2024-01-01 |
billing_origin_product |
string | The product that originated the usage. Some products can be billed as different SKUs. For possible values, see View information about the product associated with the usage. | JOBS |
product_features |
struct | Details about the specific product features used. | For possible values, see Product features. |
usage_type |
string | The type of usage attributed to the product or workload for billing purposes. Possible values are COMPUTE_TIME , STORAGE_SPACE , NETWORK_BYTES , API_OPERATION , TOKEN , or GPU_TIME . |
STORAGE_SPACE |
Analyze usage metadata
The values in usage_metadata
tell you about the resources involved in the usage record.
Value | Data type | Description |
---|---|---|
cluster_id |
string | ID of the cluster associated with the usage record |
warehouse_id |
string | ID of the SQL warehouse associated with the usage record |
instance_pool_id |
string | ID of the instance pool associated with the usage record |
node_type |
string | The instance type of the compute resource |
job_id |
string | ID of the job associated with the usage record. Only returns a value for serverless compute or jobs compute usage, otherwise returns null . |
job_run_id |
string | ID of the job run associated with the usage record. Only returns a value for serverless compute or jobs compute usage, otherwise returns null . |
job_name |
string | User-given name of the job associated with the usage record. Only returns a value for jobs run on serverless compute, otherwise returns null . |
notebook_id |
string | ID of the notebook associated with the usage. Only returns a value for serverless compute for notebook usage, otherwise returns null . |
notebook_path |
string | Workspace storage path of the notebook associated with the usage. Only returns a value for serverless compute for notebook usage, otherwise returns null . |
dlt_pipeline_id |
string | ID of the Delta Live Tables pipeline associated with the usage record |
dlt_update_id |
string | ID of the Delta Live Tables pipeline update associated with the usage record |
dlt_maintenance_id |
string | ID of the Delta Live Tables pipeline maintenance tasks associated with the usage record |
run_name |
string | Unique user-facing identifier of the Foundation Model Fine-tuning associated with the usage record |
endpoint_name |
string | The name of the model serving endpoint or vector search endpoint associated with the usage record |
endpoint_id |
string | ID of the model serving endpoint or vector search endpoint associated with the usage record |
central_clean_room_id |
string | ID of the central clean room associated with the usage record |
Analyze identity metadata
The identity_metadata
column can help you identify who is responsible for a serverless billing record. The column includes a run_as
value that attributes the usage to an identity. The identity recorded in identity_metadata.run_as
depends on the product associated with the usage.
Reference the following table for the identity_metadata.run_as
behavior:
Workload type | Identity of run_as |
---|---|
Jobs compute | The user or service principal defined in the run_as setting. By default, jobs run as the identity of the job owner, but admins can change this to be another user or service principal. |
Serverless compute for jobs | The user or service principal defined in the run_as setting. By default, jobs run as the identity of the job owner, but admins can change this to be another user or service principal. |
Serverless compute for notebooks | The user who ran the notebook commands (specifically, the user who created the notebook session). For shared notebooks, this includes usage by other users sharing the same notebook session. |
Delta Live Tables pipelines | The user whose permissions are used to run the Delta Live Tables pipeline. This can be changed by transferring the pipeline’s ownership. |
Foundation Model Fine-tuning | The user or service principal that initiated the fine-tuning training run. |
Analyze correction records
The billing.usage
table supports corrections. Corrections occur when any field of the usage record is incorrect and must be fixed.
When a correction happens, Azure Databricks adds two new records to the table. A retraction record negates the original incorrect record, then a restatement record includes the corrected information. Correction records are identified using the record_type
field:
RETRACTION
: Used to negate the original incorrect usage. All fields are identical to theORIGINAL
record exceptusage_quantity
, which is a negative value that cancels out the original usage quantity. For example, if the original record’s usage quantity was259.4356
, then the retraction record would have a usage quantity of-259.4356
.RESTATEMENT
: The record that includes the correct fields and usage quantity.
For example, the following query returns the correct hourly usage quantity related to a job_id
, even if corrections have been made. By aggregating the usage quantity, the retraction record negates the original record and only the restatement’s values are returned.
SELECT
usage_metadata.job_id, usage_start_time, usage_end_time,
SUM(usage_quantity) as usage_quantity
FROM system.billing.usage
GROUP BY ALL
HAVING usage_quantity != 0
Note
For corrections where the original usage record should not have been written, a correction may only add a retraction record and no restatement record.
View information about the product associated with the usage
Some Databricks products are billed under the same shared SKU. To help you differentiate usage, the billing_origin_product
and product_features
columns provide more insight into the specific product and features associated with the usage.
The billing_origin_product
column shows the Databricks product associated with the usage record. The values include:
JOBS
DLT
SQL
ALL_PURPOSE
MODEL_SERVING
INTERACTIVE
DEFAULT_STORAGE
VECTOR_SEARCH
LAKEHOUSE_MONITORING
PREDICTIVE_OPTIMIZATION
ONLINE_TABLES
FOUNDATION_MODEL_TRAINING
The product_features
column is an object containing information about the specific product features used and includes the following key/value pairs:
jobs_tier
: values includeLIGHT
,CLASSIC
, ornull
sql_tier
: values includeCLASSIC
,PRO
, ornull
dlt_tier
: values includeCORE
,PRO
,ADVANCED
, ornull
is_serverless
: values includetrue
orfalse
, ornull
is_photon
: values includetrue
orfalse
, ornull
serving_type
: values includeMODEL
,GPU_MODEL
,FOUNDATION_MODEL
,FEATURE
, ornull
Sample queries
You can use the following sample queries to answer common questions about billable usage:
- What is the daily trend in DBU consumption?
- How many DBUs of each product have been used throughout this month?
- Which jobs consumed the most DBUs?
- How much usage can be attributed to resources with a certain tag?
- Show me the SKUs where usage is growing
- What is the usage trend of All Purpose Compute (Photon)?
- What is the DBU consumption of a materialized view or streaming table?
- What is the DBU consumption of a serverless DLT pipeline?
What is the daily trend in DBU consumption?
SELECT
usage_date as `Date`, sum(usage_quantity) as `DBUs Consumed`
FROM
system.billing.usage
WHERE
sku_name = "STANDARD_ALL_PURPOSE_COMPUTE"
GROUP BY
usage_date
ORDER BY
usage_date ASC
How many DBUs of each product have been used throughout this month?
SELECT
billing_origin_product,
usage_date,
sum(usage_quantity) as usage_quantity
FROM system.billing.usage
WHERE
month(usage_date) = month(NOW())
AND year(usage_date) = year(NOW())
GROUP BY billing_origin_product, usage_date
Which jobs consumed the most DBUs?
SELECT
usage_metadata.job_id as `Job ID`, sum(usage_quantity) as `Usage`
FROM
system.billing.usage
WHERE
usage_metadata.job_id IS NOT NULL
GROUP BY
`Job ID`
ORDER BY
`DBUs` DESC
How much usage can be attributed to resources with a specific tag?
You can break down costs in various ways. This example shows you how to break down costs by a custom tag. Be sure to replace the custom tag’s key and value in the query.
SELECT
sku_name, usage_unit, SUM(usage_quantity) as `Usage`
FROM
system.billing.usage
WHERE
custom_tags.{{key}} = "{{value}}"
GROUP BY 1, 2
Show me the products where usage is growing
SELECT
after.billing_origin_product, before_dbus, after_dbus, ((after_dbus - before_dbus)/before_dbus * 100) AS growth_rate
FROM
(SELECT
billing_origin_product, sum(usage_quantity) as before_dbus
FROM
system.billing.usage
WHERE
usage_date BETWEEN "2023-04-01" and "2023-04-30"
GROUP BY
billing_origin_product
) as before
JOIN
(SELECT
billing_origin_product, sum(usage_quantity) as after_dbus
FROM
system.billing.usage
WHERE
usage_date
BETWEEN
"2023-05-01" and "2023-05-30"
GROUP BY
billing_origin_product
) as after
WHERE
before.billing_origin_product = after.billing_origin_product
SORT BY
growth_rate DESC
What is the usage trend of All Purpose Compute (Photon)?
SELECT
sku_name,
usage_date,
sum(usage_quantity) as `DBUs consumed`
FROM
system.billing.usage
WHERE
year(usage_date) = year(CURRENT_DATE)
AND
sku_name = "ENTERPRISE_ALL_PURPOSE_COMPUTE_(PHOTON)"
AND
usage_date > "2023-04-15"
GROUP BY
sku_name, usage_date
What is the DBU consumption of a materialized view or streaming table?
To get the DBU usage and SKU for a specific materialized view or streaming table, submit a query to the billable usage system table for records where usage_metadata.dlt_pipeline_id
is set to the ID of the pipeline associated with the materialized view or streaming table. You can find the pipeline ID in the Details tab in Catalog Explorer when viewing the materialized view or streaming table. To optionally limit the consumption by date, specify a start date, end date, or a date range. The following query retrieves the DBU usage for the pipeline with ID 00732f83-cd59-4c76-ac0d-57958532ab5b
and a usage start date of 2023-05-30
:
SELECT
sku_name,
usage_date,
SUM(usage_quantity) AS `DBUs`
FROM
system.billing.usage
WHERE
usage_metadata.dlt_pipeline_id = "00732f83-cd59-4c76-ac0d-57958532ab5b"
AND
usage_start_time > "2023-05-30"
GROUP BY
ALL
What is the DBU consumption of a serverless DLT pipeline?
To get the DBU usage and SKU for a serverless DLT pipeline, submit a query to the billable usage system table for records where usage_metadata.dlt_pipeline_id
is set to the ID of the pipeline. You can find the pipeline ID on the Pipeline Details tab when viewing a pipeline in the Delta Live Tables UI. To optionally limit the consumption by date, specify a start date, end date, or a date range. The following query retrieves the DBU usage from December 2023 for the pipeline with ID 00732f83-cd59-4c76-ac0d-57958532ab5b
.
SELECT
sku_name,
usage_date,
SUM(usage_quantity) AS `DBUs`
FROM
system.billing.usage
WHERE
usage_metadata.dlt_pipeline_id = "00732f83-cd59-4c76-ac0d-57958532ab5b"
AND
usage_start_time >= "2023-12-01"
AND
usage_end_time < "2024-01-01"
GROUP BY
ALL