Redigera

Dela via


Estimate table size

Applies to: ✅ Microsoft FabricAzure Data Explorer

Understanding the size of a table can be helpful for efficient resource management and optimized query performance. In this article, you'll learn different methods to estimate table sizes and how to use them effectively.

Original size of ingested data

Use the .show table details to estimate the original data size of a table. For an example, see Use .show table details.

This command provides an estimation of the uncompressed size of data ingested into your table based on the assumption that the data was transferred in CSV format. The estimation is based on approximate lengths of numeric values, such as integers, longs, datetimes, and guids, by considering their string representations.

Example use case: Track the size of incoming data over time to make informed decisions about capacity planning.

Table size in terms of access bytes

Use the estimate_data_size() along with the sum() aggregation function to estimate table size based on data types and their respective byte sizes. For an example, see Use estimate_data_size().

This method provides a more precise estimation by considering the byte sizes of numeric values without formatting them as strings. For example, integer values require 4 bytes whereas long and datetime values require 8 bytes. By using this approach, you can accurately estimate the data size that would fit in memory.

Example use case: Determine the cost of a query in terms of bytes to be scanned.

Combined size of multiple tables

You can use the union operator along with the estimate_data_size() and sum() functions to estimate the combined size of multiple tables in terms of access bytes. For an example, see Use union with estimate_data_size().

Example use case: Assess the memory requirements for consolidating data from multiple tables into a single dataset.

Note

This approach may inflate the estimated data size due to empty columns, as union combines all columns from the specified tables and estimate_data_size() takes into account empty columns when calculating the data size.

Examples

Use .show table details

The following query estimates the original data size of the StormEvents table.

.show table StormEvents details
| project TotalOriginalSize

Output

TotalOriginalSize
60192011

Tip

To format the bytes result to MB, GB, or another unit, use format_bytes().

Use estimate_data_size()

The following query estimates the original data size of the StormEvents table in bytes.

::: moniker-end

StormEvents
| extend sizeEstimateOfColumn = estimate_data_size(*)
| summarize totalSize=sum(sizeEstimateOfColumn)

Output

totalSize
58608932

Note

The output is smaller even though the calculation is done over the same table. This is because this method provides a more precise estimation by considering the byte sizes of numeric values without formatting them as strings.

Use union with estimate_data_size()

The following query estimates the data size based for all tables in the Samples database.

union withsource=_TableName *
| extend sizeEstimateOfColumn = estimate_data_size(*)
| summarize totalSize=sum(sizeEstimateOfColumn)
| extend sizeGB = format_bytes(totalSize,2,"GB")
totalSize sizeGB
1761782453926 1640.79 GB