The Cost of a Row
[This article was contributed by the SQL Azure team.]
In a previous blog post I covered how to calculate the cost of a covered index, allowing you to evaluate if the performance of the covered index was worth its monthly cost. In this blog post I am going to use Transact-SQL to calculate the average cost of a row in a table.
Imagine that you are running a web site that reviews restaurants and you sell advertising space on the web page to generate revenue. The advertising engine can output how much you make in revenue each month for each review. Wouldn’t it be nice to figure out how much each review is costing you in storage? With SQL Azure we can do just that.
SQL Azure Pricing
Currently, SQL Azure charges $9.99 per gigabyte of data per month (official pricing can be found here). That is the cost for the range in which the actually size of data you want to store falls, not the cap size of the database. In other words, if you are storing just a few megabytes on a 1 GB Web edition database, the cost is $9.99 a month. The top side of-the ranges are: 1, 5, 10, 20, 30, 40, and 50 gigabytes – the closer you are to those sizes that lower the cost per byte to store your data. Here is a Transact-SQL statement that will calculate the cost per byte.
DECLARE @SizeInBytes bigint
SELECT @SizeInBytes =
(SUM(reserved_page_count) * 8192)
FROM sys.dm_db_partition_stats
DECLARE @Edition sql_variant
SELECT @Edition =DATABASEPROPERTYEX ( DB_Name() , 'Edition' )
SELECT (CASE
WHEN @SizeInBytes/1073741824.0 < 1 THEN (CASE @Edition WHEN 'Web' THEN 9.99 ELSE 99.99 END)
WHEN @SizeInBytes/1073741824.0 < 5 THEN (CASE @Edition WHEN 'Web' THEN 49.95 ELSE 99.99 END)
WHEN @SizeInBytes/1073741824.0 < 10 THEN 99.99
WHEN @SizeInBytes/1073741824.0 < 20 THEN 199.98
WHEN @SizeInBytes/1073741824.0 < 30 THEN 299.97
WHEN @SizeInBytes/1073741824.0 < 40 THEN 399.96
WHEN @SizeInBytes/1073741824.0 < 50 THEN 499.95
END) / @SizeInBytes
Figuring Out the Row Size
If you are using data types in your rows like: int, bigint, float, etc… these all have fixed sizes and the amount of storage they take up can easily be calculated. However, if you are using variable size fields like varchar(max) there will not be a single row size for all the rows in your table, each row will vary based on what is being stored. For this reason, we are going to take an average row size for the table to compute the row cost.
Along with the storage of the clustered index (the main storage for the table), you need to include the cost of the non-clustered indexes on that table. These indexes rearrange the data for better overall performance; see this blog post about the non-clustered index sizes.
Here is my Transact-SQL that computes the cost per month per row for every table in the database:
DECLARE @SizeInBytes bigint
SELECT @SizeInBytes =
(SUM(reserved_page_count) * 8192)
FROM sys.dm_db_partition_stats
DECLARE @Edition sql_variant
SELECT @Edition =DATABASEPROPERTYEX ( DB_Name() , 'Edition' )
DECLARE @CostPerByte float
SELECT @CostPerByte = (CASE
WHEN @SizeInBytes/1073741824.0 < 1 THEN (CASE @Edition WHEN 'Web' THEN 9.99 ELSE 99.99 END)
WHEN @SizeInBytes/1073741824.0 < 5 THEN (CASE @Edition WHEN 'Web' THEN 49.95 ELSE 99.99 END)
WHEN @SizeInBytes/1073741824.0 < 10 THEN 99.99
WHEN @SizeInBytes/1073741824.0 < 20 THEN 199.98
WHEN @SizeInBytes/1073741824.0 < 30 THEN 299.97
WHEN @SizeInBytes/1073741824.0 < 40 THEN 399.96
WHEN @SizeInBytes/1073741824.0 < 50 THEN 499.95
END) / @SizeInBytes
SELECT
sys.objects.name,
sum(reserved_page_count) * 8192 'Bytes',
row_count 'Row Count',
(CASE row_count WHEN 0 THEN 0 ELSE
(sum(reserved_page_count) * 8192)/ row_count END)
'Bytes Per Row',
(CASE row_count WHEN 0 THEN 0 ELSE
((sum(reserved_page_count) * 8192)/ row_count)
* @CostPerByte END)
'Monthly Cost Per Row'
FROM
sys.dm_db_partition_stats, sys.objects
WHERE
sys.dm_db_partition_stats.object_id = sys.objects.object_id
GROUP BY sys.objects.name, row_count
When I run this against the Adventure Works database loaded into SQL Azure, I get these results:
One thing to notice is that each product in the database is costing me 10 cents a month to store and each sales order header is costing 5 cents. This gives me some good insight into how I want to archive sales information offsite after a number of days, maybe a transaction cost per sale to offset the storage, and to clean up products off that site that are not selling.
Another thing to consider is that the Adventure Works database data is very small about 3 megs, as the data grows (getting closer to the 1 Gigabyte top-side), the cost to store each byte will decrease. So in the Adventure Works database adding products reduces the costs of storage for each product row – as long as you stay under 1 Gigabyte. In other words, fill your database to the maximum to minimize the cost of the bytes store.
Disclaimer
The prices for SQL Azure and the maximum database sizes can change in the future, make sure to compare the current costs against the queries provided to make sure your index costs are accurate.
Summary
Do you have a better way to accomplish the same thing? Post it in the comments below. Do you have questions, concerns, comments? Post them below and we will try to address them.