How is the size of an entity calculated in Windows Azure Table storage?
While working with a partner, I had the opportunity to dig into how Windows Azure Table storage calculates the size of an entity. As you may know, each entity in Windows Azure Table storage can occupy at most 1 MB. The following expression shows how to estimate the amount of storage consumed per entity:
Total Entity Size:
- 4 bytes + Len (PartitionKey + RowKey) * 2 bytes + For-Each Property(8 bytes + Len(Property Name) * 2 bytes + Sizeof(.Net Property Type))
The following is the breakdown:
- 4 bytes overhead for each entity, which includes the Timestamp, along with some system metadata.
- The number of characters in the PartitionKey and RowKey values, times 2 bytes, because both are stored as Unicode.
- For each property, an 8-byte overhead, plus the number of characters in the property name * 2 bytes, plus the size of the property type as derived from the list below.
The Sizeof(.Net Property Type) for the different types is:
- String – # of Characters * 2 bytes + 4 bytes for length of string
- DateTime – 8 bytes
- GUID – 16 bytes
- Double – 8 bytes
- Int – 4 bytes
- Int64 – 8 bytes
- Bool – 1 byte
- Binary – sizeof(value) in bytes + 4 bytes for length of binary array
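The formula and the type sizes above can be sketched as a small estimator. This is only an illustration under stated assumptions: the names `value_size` and `estimate_entity_size` are hypothetical, Python values stand in for the .NET property types, and integers that fit in 32 bits are assumed to be stored as Int32. It is not part of any Azure SDK.

```python
import uuid
from datetime import datetime

ENTITY_OVERHEAD = 4    # bytes per entity (Timestamp plus system metadata)
PROPERTY_OVERHEAD = 8  # bytes per property
BYTES_PER_CHAR = 2     # keys, property names and strings are stored as Unicode


def value_size(value):
    """Return the Sizeof(.NET property type) term for a Python value."""
    # bool must be checked before int: bool is a subclass of int in Python.
    if isinstance(value, bool):
        return 1
    if isinstance(value, str):
        # Assumption: len() matches the stored character count
        # (true for characters in the Basic Multilingual Plane).
        return len(value) * BYTES_PER_CHAR + 4
    if isinstance(value, (bytes, bytearray)):
        return len(value) + 4
    if isinstance(value, (float, datetime)):
        return 8
    if isinstance(value, uuid.UUID):
        return 16
    if isinstance(value, int):
        # Assumption: values that fit in 32 bits map to Int32, others to Int64.
        return 4 if -2**31 <= value < 2**31 else 8
    raise TypeError(f"unsupported property type: {type(value).__name__}")


def estimate_entity_size(partition_key, row_key, properties):
    """Estimate the stored size of one entity in bytes."""
    size = ENTITY_OVERHEAD + len(partition_key + row_key) * BYTES_PER_CHAR
    for name, value in properties.items():
        size += PROPERTY_OVERHEAD + len(name) * BYTES_PER_CHAR + value_size(value)
    return size


if __name__ == "__main__":
    print(estimate_entity_size("mypartitionkey", "myrowkey1", {"Age": 23}))
```

The result is an estimate to compare against the 1 MB per-entity limit, not an exact measurement of what the service stores.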
So let’s calculate the actual entity size in the following example:
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<entry xmlns:d="https://schemas.microsoft.com/ado/2007/08/dataservices" xmlns:m="https://schemas.microsoft.com/ado/2007/08/dataservices/metadata" xmlns="https://www.w3.org/2005/Atom">
<title />
<updated>2008-09-18T23:46:19.3857256Z</updated>
<author>
<name />
</author>
<id />
<content type="application/xml">
<m:properties>
<d:Address>Mountain View</d:Address> <= String (13 * 2) + 4 = 30 Bytes + 8 Bytes + Len(“Address”)*2
<d:Age m:type="Edm.Int32">23</d:Age> <= Int/Int32 = 4 Bytes + 8 Bytes + Len(“Age”)*2
<d:AmountDue m:type="Edm.Double">200.23</d:AmountDue> <= Double = 8 Bytes + 8 Bytes + Len(“AmountDue”)*2
<d:BinaryData m:type="Edm.Binary" m:null="true" /> <= Binary = 4 Bytes + 8 Bytes + Len(“BinaryData”)*2
<d:CustomerCode m:type="Edm.Guid">c9da6455-213d-42c9-9a79-3e9149a57833</d:CustomerCode> <= GUID = 16 Bytes + 8 Bytes + Len(“CustomerCode”)*2
<d:CustomerSince m:type="Edm.DateTime">2008-07-10T00:00:00</d:CustomerSince> <= DateTime = 8 Bytes + 8 Bytes + Len(“CustomerSince”)*2
<d:IsActive m:type="Edm.Boolean">true</d:IsActive> <= Bool = 1 Byte + 8 Bytes + Len(“IsActive”)*2
<d:NumOfOrders m:type="Edm.Int64">255</d:NumOfOrders> <= Int64 = 8 Bytes + 8 Bytes + Len(“NumOfOrders”)*2
<d:PartitionKey>mypartitionkey</d:PartitionKey> <= Partition Key (14 * 2) = 28 Bytes
<d:RowKey>myrowkey1</d:RowKey> <= Row Key (9 * 2) = 18 Bytes
<d:Timestamp m:type="Edm.DateTime">0001-01-01T00:00:00</d:Timestamp> <= DateTime = 8 Bytes + 8 Bytes + Len(“Timestamp”)*2
</m:properties>
</content>
</entry>
Finally, the total bytes can be aggregated as follows:
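| # | Category | What is counted | Bytes |
|---|----------|-----------------|-------|
| 1 | Overhead | Fixed per-entity overhead | 4 Bytes |
| 2 | Partition Key and Row Key | PartitionKey and RowKey values | 28 + 18 = 46 Bytes |
| 3 | Properties | Every property line annotated above | Sum of the per-property bytes annotated above |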
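To finish the arithmetic for the sample entity, the three rows can be tallied with a short script. This is a sketch, not an official calculator: the names `PROPERTIES`, `property_bytes` and `entity_total` are made up for illustration, and the value sizes are copied from the type list earlier in the post. Because the breakdown folds Timestamp into the 4-byte overhead while the annotated XML also lists it as a DateTime property, the sketch can report the total either way.

```python
# (property name, size of the value in bytes) for the sample entity.
PROPERTIES = [
    ("Address", len("Mountain View") * 2 + 4),  # String: 2 bytes per char + 4
    ("Age", 4),             # Int32
    ("AmountDue", 8),       # Double
    ("BinaryData", 0 + 4),  # Binary: null value, only the 4-byte length
    ("CustomerCode", 16),   # GUID
    ("CustomerSince", 8),   # DateTime
    ("IsActive", 1),        # Bool
    ("NumOfOrders", 8),     # Int64
]
TIMESTAMP = ("Timestamp", 8)  # DateTime, counted only when requested


def property_bytes(name, value_bytes):
    """8-byte overhead + 2 bytes per character of the name + value size."""
    return 8 + len(name) * 2 + value_bytes


def entity_total(include_timestamp=False):
    """Return overhead + key bytes + property bytes for the sample entity."""
    overhead = 4
    key_bytes = (len("mypartitionkey") + len("myrowkey1")) * 2
    props = list(PROPERTIES)
    if include_timestamp:
        props.append(TIMESTAMP)
    prop_bytes = sum(property_bytes(name, size) for name, size in props)
    return overhead + key_bytes + prop_bytes


if __name__ == "__main__":
    print("Without Timestamp as a property:", entity_total())
    print("With Timestamp as a property:   ", entity_total(include_timestamp=True))
```

With these inputs the sketch arrives at 339 bytes when Timestamp is treated as part of the overhead and 373 bytes when it is also counted as a property.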