Improving Windows Azure Table Storage query performance
With Windows Azure Table Storage, the data is stored in a way which can be searched through queries. By design, Windows Azure Table Storage provides the potential to store large amounts of data, while enabling efficient query access because:
- No relations between tables (or entities sets)
- Storing datasets that don't require complex joins, foreign keys, or stored procedures and can be denormalized for fast access
- Quickly querying data using a clustered index
- Tables doesn't have a specified schema
There are a number of good practices to follow when designing entities to get the best performance with your Azure Table Storage. However In this blog I would like to share my best practice as described below:
Query tables using both Partition and Row Key to achieve best performance with Windows Azure Table Storage. |
In Windows Azure Tables, the string PartitionKey and RowKey properties work together as an index for your table. So when using Partition and Row Keys, the storage will use its index to find results really fast, while when using other entity properties the storage will result in table scan, significantly reducing performance. So from performance perspective, querying on the Partition Key is good, querying on both Partition and Row Key is better, querying only on Row Key is not good (will result in table scans).
Listing: Examples of Azure Table Storage query:
a) Querying on both Partition and Row Key: Query results in retrieving specific/selective data: var q = (from movie in context.CreateQuery<Movie>(tableName) where movie.PartitionKey == "Action" && movie.RowKey == "Terminator" select movie);
b) Full Table Scan: Query results in scanning the entire table: var q = (from movie in context.CreateQuery<Movie>(tableName) where movie.RowKey.CompareTo("Sh") >= 0 select movie); |
How to collect Table Storage Metrics data from Windows Azure Management Portal:
Enable storage analytics feature from the Management Portal, which is used to record all logs about the action/behavior in your Azure storage, in order to our further investigation
Configure Table logging and metrics from the Management portal: https://www.windowsazure.com/en-us/manage/services/storage/how-to-monitor-a-storage-account/
The diagnostics logs are saved in a blob container named $logs in your storage account.
If you have enabled the analytics feature in your Table storage, you can use 3rd party Azure Storage application to download and view the results. A list of applications is provided by Windows Azure Storage team in their blog as below:
https://blogs.msdn.com/b/windowsazurestorage/archive/2010/04/17/windows-azure-storage-explorers.aspx
Above information can be saved into a local .csv file for further investigation. In csv file you would be able use metrics specific data to understand more about your Azure Table Storage performance and find proper solutions to improve it as needed. The definition of Azure Table Storage columns are described below: