Retrieving more than 1000 rows from windows azure storage
I recently hit an issue with my code when trying to retrieve
data from Windows Azure Diagnostics table,
I noticed that I'm getting only 1000 rows every time even though I have
not set any limit on the number of rows I want to retrieve, it turns out that
there is a limit to 1000 rows for windows azure storage.
The limit is not only on the number of rows, there is also a limit on the amount of time it takes the query to execute, the
Table service may return a maximum of 1,000 items at one time and make take only up to 5 seconds. If the result set contains more than 1,000 items or takes too long to execute the table service will return a Continuation Token that you can use to retrieve the rest of your
rows data, when you received all your data your token is null.
Originally this was my code:
1: WADPerformanceCountersTableContext wadContext = new WADPerformanceCountersTableContext(cloudStorageAccount.TableEndpoint.ToString(), cloudStorageAccount.Credentials);
2:
3: var items = from x in wadContext.WADPerformanceCountersTable
4: where x.EventTickCount >= start.Ticks && x.EventTickCount <= end.Ticks
5: select x;
6:
7: foreach (WADPerformanceCountersTable item in items)
8: {
9: //use item
10: }
This is the code after adding the use for continuation token:
1: ListRowsContinuationToken continuationToken = null;
2:
3: CloudStorageAccount cloudStorageAccount = CloudStorageAccount.Parse(connectionString);
4:
5: WADPerformanceCountersTableContext wadContext = new WADPerformanceCountersTableContext(cloudStorageAccount.TableEndpoint.ToString(), cloudStorageAccount.Credentials);
6:
7:
8: do
9: {
10: //Query using the start time and end time, take only n rows
11: //The response headers will contain a continuation key that can be used to retrieve the next n rows
12: var allItems = (from x in wadContext.WADPerformanceCountersTable
13: where x.EventTickCount >= start.Ticks && x.EventTickCount <= end.Ticks
14: select x).Take(c_pageSize);
15:
16: var query = allItems as DataServiceQuery<WADPerformanceCountersTable>;
17:
18: if (continuationToken != null)
19: {
20: query = query.AddQueryOption("NextPartitionKey", continuationToken.PartitionKey);
21: if (continuationToken.RowKey != null)
22: {
23: query = query.AddQueryOption("NextRowKey", continuationToken.RowKey);
24: }
25: }
26:
27: var response = query.Execute() as QueryOperationResponse;
28:
29: foreach (WADPerformanceCountersTable item in allItems)
30: {
31: //use item
32: }
33: //now check if there are more rows left to be retrieved, if there are more rows execute another request to get the remaining items
34: if (response.Headers.ContainsKey("x-ms-continuation-NextPartitionKey"))
35: {
36: continuationToken = new ListRowsContinuationToken();
37: continuationToken.PartitionKey = response.Headers["x-ms-continuation-NextPartitionKey"];
38: if (response.Headers.ContainsKey("x-ms-continuation-NextRowKey"))
39: {
40: continuationToken.RowKey = response.Headers["x-ms-continuation-NextRowKey"];
41: }
42: }
43: else
44: {
45: continuationToken = null;
46: }
47:
48: } while (continuationToken != null);
49:
ListRowsContinuationToken is a simple class defined as
follows:
1: public class ListRowsContinuationToken
2: {
3: public string PartitionKey { get; set; }
4: public string RowKey { get; set; }
5: }
Finally, I found an even easier way to get this done! The trick is to convert the query into a CloudTableQuery - once you do this you will be able to retrieve all rows with a single call (just make sure that this is what you really want to do)
1: CloudStorageAccount cloudStorageAccount = CloudStorageAccount.Parse(connectionString);
2: WADPerformanceCountersTableContext wadContext = new WADPerformanceCountersTableContext(cloudStorageAccount.TableEndpoint.ToString(), cloudStorageAccount.Credentials);
3:
4: Console.Out.WriteLine("Reading data from WAD storage");
5:
6: //Query using the start time and end time, take only 1000 rows
7: //The response headers will contain a continuration key that can be used to retrieve the next 1000 rows
8: var query = (from x in wadContext.WADPerformanceCountersTable
9: where x.EventTickCount >= start.Ticks && x.EventTickCount <= end.Ticks
10: select x);
11:
12:
13: var allItemsAsTableService = query.AsTableServiceQuery();
14: IEnumerable<WADPerformanceCountersTable> allItems = allItemsAsTableService.Execute();
15:
16: foreach (WADPerformanceCountersTable item in allItems)
17: {
18: //use item
19: }
I chose the last solution because I needed minimal changes to my code, and I didn't really need paging, if you need paging, continuation token will be the way to go!
Comments
Anonymous
April 25, 2011
Where are you getting the WADPerformanceCountersTableContext & WADPerformanceCountersTable from? What namespace is this available under?Anonymous
June 13, 2011
Hi Matt, You can implement these yourself, public class WADPerformanceCountersTable : TableServiceEntity { public long EventTickCount { get; set; } public string DeploymentId { get; set; } public string Role { get; set; } public string RoleInstance { get; set; } public string CounterName { get; set; } public double CounterValue { get; set; } } public class WADPerformanceCountersTableContext : TableServiceContext { public WADPerformanceCountersTableContext(string baseAddress, StorageCredentials credentials) : base(baseAddress, credentials) { } public IQueryable<WADPerformanceCountersTable> WADPerformanceCountersTable { get { return this.CreateQuery<WADPerformanceCountersTable>("WADPerformanceCountersTable"); } } }Anonymous
June 06, 2012
For sake of future readers landing on this page: it's much easier to use AsTableServiceQuery() extension method that encapsulates several query features of Azure Table Service including continuation tokens processing. So initial code can take advantage of all continuation tokens just by replacing the following lines: var items = from x in wadContext.WADPerformanceCountersTable where x.EventTickCount >= start.Ticks && x.EventTickCount <= end.Ticks select x; with var items = (from x in wadContext.WADPerformanceCountersTable where x.EventTickCount >= start.Ticks && x.EventTickCount <= end.Ticks select x).AsTableServiceQuery();Anonymous
November 16, 2012
Hi Sergei, that's the recommended solution: var query = (from x in wadContext.WADPerformanceCountersTable where x.EventTickCount >= start.Ticks && x.EventTickCount <= end.Ticks select x); var allItemsAsTableService = query.AsTableServiceQuery();