共用方式為


Retrieving more than 1000 rows from windows azure storage

I recently hit an issue with my code when trying to retrieve
data from Windows Azure Diagnostics table, 
I noticed that I'm getting only 1000 rows every time even though I have
not set any limit on the number of rows I want to retrieve, it turns out that
there is a limit to 1000 rows for windows azure storage.

The limit is not only on the number of rows, there is also a limit on the amount of time it takes the query to execute, the
Table service may return a maximum of 1,000 items at one time and make take only up to 5 seconds. If the result set contains more than 1,000 items or takes too long to execute the table service will return a Continuation Token  that you can use to retrieve the rest of your
rows data, when you received all your data your token is null.

Originally this was my code:

1: WADPerformanceCountersTableContext wadContext = new WADPerformanceCountersTableContext(cloudStorageAccount.TableEndpoint.ToString(), cloudStorageAccount.Credentials);

    2:   
    3:  var items = from x in wadContext.WADPerformanceCountersTable
    4:  where x.EventTickCount >= start.Ticks && x.EventTickCount <= end.Ticks
    5:  select x;
    6:   
    7:  foreach (WADPerformanceCountersTable item in items)
    8:  {
    9:     //use item
   10:  }
 
 

This is the code after adding the use for continuation token:

1: ListRowsContinuationToken continuationToken = null;

    2:   
    3:  CloudStorageAccount cloudStorageAccount = CloudStorageAccount.Parse(connectionString);
    4:             
    5:  WADPerformanceCountersTableContext wadContext = new WADPerformanceCountersTableContext(cloudStorageAccount.TableEndpoint.ToString(), cloudStorageAccount.Credentials);
    6:   
    7:   
    8:  do
    9:  {
   10:      //Query using the start time and end time, take only n rows
   11:      //The response headers will contain a continuation key that can be used to retrieve the next n rows
   12:      var allItems = (from x in wadContext.WADPerformanceCountersTable
   13:          where x.EventTickCount >= start.Ticks && x.EventTickCount <= end.Ticks
   14:          select x).Take(c_pageSize);
   15:   
   16:      var query = allItems as DataServiceQuery<WADPerformanceCountersTable>;
   17:   
   18:      if (continuationToken != null)
   19:      {
   20:          query = query.AddQueryOption("NextPartitionKey", continuationToken.PartitionKey);
   21:          if (continuationToken.RowKey != null)
   22:          {
   23:             query = query.AddQueryOption("NextRowKey", continuationToken.RowKey);
   24:          }
   25:      }
   26:   
   27:      var response = query.Execute() as QueryOperationResponse;
   28:   
   29:      foreach (WADPerformanceCountersTable item in allItems)
   30:      {
   31:          //use item     
   32:      }
   33:      //now check if there are more rows left to be retrieved, if there are more rows execute another request to get the remaining items
   34:      if (response.Headers.ContainsKey("x-ms-continuation-NextPartitionKey"))
   35:      {
   36:          continuationToken = new ListRowsContinuationToken();
   37:          continuationToken.PartitionKey = response.Headers["x-ms-continuation-NextPartitionKey"];
   38:          if (response.Headers.ContainsKey("x-ms-continuation-NextRowKey"))
   39:          {
   40:             continuationToken.RowKey = response.Headers["x-ms-continuation-NextRowKey"];
   41:          }
   42:      }
   43:      else
   44:      {
   45:          continuationToken = null;
   46:      }
   47:   
   48:  } while (continuationToken != null);
   49:   

 

ListRowsContinuationToken is a simple class defined as
follows:

    1:  public class ListRowsContinuationToken
    2:  {
    3:      public string PartitionKey { get; set; }
    4:      public string RowKey { get; set; }
    5:  }
 
 
 

Finally, I found an even easier way to get this done! The trick is to convert the query into a CloudTableQuery - once you do this you will be able to retrieve all rows with a single call (just make sure that this is what you really want to do)

    1:  CloudStorageAccount cloudStorageAccount = CloudStorageAccount.Parse(connectionString);
    2:  WADPerformanceCountersTableContext wadContext = new WADPerformanceCountersTableContext(cloudStorageAccount.TableEndpoint.ToString(), cloudStorageAccount.Credentials);
    3:   
    4:  Console.Out.WriteLine("Reading data from WAD storage");
    5:   
    6:  //Query using the start time and end time, take only 1000 rows
    7:  //The response headers will contain a continuration key that can be used to retrieve the next 1000 rows
    8:  var query = (from x in wadContext.WADPerformanceCountersTable
    9:               where x.EventTickCount >= start.Ticks && x.EventTickCount <= end.Ticks
   10:               select x);
   11:   
   12:   
   13:  var allItemsAsTableService = query.AsTableServiceQuery(); 
   14:  IEnumerable<WADPerformanceCountersTable> allItems = allItemsAsTableService.Execute(); 
   15:   
   16:  foreach (WADPerformanceCountersTable item in allItems)
   17:  {
   18:  //use item
   19:  }
 
 

I chose the last solution because I needed minimal changes to my code, and I didn't really need paging, if you need paging, continuation token will be the way to go!

Comments

  • Anonymous
    April 25, 2011
    Where are you getting the WADPerformanceCountersTableContext & WADPerformanceCountersTable from?  What namespace is this available under?

  • Anonymous
    June 13, 2011
    Hi Matt, You can implement these yourself, public class WADPerformanceCountersTable : TableServiceEntity {    public long EventTickCount { get; set; }    public string DeploymentId { get; set; }    public string Role { get; set; }    public string RoleInstance { get; set; }    public string CounterName { get; set; }    public double CounterValue { get; set; } } public class WADPerformanceCountersTableContext : TableServiceContext {    public WADPerformanceCountersTableContext(string baseAddress, StorageCredentials credentials)        : base(baseAddress, credentials)    { }    public IQueryable<WADPerformanceCountersTable> WADPerformanceCountersTable    {        get        {            return this.CreateQuery<WADPerformanceCountersTable>("WADPerformanceCountersTable");        }    } }

  • Anonymous
    June 06, 2012
    For sake of future readers landing on this page: it's much easier to use AsTableServiceQuery() extension method that encapsulates several query features of Azure Table Service including continuation tokens processing. So initial code can take advantage of all continuation tokens just by replacing the following lines: var items =  from x in wadContext.WADPerformanceCountersTable   where x.EventTickCount >= start.Ticks && x.EventTickCount <= end.Ticks   select x; with var items =  (from x in wadContext.WADPerformanceCountersTable   where x.EventTickCount >= start.Ticks && x.EventTickCount <= end.Ticks   select x).AsTableServiceQuery();

  • Anonymous
    November 16, 2012
    Hi Sergei, that's the recommended solution: var query = (from x in wadContext.WADPerformanceCountersTable       where x.EventTickCount >= start.Ticks && x.EventTickCount <= end.Ticks  select x);   var allItemsAsTableService = query.AsTableServiceQuery();