Windows Azure Storage Metrics: Using Metrics to Track Storage Usage

Windows Azure Storage Metrics allows you to track your aggregated storage usage for Blobs, Tables and Queues. The details include capacity, a per-service request summary, and per-API aggregates. The metrics give you an aggregate view of how a given storage account’s blobs, tables or queues are doing over time. They make it easy to see the types of errors that are occurring, which helps you tune your system and diagnose problems, and to see daily trends in usage. For example, the metrics data can be used to understand the request breakdown by hour.

The metrics data can be categorized as:

  • Capacity: Provides information regarding the storage capacity consumed for the blob service, the number of containers and total number of objects stored by the service. In the current version, this is available only for the blob service. We will provide table and queue capacity information in a future version. This data is updated daily and it provides separate capacity information for data stored by user and the data stored for $logs.
  • Requests: Provides summary information of requests executed against the service. It provides total number of requests, total ingress/egress, average E2E latency and server latency, total number of failures by category, etc. at an hourly granularity. The summary is provided at service level and it also provides aggregates at an API level for the APIs that have been used for the hour. This data is available for all the three services provided – Blobs, Tables, and Queues.

Finding your metrics data

Metrics are stored in tables in the storage account the metrics are for. The capacity information is stored in a separate table from the request information. These tables are automatically created when you opt in for metrics analytics and once created the tables cannot be deleted, though their contents can be.

As mentioned above, there are two types of tables which store the metrics details:

  1. Capacity information.  When turned on, the system creates $MetricsCapacityBlob table to store capacity information for blobs. This includes the number of objects and capacity consumed by Blobs for the storage account.  This information will be recorded in $MetricsCapacityBlob table once per day (see PartitionKey column in Table 1).  We report capacity separately for the amount of data stored by the user and the amount of data stored for analytics.   
  2. Transaction metrics.  This information is available for all the services – Blobs, Tables and Queues. Each service gets a table for itself and hence the 3 tables are:
    • $MetricsTransactionsBlob: This table will contain the summary – both at a service level and for each API issued to the blob service.
    • $MetricsTransactionsTable: This table will contain the summary – both at a service level and for each API issued to the table service.
    • $MetricsTransactionsQueue: This table will contain the summary – both at a service level and for each API issued to the queue service.

A transaction summary row is stored for each service that depicts the total request count, error counts, etc., for all requests issued for the service at an hour granularity.  Aggregated request details are stored per service and for each API issued in the service per hour.

Requests will be categorized separately for:

  • Requests issued by the user (authenticated, SAS and anonymous) to access both user data and analytics data in their storage account.
  • Requests issued by Windows Azure Storage to populate the analytics data.

It is important to note that the system stores a summary row for each service every hour even if no API was issued for the service in that hour, which results in a row showing that there were 0 requests during the hour. This lets applications issue point queries when monitoring the data. For API-level metrics the behavior is different: the system stores a summary row for an individual API only if that API was used in that hour.

These tables can be read from like any other user-created Windows Azure Table. The REST URI for the metrics table is https://<accountname>.table.core.windows.net/Tables('$MetricsTransactionsBlob') to access the Blob Transaction table and https://<accountname>.table.core.windows.net/$MetricsTransactionsBlob() to access rows stored for the Blob service in the transaction table.
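
As a minimal sketch of the two URI forms above, the following Python helpers build them from an account name and a table name. The function names are hypothetical (they are not part of any Windows Azure library), and request signing via the Authorization header is deliberately left out.

```python
def metrics_table_uri(account: str, table: str) -> str:
    """URI that addresses the metrics table itself."""
    return f"https://{account}.table.core.windows.net/Tables('{table}')"


def metrics_entities_uri(account: str, table: str) -> str:
    """URI that addresses the rows (entities) stored in the metrics table."""
    return f"https://{account}.table.core.windows.net/{table}()"


# Example with the account name used later in this post.
print(metrics_table_uri("sally", "$MetricsTransactionsBlob"))
print(metrics_entities_uri("sally", "$MetricsTransactionsBlob"))
```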

What does my metrics data contain?

As mentioned, there are two different types of tables for the metrics data.  In this section we will review the details contained in each table.

Blob capacity metrics data

This is stored in the $MetricsCapacityBlob Table.   Two records with capacity information will be stored in this table for each day. The two records are used to track the capacity consumed by the data that the user has stored separately from the logging data in $logs container.

Metrics ($MetricsCapacityBlob Table)

Column Name

Description

PartitionKey (string)

<Timestamp>

Timestamp is of the form YYYYMMddThhmm, where hhmm is the starting hour in 24-hour format. The hour and minutes are always stored as 0000 (i.e. a single set of records is stored once a day for 00:00 hours UTC).

NOTE: This represents the day for which the capacity is calculated and not when the record is inserted into the table.

RowKey (string)

It can have one of the two possible values:

data: The capacity row represents the capacity consumed by data stored by the user in blobs

analytics: The capacity row represents the analytics data stored in blobs, which is all of the capacity consumed by logs stored in $logs

Capacity (long)

The capacity in bytes utilized by the account for the blob service.

ContainerCount (long)

The total number of containers.

ObjectCount (long)

The total number of blobs.

Table 1 Capacity Metrics table
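
Because the PartitionKey and RowKey values in Table 1 are fully predictable, the two capacity rows for a given day can be addressed directly. The sketch below derives both key pairs for a date; the helper name is hypothetical and the code only builds the keys, it does not call the service.

```python
from datetime import date


def capacity_point_keys(day: date) -> list[tuple[str, str]]:
    """Return the (PartitionKey, RowKey) pairs of the two capacity rows for a day."""
    # The time part is always 0000 because capacity is recorded once per UTC day.
    partition_key = day.strftime("%Y%m%dT0000")
    # "data" covers user blobs, "analytics" covers the logs stored in $logs.
    return [(partition_key, "data"), (partition_key, "analytics")]


print(capacity_point_keys(date(2011, 5, 20)))
```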

Transaction metrics data

This is stored in the $MetricsTransactions<service> tables. All three services (blob, table, queue) have the same table schema, outlined below.  At the highest level, these tables contain two kinds of records:

  • Service Summary – This record contains hourly aggregates for the entire service. A record is inserted for each hour even if the service has not been used. The aggregates include request count, success count, failure counts for certain error categories, average E2E latency, average Server latency etc.
  • API Summary – This record contains hourly aggregates for the given API. The aggregates include request count, success count, failure count by category, average E2E latency, average server latency, etc. A record is stored in the table for an API only if the API has been issued to the service in that hour.

We track user and Windows Azure Storage Analytics (system) requests in different categories.

User requests to access any data under the storage account (including analytics data) are tracked under the “user” category. These request types are:

  • Authenticated Requests
  • Authenticated SAS requests. All failures after a successful authentication will be tracked including authorization failures. An example of Authorization failure in SAS scenario is trying to write to a blob when only read access has been provided.
  • Anonymous Requests - Only successful anonymous requests are tracked, with the following exceptions, which are also tracked:
    • Server errors – Typically seen as InternalServerError
    • Client and Server Timeout errors
    • Requests that fail with 304 i.e. Not Modified

Internal Windows Azure Storage Analytics requests are tracked under the “system” category:

  • Requests to write logs  under the $logs container
  • Requests to write to metrics table

These are tracked as “system” so that they are separated out from the actual requests that users issue to access their data or analytics data.

Note that if a request fails before the service can deduce the type of request (i.e. the API name of the request), then the request is recorded under the API name of “Unknown”.

Before we go over the schema for the metrics transaction tables (the table columns and their definitions), we will define some terms used in this table:

  • For service summary row, the column represents the aggregate data at a service level. For API summary row, the same column represents data at an API level granularity.
  • Counted For Availability:  If we have listed an operation result count in the below table as “Counted For Availability”, then it implies that the result category is accounted for in the availability calculation.  Availability is defined as (billable requests)/(total requests).  These requests are counted in both the numerator and denominator of that calculation. Examples: Success, ClientOtherError etc.
  • Counted Against Availability: These requests are counted in the availability denominator only (they are not considered billable requests), so they lower the availability. Examples: ServerTimeoutError, ServerOtherError.
  • Billable: These are the requests that are billable.  See this post for more information on understanding Windows Azure Storage billing.  Examples: Success, ClientOtherError, NetworkError.

NOTE: If a result category is not listed as “Counted For/Against Availability”, then it implies that those requests are not considered in the availability calculations. Examples: ThrottlingError

Metrics tables schema ($MetricsTransactionsBlob, $MetricsTransactionsTable, $MetricsTransactionsQueue Tables)

Column Name

Description

Time in UTC is the PartitionKey (string)

<Timestamp>

Basic timestamp format described in ISO 8601: YYYYMMddThhmm. The timestamp represents the starting hour for the metrics information. The minutes will always be “00”.

This can be used to search for request statistics for a given time period.

Request Type and Transaction type make up the RowKey (string)

<user|system>;<API-Name>

The prefix signifies whether the summary/API represents requests issued by user or system.

  • user: Tracks the summary for all requests issued by clients, i.e. authenticated, SAS or anonymous requests
  • system: Requests issued by Windows Azure Storage to populate analytics data

The API-Name suffix can be one of the following:

  • “All” for service summary rows containing a summary of requests made to the service. 
  • The API name, for the API summary row of the given API. See below for the list of names.

NOTE: Under the “system” category we will only have summary metrics and not API-level granularity.

  Blob Service APIs

  • AcquireLease
  • BreakLease
  • ClearPage
  • CopyBlob
  • CreateContainer
  • DeleteBlob
  • DeleteContainer
  • GetBlob
  • GetBlobMetadata
  • GetBlobProperties
  • GetBlockList
  • GetContainerACL
  • GetContainerMetadata
  • GetContainerProperties
  • GetLeaseInfo
  • GetPageRegions
  • LeaseBlob
  • ListBlobs
  • ListContainers
  • PutBlob
  • PutBlockList
  • PutBlock
  • PutPage
  • ReleaseLease
  • RenewLease
  • SetBlobMetadata
  • SetBlobProperties
  • SetContainerACL
  • SetContainerMetadata
  • SnapshotBlob
  • SetBlobServiceProperties
  • GetBlobServiceProperties

  Queue Service APIs

  • ClearMessages
  • CreateQueue
  • DeleteQueue
  • DeleteMessage
  • GetQueueMetadata
  • GetQueue
  • GetMessage
  • GetMessages
  • ListQueues
  • PeekMessage
  • PeekMessages
  • PutMessage
  • SetQueueMetadata
  • SetQueueServiceProperties
  • GetQueueServiceProperties

  Table Service APIs

  • EntityGroupTransaction
  • CreateTable
  • DeleteTable
  • DeleteEntity
  • InsertEntity
  • QueryEntity
  • QueryEntities
  • QueryTable
  • QueryTables
  • UpdateEntity
  • MergeEntity
  • SetTableServiceProperties
  • GetTableServiceProperties

In addition, “Unknown” may be used if the failure occurred before the exact API was detected.

TotalIngress (long)

The total ingress in bytes utilized by the service/API.

TotalEgress (long)

The total egress in bytes utilized by the service/API.

This is not the same as billable egress and is just a sum total of egress from all requests. Customers do not get charged for bandwidth within the same datacenter, which is counted in this metric.

TotalRequests (long)

The total requests issued to the service/API.

This includes throttled requests, expected and unexpected timeouts, all anonymous failures and all SAS failures.

= ∑( Success, AnonymousSuccess, SASSuccess,

ThrottlingError, AnonymousThrottlingError, SASThrottlingError,

NetworkError, AnonymousNetworkError, SASNetworkError,

ClientOtherError, AnonymousClientOtherError, SASClientOtherError,

ServerOtherError, AnonymousServerOtherError, SASServerOtherError,

ClientTimeoutError, AnonymousClientTimeoutError, SASClientTimeoutError,

ServerTimeoutError, AnonymousServerTimeoutError, SASServerTimeoutError,

AuthorizationError, AnonymousAuthorizationError, SASAuthorizationError)

TotalBillableRequests (long)

The total requests that are billable. This is used to calculate availability.

NOTE: we do not bill anonymous failures (except network errors), throttled requests, server timeout errors and unknown errors.

= ∑( Success, AnonymousSuccess, SASSuccess,

NetworkError, AnonymousNetworkError, SASNetworkError,

ClientOtherError, AnonymousClientOtherError, SASClientOtherError,

ClientTimeoutError, SASClientTimeoutError, AnonymousClientTimeoutError,

AuthorizationError, SASAuthorizationError)

Availability (double)

This is the availability of the storage account as determined by the service for a given hour. 

The Availability considers ServerTimeoutError and ServerOtherError as failures.

Availability = (X/Y) * 100

Where X = TotalBillableRequests and

Y  = ∑( Success, AnonymousSuccess, SASSuccess,

NetworkError, AnonymousNetworkError, SASNetworkError,

ClientOtherError, SASClientOtherError, AnonymousClientOtherError,

ServerOtherError, AnonymousServerOtherError, SASServerOtherError,

ClientTimeoutError, SASClientTimeoutError, AnonymousClientTimeoutError,

ServerTimeoutError, AnonymousServerTimeoutError, SASServerTimeoutError,

AuthorizationError, SASAuthorizationError)

Please note that the SLA for a storage account is the average availability for all requests across all hours across all services (blobs, tables and queues) for an entire month, whereas this availability metric covers only the specific service or API for a specific hour.

AverageE2ELatency (double)

The average end-to-end latency of successful requests made to a storage service or the specified API operation. This value includes the required processing time within Windows Azure Storage to read the request, send the response, and receive acknowledgement of the response.

AverageServerLatency (double)

The average latency used by Windows Azure Storage to process a request, excluding failed requests. This value does not include the network latency which is included in AverageE2ELatency.

PercentSuccess (double)

The percentage of requests that succeeded.

PercentThrottlingError (double)

The percentage of requests that failed with throttling.

PercentTimeoutError (double)

The percentage of requests that failed with timeout errors (client and server timeouts). This does not differentiate between server and client timeouts.

PercentServerOtherError (double)

The percentage of requests that failed with status code 500 i.e. Internal Server Error where the storage error code is not Timeout.

PercentClientOtherError (double)

The percentage of requests that failed with errors such as NotFound, Precondition Failed etc.

Most 3XX and 4XX failures fall under this category.

PercentAuthorizationError (double)

The percentage of requests that failed with authorization errors.

PercentNetworkError (double)

The percentage of requests that failed with network errors.

Success (long)

The total number of requests that were successful against the service/API.

These will also include all conditional GET requests that did not return because the condition did not succeed.

These are billable requests and counted for availability.

AnonymousSuccess (long)

The total number of anonymous requests that were successful against the service/API.

These will also include all conditional GET requests that did not return because the condition did not succeed.

These are billable requests and counted for availability.

SASSuccess (long)

The total number of SAS requests that were successful against the service/API.

These will also include all conditional GET requests that did not return because the condition did not succeed.

These are billable requests and counted for availability.

ThrottlingError (long)

The number of authenticated requests that returned ServerBusy i.e. 503 status code.

These are not billable and are not counted for availability.

AnonymousThrottlingError (long)

The number of anonymous requests that returned ServerBusy i.e. 503 status code.

These are not billable and are not counted for availability.

SASThrottlingError (long)

The number of SAS requests that returned ServerBusy i.e. 503 status code.

These are not billable and are not counted for availability.

ClientTimeoutError (long)

The total number of authenticated requests that timed out against the service/API.

These are seen as timeout errors or Http Status code 500 with Storage error code as “Timeout”.

This happens when timeout value provided is not sufficient for the IO over the network. For example, if the read/write/etc. request completes in the expected time on the server but it takes a long time to return to the client due to network latency, this is considered as a client timeout. 

Any other timeout will be deemed as ServerTimeout.

These are billable requests and counted for availability.

AnonymousClientTimeoutError (long)

The total number of anonymous requests that timed out against the service/API.

These are seen as timeout errors or Http Status code 500 with Storage error code as “Timeout”.

This happens when the timeout value provided is not sufficient for the IO over the network. For example, if the read/write/etc. request completes in the expected time on the server but it takes a long time to return to the client due to network latency, this is considered as a client timeout.

Any other timeout will be deemed as AnonymousServerTimeout.

These are billable requests and counted for availability.

SASClientTimeoutError (long)

The total number of SAS requests that timed out against the service/API.

These are seen as timeout errors or Http Status code 500 with Storage error code as “Timeout”.

This happens when the timeout value provided is not sufficient for the IO over the network. For example, if the read/write/etc. request completes in the expected time on the server but it takes a long time to return to the client due to network latency, this is considered as a client timeout.

Any other timeout will be deemed as SASServerTimeout.

These are billable requests and counted for availability.

ServerTimeoutError (long)

The total number of authenticated requests that timed out against the service/API because the service took a longer time to process the request. The time taken by service excludes the time to read/write from/to client over the network.

These are seen as timeout errors or Http Status code 500 with Storage error code as “Timeout”.

These are not billable requests and are counted against availability.

AnonymousServerTimeoutError (long)

The total number of anonymous requests that timed out against the service/API because the service took a longer time to process the request. The time taken by service excludes the time to read/write from/to client over the network.

These are seen as timeout errors or Http Status code 500 with Storage error code as “Timeout”.

These are not billable requests and are counted against availability.

SASServerTimeoutError (long)

The total number of SAS requests that timed out against the service/API because the service took a longer time to process the request. The time taken by service excludes the time to read/write from/to client over the network.

These are seen as timeout errors or Http Status code 500 with Storage error code as “Timeout”.

These are not billable requests and are counted against availability.

ClientOtherError (long)

The total number of authenticated requests that failed as expected against the service/API.

Examples are Resource already exists, Resource not found etc.

These are billable requests and counted for availability.

SASClientOtherError (long)

The total number of SAS requests that failed as expected against the service/API.

Examples are Resource already exists, Resource not found etc.

These are billable requests and counted for availability.

AnonymousClientOtherError (long)

The total number of anonymous requests that failed precondition checks (like If-Modified-Since etc.) for GET requests.

Examples: Conditional GET requests that fail the check.

These are billable requests and counted for availability.

ServerOtherError (long)

The total number of authenticated requests that failed due to unknown server errors.

These are typically Http Status code 500 with Storage error code other than Timeout.

These are not billable requests and are counted against availability.

AnonymousServerOtherError (long)

The total number of anonymous requests that failed due to unknown server errors.

These are typically Http Status code 500 with Storage error code other than Timeout.

These are not billable requests and are counted against availability.

SASServerOtherError (long)

The total number of SAS requests that failed due to unknown server errors.

These are typically Http Status code 500 with Storage error code other than Timeout.

These are not billable requests and are counted against availability.

AuthorizationError (long)

The total number of authenticated requests that failed because authorization failed or anonymous access against objects that are not public.

Example: write requests from users to logs under $logs will be treated as an authorization error.

These are billable requests and counted for availability.

AnonymousAuthorizationError (long)

The total number of anonymous requests that failed because authorization failed or anonymous access against objects that are not public.

Example: only authenticated write requests are allowed.

These are billable requests and counted for availability.

SASAuthorizationError (long)

The total number of SAS requests that failed because authorization failed or anonymous access against objects that are not public.

Example: a write request made with a SAS that grants only read access is treated as an authorization error.

These are billable requests and counted for availability.

NetworkError (long)

The total number of authenticated requests that failed because of network errors.

Network errors occur when a user prematurely closes the connection before the timeout expires or if there are problems in any of the intermediate switches.

These are billable requests and counted for availability.

AnonymousNetworkError (long)

The total number of anonymous requests that failed because of network errors.

Network errors occur when a user prematurely closes the connection before the timeout expires or if there are problems in any of the intermediate switches.

These are billable requests and counted for availability.

SASNetworkError (long)

The total number of SAS requests that failed because of network errors.

Network errors occur when a user prematurely closes the connection before the timeout expires or if there are problems in any of the intermediate switches.

These are billable requests and counted for availability.

Table 2 Schema for transaction metrics tables for Blobs, Tables and Queues
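
The TotalBillableRequests and Availability definitions in Table 2 can be sketched in Python as follows. The sketch treats a metrics row as a plain dictionary of column values (an assumption made for illustration). The column lists follow the sums given in the table. Returning 100 when no counted requests exist is also an assumption, since the table does not state what the service reports in that case.

```python
def _columns(name, prefixes=("", "Anonymous", "SAS")):
    """Expand a result category into its per-request-type column names."""
    return [prefix + name for prefix in prefixes]


# Columns summed into TotalBillableRequests (the numerator X).
BILLABLE = (
    _columns("Success") + _columns("NetworkError")
    + _columns("ClientOtherError") + _columns("ClientTimeoutError")
    + _columns("AuthorizationError", prefixes=("", "SAS"))
)
# Columns that appear only in the denominator Y (counted against availability).
AGAINST = _columns("ServerOtherError") + _columns("ServerTimeoutError")


def availability(row: dict) -> float:
    """Availability = (X / Y) * 100 for one service or API summary row."""
    billable = sum(row.get(column, 0) for column in BILLABLE)
    total = billable + sum(row.get(column, 0) for column in AGAINST)
    if total == 0:
        return 100.0  # assumption: an idle hour is treated as fully available
    return 100.0 * billable / total


# Throttled requests are in neither list, so they do not affect the result.
row = {"Success": 90, "SASClientOtherError": 5, "ServerTimeoutError": 5, "ThrottlingError": 50}
print(availability(row))
```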

How do I clean up old Metrics data?

We highly recommend that you set a retention policy for your analytics data. The maximum retention period allowed is 365 days. Once a retention policy is set, the system deletes expired records from the metrics tables and expired log blobs from the $logs container.  The retention period for logs can differ from the one for metrics data. For example, if a user sets the retention policy to 100 days for metrics and 30 days for logs, then all the logs for queue, blob, and table services will be deleted after 30 days, and records stored in the associated metrics tables will be deleted once they are older than 100 days. An enabled retention policy is enforced even if analytics is turned off.  If you do not set a retention policy, you can manage your data by manually deleting entities (just as you delete entities in regular tables) whenever you wish.
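
For manual cleanup, one possible approach is to compute a cutoff PartitionKey and select every entity older than it. The helper below is a hypothetical sketch: it only produces the filter expression, and it does not claim to reproduce the exact boundary the service uses when it enforces a retention policy.

```python
from datetime import datetime, timedelta, timezone


def expired_metrics_filter(now: datetime, retention_days: int) -> str:
    """Build a filter matching metrics rows older than retention_days."""
    cutoff = now.astimezone(timezone.utc) - timedelta(days=retention_days)
    # Transaction rows are keyed by the starting hour, so minutes are always 00.
    cutoff_key = cutoff.strftime("%Y%m%dT%H00")
    return f"PartitionKey lt '{cutoff_key}'"


now = datetime(2011, 5, 30, 5, 30, tzinfo=timezone.utc)
print(expired_metrics_filter(now, 100))
```

The entities returned by a query with this filter would then be deleted one by one, exactly as with entities in a regular table.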

Searching your analytics metrics data

Your metrics can be retrieved using the Query Tables and Query Entities APIs.  You have the ability to query and/or delete records in the tables.  Note that the tables themselves cannot be deleted.  These tables can be read from like any other user-created Windows Azure Table. The REST URI for the metrics table is https://<accountname>.table.core.windows.net/Tables('$MetricsTransactionsBlob') to access the Blob Transaction table and https://<accountname>.table.core.windows.net/$MetricsTransactionsBlob() to access rows stored in it.  To filter the data you can use the $filter=<query expression> extension of the Query Entities API.

The metrics are stored on an hour granularity.  The time key represents the starting hour in which the requests were executed.   For example, the metrics with a key of 1:00, represent the requests that started between 1:00 and 2:00.  The metrics data is optimized to get historical data based on a time range. 
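
To illustrate the hourly key, this sketch maps a moment in time to the PartitionKey of the row that covers it and pairs it with the service summary RowKey for a point query. The function names are hypothetical, and the input is assumed to be a timezone-aware datetime.

```python
from datetime import datetime, timedelta, timezone


def transaction_partition_key(moment: datetime) -> str:
    """Floor a timezone-aware moment to its starting hour in UTC."""
    utc = moment.astimezone(timezone.utc)
    return utc.strftime("%Y%m%dT%H00")


def service_summary_keys(moment: datetime) -> tuple[str, str]:
    """(PartitionKey, RowKey) of the user service summary row for that hour."""
    return (transaction_partition_key(moment), "user;All")


# A request that started at 01:42 UTC is counted in the 01:00 row.
print(service_summary_keys(datetime(2011, 5, 20, 1, 42, tzinfo=timezone.utc)))
```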

Scenario: I want to retrieve capacity details for blob service starting from 2011/05/20 05:00 a.m. until 2011/05/30 05:00 a.m.

GET https://sally.table.core.windows.net/$MetricsCapacityBlob()?$filter=PartitionKey ge '20110520T0500' and PartitionKey le '20110530T0500'

Scenario: I want to view request details (including API details) for table service starting from 2011/05/20 05:00 a.m. until 2011/05/30 05:00 a.m., only for requests made to user data

GET https://sally.table.core.windows.net/$MetricsTransactionsTable()?$filter=PartitionKey ge '20110520T0500' and PartitionKey le '20110530T0500' and RowKey ge 'user;' and RowKey lt 'user<'
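
The second query relies on a RowKey range to select every key that starts with "user;". A minimal sketch of how such a filter could be assembled and URL-encoded is shown below; the helper names are hypothetical and authentication is omitted.

```python
from urllib.parse import quote


def user_rows_filter(start_key: str, end_key: str) -> str:
    """Filter for all 'user' rows whose PartitionKey lies in [start_key, end_key]."""
    # '<' is the character right after ';' in ASCII, so the half-open range
    # ['user;', 'user<') matches exactly the RowKeys beginning with 'user;'.
    upper = "user" + chr(ord(";") + 1)
    return (
        f"PartitionKey ge '{start_key}' and PartitionKey le '{end_key}' "
        f"and RowKey ge 'user;' and RowKey lt '{upper}'"
    )


def query_uri(account: str, table: str, odata_filter: str) -> str:
    """Query Entities URI with the filter percent-encoded."""
    return f"https://{account}.table.core.windows.net/{table}()?$filter={quote(odata_filter)}"


flt = user_rows_filter("20110520T0500", "20110530T0500")
print(flt)
print(query_uri("sally", "$MetricsTransactionsTable", flt))
```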

To collect trending data you can view historical information (up to the point of your retention policy) which gives you insights into usage, capacity, and other trends.

What charges occur due to metrics?

The billable operations listed below are charged at the same rates applicable to all Windows Azure Storage operations. For more information on how these transactions are billed, see Understanding Windows Azure Storage Billing - Bandwidth, Transactions, and Capacity.

The following actions performed by Windows Azure Storage are billable:

  • Write requests to create table entities for metrics

Read and delete requests by the application/client to metrics data are also billable. If you have configured a data retention policy, you are not charged when Windows Azure Storage deletes old metrics data. However, if you delete analytics data, your account is charged for the delete operations.

The capacity used by the $Metrics tables is billable.

The following can be used to estimate the amount of capacity used for storing metrics data:

  • If every API in every service is used each hour, then approximately 148KB of data may be stored every hour in the metrics transaction tables when both service-level and API-level summaries are enabled.
  • If every API in every service is used each hour, then approximately 12KB of data may be stored every hour in the metrics transaction tables when only the service-level summary is enabled.
  • The capacity table for blobs will have 2 rows each day (provided user has opted in for logs) and that implies that every day approximately up to 300 bytes may be added.
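
These estimates can be turned into a rough upper bound for a retention period. The sketch below assumes 1KB = 1024 bytes and that the hourly figures cover all three services together, which is how the bullets above read; both are assumptions, and the function name is hypothetical.

```python
KB = 1024  # assumption: the post's "KB" means 1024 bytes


def estimated_metrics_bytes(days: int, include_api_summary: bool = True) -> int:
    """Rough upper bound on metrics capacity accumulated over a number of days."""
    # 148KB per hour with API summaries, 12KB per hour with service summaries only.
    kb_per_hour = 148 if include_api_summary else 12
    transaction_bytes = days * 24 * kb_per_hour * KB
    # The blob capacity table adds up to roughly 300 bytes per day.
    capacity_bytes = days * 300
    return transaction_bytes + capacity_bytes


# Worst case for the maximum retention period of 365 days, in megabytes.
print(round(estimated_metrics_bytes(365) / (KB * KB), 1))
```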

Turning Metrics On

A REST API call, as shown below, is used to turn Analytics on for Metrics. In this example, logging is turned on for deletes and writes, but not for reads, with a seven-day retention policy for logs. Metrics are enabled at both the service and API level, and the metrics retention policy is enabled and set to ten days, so the analytics service will take care of deleting metrics data older than ten days for you at no additional cost.

Request:
PUT https://sally.blob.core.windows.net/?restype=service&comp=properties HTTP/1.1
x-ms-version: 2009-09-19
x-ms-date: Fri, 25 Mar 2011 23:13:08 GMT
Authorization: SharedKey sally:zvfPm5cm8CiVP2VA9oJGYpHAPKxQb1HD44IWmubot0A=
Host: sally.blob.core.windows.net

<?xml version="1.0" encoding="utf-8"?>
<StorageServiceProperties>

    <Logging>
        <Version>1.0</Version>
        <Delete>true</Delete>
        <Read>false</Read>
        <Write>true</Write>
        <RetentionPolicy>
            <Enabled>true</Enabled>
            <Days>7</Days>
        </RetentionPolicy>
    </Logging>

    <Metrics>
        <Version>1.0</Version>
        <Enabled>true</Enabled>
        <IncludeAPIs>true</IncludeAPIs>
        <RetentionPolicy>
            <Enabled>true</Enabled>
            <Days>10</Days>
        </RetentionPolicy>
    </Metrics>

</StorageServiceProperties>

Response:
HTTP/1.1 202 Accepted
Content-Length: 0
Server: Windows-Azure-Metrics/1.0 Microsoft-HTTPAPI/2.0
x-ms-request-id: 4d02712a-a68a-4763-8f9b-c2526625be68
x-ms-version: 2009-09-19
Date: Fri, 25 Mar 2011 23:13:08 GMT

The logging and metrics sections allow you to configure what you want to track in your analytics logs and metrics data.  The metrics configuration values are described here:

  • Version - The version of Analytics Logging used to record the entry.
  • Enabled (Boolean) – set to true if you want to track metrics data via analytics
  • IncludeAPIs (Boolean) – set to true if you want to track metrics for the individual APIs accessed
  • Retention policy – this is where you set the retention policy to help you manage the size of your analytics data
    • Enabled (Boolean) – set to true if you want to enable a retention policy. We recommend that you do this.
    • Days (int) – the number of days you want to keep your analytics data. This can be a maximum of 365 days and a minimum of 1 day
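
As a minimal sketch of how the metrics settings above map onto the request body, the following Python function builds the Metrics fragment with the standard library. The function name and its parameters are hypothetical. The sketch emits only the Metrics element, whereas the example request in this post also carries a Logging element. Omitting IncludeAPIs when metrics are disabled is an assumption.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def metrics_properties_xml(enabled: bool, include_apis: bool,
                           retention_days: Optional[int]) -> str:
    """Build a StorageServiceProperties document holding only a Metrics element."""
    if retention_days is not None and not 1 <= retention_days <= 365:
        raise ValueError("retention must be between 1 and 365 days")
    root = ET.Element("StorageServiceProperties")
    metrics = ET.SubElement(root, "Metrics")
    ET.SubElement(metrics, "Version").text = "1.0"
    ET.SubElement(metrics, "Enabled").text = "true" if enabled else "false"
    if enabled:
        # Assumption: IncludeAPIs is only meaningful while metrics are enabled.
        ET.SubElement(metrics, "IncludeAPIs").text = "true" if include_apis else "false"
    retention = ET.SubElement(metrics, "RetentionPolicy")
    ET.SubElement(retention, "Enabled").text = "true" if retention_days is not None else "false"
    if retention_days is not None:
        ET.SubElement(retention, "Days").text = str(retention_days)
    return ET.tostring(root, encoding="unicode")


print(metrics_properties_xml(enabled=True, include_apis=True, retention_days=10))
```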

For more information, please see the MSDN Documentation.

To turn on analytics, here are extensions to the StorageClient library’s CloudBlobClient, CloudTableClient and CloudQueueClient. The extension methods and utility methods that accomplish this are just sample code, in which error handling has been removed for brevity.

Let us start with some sample code that uses the extension methods:

AnalyticsSettings settings = new AnalyticsSettings()
{
    LogType = LoggingLevel.Delete | LoggingLevel.Read | LoggingLevel.Write,
    IsLogRetentionPolicyEnabled = true,
    LogRetentionInDays = 1,
    IsMetricsRetentionPolicyEnabled = true,
    MetricsRetentionInDays = 7,
    MetricsType = MetricsType.All
};

// set the settings for each service
blobClient.SetServiceSettings(settings);
queueClient.SetServiceSettings(account.QueueEndpoint, settings);
tableClient.SetServiceSettings(settings);

// get the settings from each service
AnalyticsSettings blobSettings = blobClient.GetServiceSettings();
AnalyticsSettings tableSettings = tableClient.GetServiceSettings();
AnalyticsSettings queueSettings = queueClient.GetServiceSettings(account.QueueEndpoint);

We have added here a new self-explanatory settings class called AnalyticsSettings to contain the settings that can be set or retrieved. Each setting used above has a corresponding property on the class.

 using System;

namespace AnalyticsSamples
{

    [Flags]
    public enum LoggingLevel {
        None = 0,
        Delete = 2,
        Write = 4,
        Read = 8,
    }

    [Flags]
    public enum MetricsType {
        None = 0x0,
        ServiceSummary = 0x1,
        ApiSummary = 0x2,
        All = ServiceSummary | ApiSummary,
    }

    /// <summary>
    /// The analytic settings that can set/get
    /// </summary>
    public class AnalyticsSettings {
        public static string Version = "1.0";

        public AnalyticsSettings()
        {
            this.LogType = LoggingLevel.None;
            this.LogVersion = AnalyticsSettings.Version;
            this.IsLogRetentionPolicyEnabled = false;
            this.LogRetentionInDays = 0;

            this.MetricsType = MetricsType.None;
            this.MetricsVersion = AnalyticsSettings.Version;
            this.IsMetricsRetentionPolicyEnabled = false;
            this.MetricsRetentionInDays = 0;
        }

        /// <summary>
        /// The type of logs subscribed for
        /// </summary>
        public LoggingLevel LogType { get; set; }

        /// <summary>
        /// The version of the logs
        /// </summary>
        public string LogVersion { get; set; }

        /// <summary>
        /// Flag indicating if retention policy is set for logs in $logs
        /// </summary>
        public bool IsLogRetentionPolicyEnabled { get; set; }

        /// <summary>
        /// The number of days to retain logs for under $logs container
        /// </summary>
        public int LogRetentionInDays { get; set; }

        /// <summary>
        /// The metrics version
        /// </summary>
        public string MetricsVersion { get; set; }

        /// <summary>
        /// A flag indicating if retention policy is enabled for metrics
        /// </summary>
        public bool IsMetricsRetentionPolicyEnabled { get; set; }

        /// <summary>
        /// The number of days to retain metrics data
        /// </summary>
        public int MetricsRetentionInDays { get; set; }

        private MetricsType metricsType = MetricsType.None;

        /// <summary>
        /// The type of metrics subscribed for
        /// </summary>
        public MetricsType MetricsType
        {
            get {
                return metricsType;
            }

            set {
                if (value == MetricsType.ApiSummary)
                {
                    throw new ArgumentException("Including just ApiSummary is invalid.");
                }

                this.metricsType = value;
            }
        }
    }
}
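
The two [Flags] enums above are tested with bitwise checks elsewhere in the sample. As a minimal sketch of how those checks behave, the snippet below restates both enums so that it compiles on its own. The FlagChecks class and its method names are hypothetical and not part of the sample; IsValid simply mirrors the rule enforced by the MetricsType setter.

```csharp
using System;

[Flags]
public enum LoggingLevel { None = 0, Delete = 2, Write = 4, Read = 8 }

[Flags]
public enum MetricsType { None = 0x0, ServiceSummary = 0x1, ApiSummary = 0x2, All = ServiceSummary | ApiSummary }

// Hypothetical helper, for illustration only.
public static class FlagChecks
{
    // True when the given operation type is part of the configured log types.
    public static bool Logs(LoggingLevel configured, LoggingLevel operation)
    {
        return (configured & operation) != LoggingLevel.None;
    }

    // Mirrors the MetricsType setter: ApiSummary on its own is rejected,
    // because per-API aggregates require the service summary to be enabled.
    public static bool IsValid(MetricsType type)
    {
        return type != MetricsType.ApiSummary;
    }
}
```

With this sketch, FlagChecks.Logs(LoggingLevel.Read | LoggingLevel.Write, LoggingLevel.Delete) is false, while FlagChecks.IsValid(MetricsType.All) is true.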

Now that we have covered the basic class, let us go over the extension class that provides the ability to set and get settings. This class provides the extension methods SetServiceSettings and GetServiceSettings on each of the client objects. Each extension method takes the settings and calls a single shared method that sends or retrieves them, serializing or deserializing the settings class as needed.

NOTE: Because CloudQueueClient does not expose the BaseUri property, the extension takes the base Uri explicitly.

 using System;
using System.Text;
using Microsoft.WindowsAzure.StorageClient;
using System.Globalization;
using System.Net;
using Microsoft.WindowsAzure;
using System.IO;
using System.Xml;

namespace AnalyticsSamples
{
    public static class AnalyticsSettingsExtension {
        static string RequestIdHeaderName = "x-ms-request-id";
        static string VersionHeaderName = "x-ms-version";
        static string Sep2009Version = "2009-09-19";
        static TimeSpan DefaultTimeout = TimeSpan.FromSeconds(30);

        #region Analytics
        /// <summary>
        /// Set blob analytics settings
        /// </summary>
        /// <param name="client"></param>
        /// <param name="settings"></param>
        public static void SetServiceSettings(this CloudBlobClient client, AnalyticsSettings settings)
        {
            SetSettings(client.BaseUri, client.Credentials, settings, false /* useSharedKeyLite */);
        }

        /// <summary>
        /// Set queue analytics settings
        /// </summary>
        /// <param name="client"></param>
        /// <param name="baseUri"></param>
        /// <param name="settings"></param>
        public static void SetServiceSettings(this CloudQueueClient client, Uri baseUri, AnalyticsSettings settings)
        {
            SetSettings(baseUri, client.Credentials, settings, false /* useSharedKeyLite */);
        }

        /// <summary>
        /// Set table analytics settings
        /// </summary>
        /// <param name="client"></param>
        /// <param name="settings"></param>
        public static void SetServiceSettings(this CloudTableClient client, AnalyticsSettings settings)
        {
            SetSettings(client.BaseUri, client.Credentials, settings, true /* useSharedKeyLite */);
        }

        /// <summary>
        /// Set analytics settings
        /// </summary>
        /// <param name="baseUri"></param>
        /// <param name="credentials"></param>
        /// <param name="settings"></param>
        /// <param name="useSharedKeyLite"></param>
        public static void SetSettings(Uri baseUri, StorageCredentials credentials, AnalyticsSettings settings, bool useSharedKeyLite)
        {
            UriBuilder builder = new UriBuilder(baseUri);
            builder.Query = string.Format(
                CultureInfo.InvariantCulture,
                "comp=properties&restype=service&timeout={0}", 
                DefaultTimeout.TotalSeconds);

            HttpWebRequest request = (HttpWebRequest)HttpWebRequest.Create(builder.Uri);
            request.Headers.Add(VersionHeaderName, Sep2009Version);
            request.Method = "PUT";

            StorageCredentialsAccountAndKey accountAndKey = credentials as StorageCredentialsAccountAndKey;
            using (MemoryStream buffer = new MemoryStream())
            {
                XmlTextWriter writer = new XmlTextWriter(buffer, Encoding.UTF8);
                SettingsSerializerHelper.SerializeAnalyticsSettings(writer, settings);
                writer.Flush();
                buffer.Seek(0, SeekOrigin.Begin);
                request.ContentLength = buffer.Length;

                if (useSharedKeyLite)
                {
                    credentials.SignRequestLite(request);
                }
                else {
                    credentials.SignRequest(request);
                }

                using (Stream stream = request.GetRequestStream())
                {
                    stream.Write(buffer.GetBuffer(), 0, (int)buffer.Length);
                }

                try {
                    using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
                    {
                        Console.WriteLine("Response Request Id = {0} Status={1}", response.Headers[RequestIdHeaderName], response.StatusCode);
                        if (HttpStatusCode.Accepted != response.StatusCode)
                        {
                            throw new Exception("Request failed with incorrect response status.");
                        }
                    }
                }
                catch (WebException e)
                {
                    Console.WriteLine(
                        "Response Request Id={0} Status={1}",
                        e.Response != null ? e.Response.Headers[RequestIdHeaderName] : "Response is null",
                        e.Status);
                    throw;
                }

            }
        }

        /// <summary>
        /// Get blob analytics settings
        /// </summary>
        /// <param name="client"></param>
        /// <returns></returns>
        public static AnalyticsSettings GetServiceSettings(this CloudBlobClient client)
        {
            return GetSettings(client.BaseUri, client.Credentials, false /* useSharedKeyLite */);
        }

        /// <summary>
        /// Get queue analytics settings
        /// </summary>
        /// <param name="client"></param>
        /// <param name="baseUri"></param>
        /// <returns></returns>
        public static AnalyticsSettings GetServiceSettings(this CloudQueueClient client, Uri baseUri)
        {
            return GetSettings(baseUri, client.Credentials, false /* useSharedKeyLite */);
        }

        /// <summary>
        /// Get table analytics settings
        /// </summary>
        /// <param name="client"></param>
        /// <returns></returns>
        public static AnalyticsSettings GetServiceSettings(this CloudTableClient client)
        {
            return GetSettings(client.BaseUri, client.Credentials, true /* useSharedKeyLite */);
        }

        /// <summary>
        /// Get analytics settings
        /// </summary>
        /// <param name="baseUri"></param>
        /// <param name="credentials"></param>
        /// <param name="useSharedKeyLite"></param>
        /// <returns></returns>
        public static AnalyticsSettings GetSettings(Uri baseUri, StorageCredentials credentials, bool useSharedKeyLite)
        {
            UriBuilder builder = new UriBuilder(baseUri);
            builder.Query = string.Format(
                CultureInfo.InvariantCulture,
                "comp=properties&restype=service&timeout={0}",
                DefaultTimeout.TotalSeconds);

            HttpWebRequest request = (HttpWebRequest)HttpWebRequest.Create(builder.Uri);
            request.Headers.Add(VersionHeaderName, Sep2009Version);
            request.Method = "GET";

            StorageCredentialsAccountAndKey accountAndKey = credentials as StorageCredentialsAccountAndKey;

            if (useSharedKeyLite)
            {
                credentials.SignRequestLite(request);
            }
            else {
                credentials.SignRequest(request);
            }

            try {
                using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
                {
                    Console.WriteLine("Response Request Id={0} Status={1}", response.Headers[RequestIdHeaderName], response.StatusCode);

                    if (HttpStatusCode.OK != response.StatusCode)
                    {
                        throw new Exception("expected HttpStatusCode.OK");
                    }

                    using (Stream stream = response.GetResponseStream())
                    {
                        using (StreamReader streamReader = new StreamReader(stream))
                        {
                            string responseString = streamReader.ReadToEnd();
                            Console.WriteLine(responseString);

                            XmlReader reader = XmlReader.Create(new MemoryStream(ASCIIEncoding.UTF8.GetBytes(responseString)));
                            return SettingsSerializerHelper.DeserializeAnalyticsSettings(reader);
                        }
                    }
                }
            }
            catch (WebException e)
            {
                Console.WriteLine(
                    "Response Request Id={0} Status={1}",
                    e.Response != null ? e.Response.Headers[RequestIdHeaderName] : "Response is null",
                    e.Status);
                throw;
            }
        }
        #endregion
    }
}

Now to the crux of the code, which handles serialization and deserialization. It provides a SerializeAnalyticsSettings method that serializes the AnalyticsSettings class into the format expected by the service, and a DeserializeAnalyticsSettings method that reconstructs the AnalyticsSettings class from the response to the GET REST method.

 using System.Xml;

namespace AnalyticsSamples
{
    public static class SettingsSerializerHelper {
        private const string RootPropertiesElementName = "StorageServiceProperties";
        private const string VersionElementName = "Version";
        private const string RetentionPolicyElementName = "RetentionPolicy";
        private const string RetentionPolicyEnabledElementName = "Enabled";
        private const string RetentionPolicyDaysElementName = "Days";

        private const string LoggingElementName = "Logging";
        private const string ApiTypeDeleteElementName = "Delete";
        private const string ApiTypeReadElementName = "Read";
        private const string ApiTypeWriteElementName = "Write";

        private const string MetricsElementName = "Metrics";
        private const string IncludeApiSummaryElementName = "IncludeAPIs";
        private const string MetricsEnabledElementName = "Enabled";

        private const int MaximumRetentionDays = 365;

        /// <summary>
        /// Reads the settings provided from stream
        /// </summary>
        /// <param name="xmlReader"></param>
        /// <returns></returns>
        public static AnalyticsSettings DeserializeAnalyticsSettings(XmlReader xmlReader)
        {
            // Read the root and check if it is empty or invalid
            xmlReader.Read();
            xmlReader.ReadStartElement(SettingsSerializerHelper.RootPropertiesElementName);

            AnalyticsSettings settings = new AnalyticsSettings();

            while (true)
            {
                if (xmlReader.IsStartElement(SettingsSerializerHelper.LoggingElementName))
                {
                    DeserializeLoggingElement(xmlReader, settings);
                }
                else if (xmlReader.IsStartElement(SettingsSerializerHelper.MetricsElementName))
                {
                    DeserializeMetricsElement(xmlReader, settings);
                }
                else {
                    break;
                }
            }

            xmlReader.ReadEndElement();

            return settings;
        }


        /// <summary>
        /// Write the settings provided to stream
        /// </summary>
        /// <param name="xmlWriter"></param>
        /// <param name="settings"></param>
        public static void SerializeAnalyticsSettings(XmlWriter xmlWriter, AnalyticsSettings settings)
        {
            xmlWriter.WriteStartDocument();
            xmlWriter.WriteStartElement(SettingsSerializerHelper.RootPropertiesElementName);

            //LOGGING STARTS HERE
            xmlWriter.WriteStartElement(SettingsSerializerHelper.LoggingElementName);

            xmlWriter.WriteStartElement(SettingsSerializerHelper.VersionElementName);
            xmlWriter.WriteValue(settings.LogVersion);
            xmlWriter.WriteEndElement();

            bool isReadEnabled = (settings.LogType & LoggingLevel.Read) != LoggingLevel.None;
            xmlWriter.WriteStartElement(SettingsSerializerHelper.ApiTypeReadElementName);
            xmlWriter.WriteValue(isReadEnabled);
            xmlWriter.WriteEndElement();

            bool isWriteEnabled = (settings.LogType & LoggingLevel.Write) != LoggingLevel.None;
            xmlWriter.WriteStartElement(SettingsSerializerHelper.ApiTypeWriteElementName);
            xmlWriter.WriteValue(isWriteEnabled);
            xmlWriter.WriteEndElement();

            bool isDeleteEnabled = (settings.LogType & LoggingLevel.Delete) != LoggingLevel.None;
            xmlWriter.WriteStartElement(SettingsSerializerHelper.ApiTypeDeleteElementName);
            xmlWriter.WriteValue(isDeleteEnabled);
            xmlWriter.WriteEndElement();

            SerializeRetentionPolicy(xmlWriter, settings.IsLogRetentionPolicyEnabled, settings.LogRetentionInDays);
            xmlWriter.WriteEndElement(); // logging element

            //METRICS STARTS HERE
            xmlWriter.WriteStartElement(SettingsSerializerHelper.MetricsElementName);

            xmlWriter.WriteStartElement(SettingsSerializerHelper.VersionElementName);
            xmlWriter.WriteValue(settings.MetricsVersion);
            xmlWriter.WriteEndElement();

            bool isServiceSummaryEnabled = (settings.MetricsType & MetricsType.ServiceSummary) != MetricsType.None;
            xmlWriter.WriteStartElement(SettingsSerializerHelper.MetricsEnabledElementName);
            xmlWriter.WriteValue(isServiceSummaryEnabled);
            xmlWriter.WriteEndElement();

            if (isServiceSummaryEnabled)
            {
                bool isApiSummaryEnabled = (settings.MetricsType & MetricsType.ApiSummary) != MetricsType.None;
                xmlWriter.WriteStartElement(SettingsSerializerHelper.IncludeApiSummaryElementName);
                xmlWriter.WriteValue(isApiSummaryEnabled);
                xmlWriter.WriteEndElement();
            }

            SerializeRetentionPolicy(
                xmlWriter,
                settings.IsMetricsRetentionPolicyEnabled,
                settings.MetricsRetentionInDays);
            xmlWriter.WriteEndElement(); // metrics
            xmlWriter.WriteEndElement(); // root element
            xmlWriter.WriteEndDocument();
        }

        private static void SerializeRetentionPolicy(XmlWriter xmlWriter, bool isRetentionEnabled, int days)
        {
            xmlWriter.WriteStartElement(SettingsSerializerHelper.RetentionPolicyElementName);

            xmlWriter.WriteStartElement(SettingsSerializerHelper.RetentionPolicyEnabledElementName);
            xmlWriter.WriteValue(isRetentionEnabled);
            xmlWriter.WriteEndElement();

            if (isRetentionEnabled)
            {
                xmlWriter.WriteStartElement(SettingsSerializerHelper.RetentionPolicyDaysElementName);
                xmlWriter.WriteValue(days);
                xmlWriter.WriteEndElement();
            }

            xmlWriter.WriteEndElement(); // Retention policy for logs
        }

        /// <summary>
        /// Reads the logging element and fills in the values in Analyticssettings instance
        /// </summary>
        /// <param name="xmlReader"></param>
        /// <param name="settings"></param>
        private static void DeserializeLoggingElement(
            XmlReader xmlReader,
            AnalyticsSettings settings)
        {
            // Read logging element
            xmlReader.ReadStartElement(SettingsSerializerHelper.LoggingElementName);

            while (true)
            {
                if (xmlReader.IsStartElement(SettingsSerializerHelper.VersionElementName))
                {
                    settings.LogVersion = xmlReader.ReadElementString(SettingsSerializerHelper.VersionElementName);
                }
                else if (xmlReader.IsStartElement(SettingsSerializerHelper.ApiTypeReadElementName))
                {
                    if (DeserializeBooleanElementValue(
                        xmlReader,
                        SettingsSerializerHelper.ApiTypeReadElementName))
                    {
                        settings.LogType = settings.LogType | LoggingLevel.Read;
                    }
                }
                else if (xmlReader.IsStartElement(SettingsSerializerHelper.ApiTypeWriteElementName))
                {
                    if (DeserializeBooleanElementValue(
                        xmlReader,
                        SettingsSerializerHelper.ApiTypeWriteElementName))
                    {
                        settings.LogType = settings.LogType | LoggingLevel.Write;
                    }
                }
                else if (xmlReader.IsStartElement(SettingsSerializerHelper.ApiTypeDeleteElementName))
                {
                    if (DeserializeBooleanElementValue(
                        xmlReader,
                        SettingsSerializerHelper.ApiTypeDeleteElementName))
                    {
                        settings.LogType = settings.LogType | LoggingLevel.Delete;
                    }
                }
                else if (xmlReader.IsStartElement(SettingsSerializerHelper.RetentionPolicyElementName))
                {
                    // read retention policy for logging
                    bool isRetentionEnabled = false;
                    int retentionDays = 0;
                    DeserializeRetentionPolicy(xmlReader, ref isRetentionEnabled, ref retentionDays);
                    settings.IsLogRetentionPolicyEnabled = isRetentionEnabled;
                    settings.LogRetentionInDays = retentionDays;
                }
                else {
                    break;
                }
            }

            xmlReader.ReadEndElement();// end Logging element
        }

        /// <summary>
        /// Reads the metrics element and fills in the values in Analyticssettings instance
        /// </summary>
        /// <param name="xmlReader"></param>
        /// <param name="settings"></param>
        private static void DeserializeMetricsElement(
            XmlReader xmlReader,
            AnalyticsSettings settings)
        {
            bool includeAPIs = false;

            // read the next element - it should be metrics.
            xmlReader.ReadStartElement(SettingsSerializerHelper.MetricsElementName);

            while (true)
            {
                if (xmlReader.IsStartElement(SettingsSerializerHelper.VersionElementName))
                {
                    settings.MetricsVersion = xmlReader.ReadElementString(SettingsSerializerHelper.VersionElementName);
                }
                else if (xmlReader.IsStartElement(SettingsSerializerHelper.MetricsEnabledElementName))
                {
                    if (DeserializeBooleanElementValue(
                        xmlReader,
                        SettingsSerializerHelper.MetricsEnabledElementName))
                    {
                        // only if metrics is enabled will we read include API
                        settings.MetricsType = settings.MetricsType | MetricsType.ServiceSummary;
                    }
                }
                else if (xmlReader.IsStartElement(SettingsSerializerHelper.IncludeApiSummaryElementName))
                {
                    if (DeserializeBooleanElementValue(
                        xmlReader,
                        SettingsSerializerHelper.IncludeApiSummaryElementName))
                    {
                        includeAPIs = true;
                    }
                }
                else if (xmlReader.IsStartElement(SettingsSerializerHelper.RetentionPolicyElementName))
                {
                    // read retention policy for metrics
                    bool isRetentionEnabled = false;
                    int retentionDays = 0;
                    DeserializeRetentionPolicy(xmlReader, ref isRetentionEnabled, ref retentionDays);
                    settings.IsMetricsRetentionPolicyEnabled = isRetentionEnabled;
                    settings.MetricsRetentionInDays = retentionDays;
                }
                else {
                    break;
                }
            }

            if ((settings.MetricsType & MetricsType.ServiceSummary) != MetricsType.None)
            {
                // If Metrics is enabled, IncludeAPIs must be included.
                if (includeAPIs)
                {
                    settings.MetricsType = settings.MetricsType | MetricsType.ApiSummary;
                }
            }

            xmlReader.ReadEndElement();// end metrics element
        }


        /// <summary>
        /// Reads the retention policy in logging and metrics elements
        /// and fills in the values in Analyticssettings instance.
        /// </summary>
        /// <param name="xmlReader"></param>
        /// <param name="isRetentionEnabled"></param>
        /// <param name="retentionDays"></param>
        private static void DeserializeRetentionPolicy(
            XmlReader xmlReader,
            ref bool isRetentionEnabled,
            ref int retentionDays)
        {
            xmlReader.ReadStartElement(SettingsSerializerHelper.RetentionPolicyElementName);

            while (true)
            {
                if (xmlReader.IsStartElement(SettingsSerializerHelper.RetentionPolicyEnabledElementName))
                {
                    isRetentionEnabled = DeserializeBooleanElementValue(
                        xmlReader,
                        SettingsSerializerHelper.RetentionPolicyEnabledElementName);
                }
                else if (xmlReader.IsStartElement(SettingsSerializerHelper.RetentionPolicyDaysElementName))
                {
                    string intValue = xmlReader.ReadElementString(
                        SettingsSerializerHelper.RetentionPolicyDaysElementName);
                    retentionDays = int.Parse(intValue);
                }
                else {
                    break;
                }
            }

            xmlReader.ReadEndElement(); // end reading retention policy
        }

        /// <summary>
        /// Read a boolean value for xml element
        /// </summary>
        /// <param name="xmlReader"></param>
        /// <param name="elementToRead"></param>
        /// <returns></returns>
        private static bool DeserializeBooleanElementValue(
            XmlReader xmlReader,
            string elementToRead)
        {
            string boolValue = xmlReader.ReadElementString(elementToRead);
            return bool.Parse(boolValue);
        }
    }
}
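
To make the wire format concrete, here is roughly what SerializeAnalyticsSettings would write for the sample settings used earlier (all log types, 1-day log retention, MetricsType.All, 7-day metrics retention). This is derived by reading the serializer code above and was not captured from a live request. The element order follows the order of the writer calls, and the indentation is added here only for readability, since XmlTextWriter does not indent by default.

```xml
<?xml version="1.0" encoding="utf-8"?>
<StorageServiceProperties>
  <Logging>
    <Version>1.0</Version>
    <Read>true</Read>
    <Write>true</Write>
    <Delete>true</Delete>
    <RetentionPolicy>
      <Enabled>true</Enabled>
      <Days>1</Days>
    </RetentionPolicy>
  </Logging>
  <Metrics>
    <Version>1.0</Version>
    <Enabled>true</Enabled>
    <IncludeAPIs>true</IncludeAPIs>
    <RetentionPolicy>
      <Enabled>true</Enabled>
      <Days>7</Days>
    </RetentionPolicy>
  </Metrics>
</StorageServiceProperties>
```

Note that the serializer omits the Days element when a retention policy is disabled, and omits IncludeAPIs when metrics are disabled.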

 

Download Metrics Data

Because listing tables does not return the metrics tables, existing tools cannot display them. Until tools support them, we wanted to provide a quick reference application, with source code, to make this data accessible.

The following application takes the service to download metrics for, whether to download capacity or request aggregates, a file to export to, and the start and end times. It then exports all metric entities from the selected table to the file in CSV format. The CSV file can then be opened in, say, Excel to study trends in availability, errors seen by the application, latency, and so on.

For example, the following command downloads all entities from the $MetricsTransactionsBlob table for the blob service within the given time range into a file called MyMetrics.txt:

DumpMetrics blob requests .\MyMetrics.txt "2011-07-26T22:00Z" "2011-07-26T23:30Z"

 const string ConnectionStringKey = "ConnectionString";

static void Main(string[] args)
{
    if (args.Length < 4 || args.Length > 5)
    {
        Console.WriteLine("Usage: DumpMetrics <service to search - blob|table|queue> <capacity|requests> <file name to export to> <Start time in UTC for report> <Optional End time in UTC for report>.");
        Console.WriteLine("Example: DumpMetrics blob capacity test.txt \"2011-06-26T20:30Z\" \"2011-06-28T22:00Z\"");
        return;
    }

    string connectionString = ConfigurationManager.AppSettings[ConnectionStringKey];

    CloudStorageAccount account = CloudStorageAccount.Parse(connectionString);

    CloudTableClient tableClient = account.CreateCloudTableClient();

    DateTime startTimeOfSearch = DateTime.Parse(args[3]);
    DateTime endTimeOfSearch = DateTime.UtcNow;

    if (args.Length == 5)
    {
        endTimeOfSearch = DateTime.Parse(args[4]);
    }

    if (string.Equals(args[1], "requests", StringComparison.OrdinalIgnoreCase))
    {
        DumpTransactionsMetrics(tableClient, args[0], startTimeOfSearch.ToUniversalTime(), endTimeOfSearch.ToUniversalTime(), args[2]);
    }
    else if (string.Equals(args[1], "capacity", StringComparison.OrdinalIgnoreCase) && string.Equals(args[0], "Blob", StringComparison.OrdinalIgnoreCase))
    {
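        // convert to UTC first, as for requests, so the daily PartitionKey is computed from the UTC date
        DumpCapacityMetrics(tableClient, args[0], startTimeOfSearch.ToUniversalTime(), endTimeOfSearch.ToUniversalTime(), args[2]);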
    }
    else {
        Console.WriteLine("Invalid metrics type. Please provide Requests or Capacity. Capacity is available only for blob service");
    }
}


/// <summary>
/// Given a service, start time, end time search for, this method retrieves the metrics rows for requests and exports it to CSV file
/// </summary>
/// <param name="tableClient"></param>
/// <param name="serviceName">The name of the service interested in</param>
/// <param name="startTimeForSearch">Start time for reporting</param>
/// <param name="endTimeForSearch">End time for reporting</param>
/// <param name="fileName">The file to write the report to</param>
static void DumpTransactionsMetrics(CloudTableClient tableClient, string serviceName, DateTime startTimeForSearch, DateTime endTimeForSearch, string fileName)
{
    string startingPK = startTimeForSearch.ToString("yyyyMMddTHH00");
    string endingPK = endTimeForSearch.ToString("yyyyMMddTHH00");

    // table names are case insensitive
    string tableName = string.Format("$MetricsTransactions{0}", serviceName);

    Console.WriteLine("Querying table '{0}' for PartitionKey >= '{1}' and PartitionKey <= '{2}'", tableName, startingPK, endingPK);

    TableServiceContext context = tableClient.GetDataServiceContext();

    // turn off merge option as we only want to query and not issue deletes etc.
    context.MergeOption = MergeOption.NoTracking;

    CloudTableQuery<MetricsTransactionsEntity> query = (from entity in context.CreateQuery<MetricsTransactionsEntity>(tableName)
                                            where entity.PartitionKey.CompareTo(startingPK) >= 0
                                            && entity.PartitionKey.CompareTo(endingPK) <= 0
                                            select entity).AsTableServiceQuery<MetricsTransactionsEntity>();

    // now we have the query set. Let us iterate over all entities and store into an output file.
    // Also overwrite the file
    Console.WriteLine("Writing to '{0}'", fileName);
    using (StreamWriter writer = new StreamWriter(fileName))
    {
        // write the header
        writer.Write("Time, Category, Request Type, Total Ingress, Total Egress, Total Requests, Total Billable Requests,");
        writer.Write("Availability, Avg E2E Latency, Avg Server Latency, % Success, % Throttling Errors, % Timeout Errors, % Other Server Errors, % Other Client Errors, % Authorization Errors, % Network Errors, Success,");
        writer.Write("Anonymous Success, SAS Success, Throttling Error, Anonymous Throttling Error, SAS ThrottlingError, Client Timeout Error, Anonymous Client Timeout Error, SAS Client Timeout Error,");
        writer.Write("Server Timeout Error, Anonymous Server Timeout Error, SAS Server Timeout Error, Client Other Error, SAS Client Other Error, Anonymous Client Other Error,");
        writer.Write("Server Other Errors, SAS Server Other Errors, Anonymous Server Other Errors, Authorization Errors, Anonymous Authorization Error, SAS Authorization Error,");
        writer.WriteLine("Network Error, Anonymous Network Error, SAS Network Error");

        foreach (MetricsTransactionsEntity entity in query)
        {
            string[] rowKeys = entity.RowKey.Split(';');
            writer.WriteLine("{0}, {1}, {2}, {3}, {4}, {5}, {6}, {7}, {8}, {9}, {10}, {11}, {12}, {13}, {14}, {15}, {16}, {17}, {18}, {19}, {20}, {21}, {22}, {23}, {24}, {25}, {26}, {27}, {28}, {29}, {30}, {31}, {32}, {33}, {34}, {35}, {36}, {37}, {38}, {39}, {40}",
                entity.PartitionKey,
                rowKeys[0], // category - user | system
                rowKeys[1], // request type is the API name (and "All" for service summary rows)
                entity.TotalIngress,
                entity.TotalEgress,
                entity.TotalRequests,
                entity.TotalBillableRequests,
                entity.Availability,
                entity.AverageE2ELatency,
                entity.AverageServerLatency,
                entity.PercentSuccess,
                entity.PercentThrottlingError,
                entity.PercentTimeoutError,
                entity.PercentServerOtherError,
                entity.PercentClientOtherError,
                entity.PercentAuthorizationError,
                entity.PercentNetworkError,
                entity.Success,
                entity.AnonymousSuccess,
                entity.SASSuccess,
                entity.ThrottlingError,
                entity.AnonymousThrottlingError,
                entity.SASThrottlingError,
                entity.ClientTimeoutError,
                entity.AnonymousClientTimeoutError,
                entity.SASClientTimeoutError,
                entity.ServerTimeoutError,
                entity.AnonymousServerTimeoutError,
                entity.SASServerTimeoutError,
                entity.ClientOtherError,
                entity.SASClientOtherError,
                entity.AnonymousClientOtherError,
                entity.ServerOtherError,
                entity.SASServerOtherError,
                entity.AnonymousServerOtherError,
                entity.AuthorizationError,
                entity.AnonymousAuthorizationError,
                entity.SASAuthorizationError,
                entity.NetworkError,
                entity.AnonymousNetworkError,
                entity.SASNetworkError);
        }
    }
}

/// <summary>
/// Given a service, start time, end time search for, this method retrieves the metrics rows for capacity and exports it to CSV file
/// </summary>
/// <param name="tableClient"></param>
/// <param name="serviceName">The name of the service interested in</param>
/// <param name="startTimeForSearch">Start time for reporting</param>
/// <param name="endTimeForSearch">End time for reporting</param>
/// <param name="fileName">The file to write the report to</param>
static void DumpCapacityMetrics(CloudTableClient tableClient, string serviceName, DateTime startTimeForSearch, DateTime endTimeForSearch, string fileName)
{
    string startingPK = startTimeForSearch.ToString("yyyyMMddT0000");
    string endingPK = endTimeForSearch.ToString("yyyyMMddT0000");

    // table names are case insensitive
    string tableName = string.Format("$MetricsCapacity{0}", serviceName);

    Console.WriteLine("Querying table '{0}' for PartitionKey >= '{1}' and PartitionKey <= '{2}'", tableName, startingPK, endingPK);

    TableServiceContext context = tableClient.GetDataServiceContext();

    // turn off merge option as we only want to query and not issue deletes etc.
    context.MergeOption = MergeOption.NoTracking;

    CloudTableQuery<MetricsCapacityEntity> query = (from entity in context.CreateQuery<MetricsCapacityEntity>(tableName)
                                                    where entity.PartitionKey.CompareTo(startingPK) >= 0
                                                    && entity.PartitionKey.CompareTo(endingPK) <= 0
                                                    select entity).AsTableServiceQuery<MetricsCapacityEntity>();

    // now we have the query set. Let us iterate over all entities and store into an output file.
    // Also overwrite the file
    Console.WriteLine("Writing to '{0}'", fileName);
    using (StreamWriter writer = new StreamWriter(fileName))
    {
        // write the header
        writer.WriteLine("Time, Category, Capacity (bytes), Container count, Object count");

        foreach (MetricsCapacityEntity entity in query)
        {
            writer.WriteLine("{0}, {1}, {2}, {3}, {4}",
                entity.PartitionKey,
                entity.RowKey,
                entity.Capacity,
                entity.ContainerCount,
                entity.ObjectCount);
        }
    }
}
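
Since the capacity rows are written once per day, a natural next step is to look at how capacity changes from one day to the next. The following is a minimal sketch of that calculation, not part of the original sample. It assumes the (PartitionKey, Capacity) pairs have already been read for a single RowKey and are in ascending PartitionKey order; the class name CapacityGrowth, the method name Deltas and the sample values are hypothetical.

```csharp
using System;
using System.Collections.Generic;

static class CapacityGrowth
{
    // Returns, for each sample after the first, the change in bytes since the previous sample.
    // Input: (PartitionKey, Capacity) pairs for one RowKey, in ascending PartitionKey order.
    public static List<KeyValuePair<string, long>> Deltas(IList<KeyValuePair<string, long>> samples)
    {
        var deltas = new List<KeyValuePair<string, long>>();
        for (int i = 1; i < samples.Count; i++)
        {
            long change = samples[i].Value - samples[i - 1].Value;
            deltas.Add(new KeyValuePair<string, long>(samples[i].Key, change));
        }
        return deltas;
    }

    static void Main()
    {
        // Made-up sample values, for illustration only.
        var samples = new List<KeyValuePair<string, long>>
        {
            new KeyValuePair<string, long>("20110626T0000", 1000),
            new KeyValuePair<string, long>("20110627T0000", 1500),
            new KeyValuePair<string, long>("20110628T0000", 1300),
        };

        foreach (KeyValuePair<string, long> delta in Deltas(samples))
        {
            Console.WriteLine("{0}, {1}", delta.Key, delta.Value);
        }
    }
}
```

A negative value simply means that less data was stored on that day than on the previous one.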

The definitions for entities used are:

[DataServiceKey("PartitionKey", "RowKey")]
public class MetricsCapacityEntity {
    public string PartitionKey { get; set; }
    public string RowKey { get; set; }
    public long Capacity { get; set; }
    public long ContainerCount { get; set; }
    public long ObjectCount { get; set; }
}

[DataServiceKey("PartitionKey", "RowKey")]
public class MetricsTransactionsEntity {
    public string PartitionKey { get; set; }
    public string RowKey { get; set; }
    public long TotalIngress { get; set; }
    public long TotalEgress { get; set; }
    public long TotalRequests { get; set; }
    public long TotalBillableRequests { get; set; }
    public double Availability { get; set; }
    public double AverageE2ELatency { get; set; }
    public double AverageServerLatency { get; set; }
    public double PercentSuccess { get; set; }
    public double PercentThrottlingError { get; set; }
    public double PercentTimeoutError { get; set; }
    public double PercentServerOtherError { get; set; }
    public double PercentClientOtherError { get; set; }
    public double PercentAuthorizationError { get; set; }
    public double PercentNetworkError { get; set; }
    public long Success { get; set; }
    public long AnonymousSuccess { get; set; }
    public long SASSuccess { get; set; }
    public long ThrottlingError { get; set; }
    public long AnonymousThrottlingError { get; set; }
    public long SASThrottlingError { get; set; }
    public long ClientTimeoutError { get; set; }
    public long AnonymousClientTimeoutError { get; set; }
    public long SASClientTimeoutError { get; set; }
    public long ServerTimeoutError { get; set; }
    public long AnonymousServerTimeoutError { get; set; }
    public long SASServerTimeoutError { get; set; }
    public long ClientOtherError { get; set; }
    public long AnonymousClientOtherError { get; set; }
    public long SASClientOtherError { get; set; }
    public long ServerOtherError { get; set; }
    public long AnonymousServerOtherError { get; set; }
    public long SASServerOtherError { get; set; }
    public long AuthorizationError { get; set; }
    public long AnonymousAuthorizationError { get; set; }
    public long SASAuthorizationError { get; set; }
    public long NetworkError { get; set; }
    public long AnonymousNetworkError { get; set; }
    public long SASNetworkError { get; set; }
}

Case Study

Let us walk through a sample scenario that shows how the metrics data can be used. As a Windows Azure Storage customer, I would like to know how my service is doing and see the request trend between any two points in time.

Description: The following is a console program that takes as input the name of the service to retrieve the data for, the start time for the report, the end time for the report, and the name of the file to export the data to in CSV format.

We will start with the method “ExportMetrics”. The method uses the time range provided as input arguments to create a query filter. Since the PartitionKey has the format “YYYYMMDDTHH00”, we create the starting and ending PartitionKey filters from the start and end times. The table name is $MetricsTransactions followed by the name of the service to search for. Once we have these parameters, it is as simple as creating a normal table query using the TableServiceContext. We use the AsTableServiceQuery extension because it takes care of continuation tokens. The other important optimization is that we turn off tracking (MergeOption.NoTracking), so the context does not track the entities returned in the query response. We can do this here since the query is used solely for retrieval rather than for subsequent operations like Delete on these entities. The class used to represent each row in the response is MetricsEntity, and its definition is given below. It is a plain C# class, except for the DataServiceKey attribute required by WCF Data Services for .NET, and it has only the subset of properties that we are interested in.
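
The PartitionKey filter works because keys in this format sort as strings in the same order as the hours they represent. The following is a small sketch of that idea, independent of the storage client library: it builds a key from a UTC time, parses it back, and compares two keys ordinally. The class and method names (MetricsPartitionKey, FromUtc, ToUtc) are hypothetical helpers, not part of any SDK.

```csharp
using System;
using System.Globalization;

static class MetricsPartitionKey
{
    // Format of the hourly transaction table keys, e.g. "20110626T2000".
    const string HourlyFormat = "yyyyMMdd'T'HH'00'";

    // Builds the PartitionKey for the hour that contains the given UTC time.
    public static string FromUtc(DateTime utc)
    {
        return utc.ToString(HourlyFormat, CultureInfo.InvariantCulture);
    }

    // Parses a PartitionKey back into the UTC start of its hour.
    public static DateTime ToUtc(string partitionKey)
    {
        return DateTime.ParseExact(partitionKey, HourlyFormat, CultureInfo.InvariantCulture,
            DateTimeStyles.AssumeUniversal | DateTimeStyles.AdjustToUniversal);
    }

    static void Main()
    {
        DateTime start = new DateTime(2011, 6, 26, 20, 30, 0, DateTimeKind.Utc);
        DateTime end = new DateTime(2011, 6, 28, 22, 0, 0, DateTimeKind.Utc);

        string startingPK = FromUtc(start); // "20110626T2000"
        string endingPK = FromUtc(end);     // "20110628T2200"

        // Ordinal string order matches chronological order for this fixed-width format.
        Console.WriteLine(string.CompareOrdinal(startingPK, endingPK) < 0); // True
        Console.WriteLine(ToUtc(startingPK) < ToUtc(endingPK));             // True
    }
}
```

Note that the minutes are dropped when the key is built, so a start time of 20:30 selects the whole 20:00 hour.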

Once we have the query, all we do is iterate over it, which results in executing the query. Behind the scenes, this CloudTableQuery instance may lazily execute multiple queries if needed to handle continuation tokens. We then write the results in CSV format, but one can imagine importing them into an Azure table or SQL for richer reporting.

NOTE: Exception handling and parameter validation are omitted for brevity.

/// <summary>
/// Given a service name, start time, end time search for, this method retrieves the metrics rows for requests
/// and exports it to CSV file
/// </summary>
/// <param name="tableClient"></param>
/// <param name="serviceName">The name of the service interested in</param>
/// <param name="startTimeForSearch">Start time for reporting</param>
/// <param name="endTimeForSearch">End time for reporting</param>
/// <param name="fileName">The file to write the report to</param>
static void ExportMetrics(CloudTableClient tableClient, string serviceName, DateTime startTimeForSearch, DateTime endTimeForSearch, string fileName)
{
    string startingPK = startTimeForSearch.ToString("yyyyMMddTHH00");
    string endingPK = endTimeForSearch.ToString("yyyyMMddTHH00");

    // table names are case insensitive
    string tableName = "$MetricsTransactions" + serviceName;

    Console.WriteLine("Querying table '{0}' for PartitionKey >= '{1}' and PartitionKey <= '{2}'", tableName, startingPK, endingPK);

    TableServiceContext context = tableClient.GetDataServiceContext();

    // turn off merge option as we only want to query and not issue deletes etc.
    context.MergeOption = MergeOption.NoTracking;

    CloudTableQuery<MetricsEntity> query = (from entity in context.CreateQuery<MetricsEntity>(tableName)
                                            where entity.PartitionKey.CompareTo(startingPK) >= 0
                                            && entity.PartitionKey.CompareTo(endingPK) <= 0
                                            select entity).AsTableServiceQuery<MetricsEntity>();

    // now we have the query set. Let us iterate over all entities and store into an output file.
    // Also overwrite the file
    using (Stream stream = new FileStream(fileName, FileMode.Create, FileAccess.ReadWrite))
    {
        using (StreamWriter writer = new StreamWriter(stream))
        {
            // write the header
            writer.WriteLine("Time, Category, Request Type, Total Ingress, Total Egress, Total Requests, Total Billable Requests, Availability, Avg E2E Latency, Avg Server Latency, % Success, % Throttling, % Timeout, % Misc. Server Errors, % Misc. Client Errors, % Authorization Errors, % Network Errors");

            foreach (MetricsEntity entity in query)
            {
                string[] rowKeys = entity.RowKey.Split(';');
                writer.WriteLine("{0}, {1}, {2}, {3}, {4}, {5}, {6}, {7}, {8}, {9}, {10}, {11}, {12}, {13}, {14}, {15}, {16}",
                    entity.PartitionKey,
                    rowKeys[0], // category - user | system
                    rowKeys[1], // request type is the API name (and "All" for service summary rows)
                    entity.TotalIngress,
                    entity.TotalEgress,
                    entity.TotalRequests,
                    entity.TotalBillableRequests,
                    entity.Availability,
                    entity.AverageE2ELatency,
                    entity.AverageServerLatency,
                    entity.PercentSuccess,
                    entity.PercentThrottlingError,
                    entity.PercentTimeoutError,
                    entity.PercentServerOtherError,
                    entity.PercentClientOtherError,
                    entity.PercentAuthorizationError,
                    entity.PercentNetworkError);
            }
        }
    }
}

[DataServiceKey("PartitionKey", "RowKey")]
public class MetricsEntity
{
    public string PartitionKey { get; set; }
    public string RowKey { get; set; }
    public long TotalIngress { get; set; }
    public long TotalEgress { get; set; }
    public long TotalRequests { get; set; }
    public long TotalBillableRequests { get; set; }
    public double Availability { get; set; }
    public double AverageE2ELatency { get; set; }
    public double AverageServerLatency { get; set; }
    public double PercentSuccess { get; set; }
    public double PercentThrottlingError { get; set; }
    public double PercentTimeoutError { get; set; }
    public double PercentServerOtherError { get; set; }
    public double PercentClientOtherError { get; set; }
    public double PercentAuthorizationError { get; set; }
    public double PercentNetworkError { get; set; }
}
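
ExportMetrics splits the RowKey on ';' and indexes the result directly, which throws if a key does not have the expected two parts. Below is a minimal, more defensive sketch of the same split. It assumes only what the comments in the listing state: the first part is the category (user or system) and the second is the request type, with "All" marking the service summary rows. The helper names (MetricsRowKey, TryParse, IsServiceSummary) and the sample key "user;GetBlob" are for illustration only.

```csharp
using System;

static class MetricsRowKey
{
    // Splits a RowKey such as "user;All" into its category and request type.
    // Returns false instead of throwing when the key does not have exactly two parts.
    public static bool TryParse(string rowKey, out string category, out string requestType)
    {
        category = null;
        requestType = null;

        if (string.IsNullOrEmpty(rowKey))
        {
            return false;
        }

        string[] parts = rowKey.Split(';');
        if (parts.Length != 2)
        {
            return false;
        }

        category = parts[0];
        requestType = parts[1];
        return true;
    }

    // True for the per-service summary row (request type "All").
    public static bool IsServiceSummary(string requestType)
    {
        return string.Equals(requestType, "All", StringComparison.OrdinalIgnoreCase);
    }

    static void Main()
    {
        foreach (string key in new[] { "user;All", "user;GetBlob", "malformed" })
        {
            string category, requestType;
            if (TryParse(key, out category, out requestType))
            {
                Console.WriteLine("{0}: category={1}, requestType={2}, summary={3}",
                    key, category, requestType, IsServiceSummary(requestType));
            }
            else
            {
                Console.WriteLine("{0}: unexpected RowKey format", key);
            }
        }
    }
}
```

Filtering on IsServiceSummary is a convenient way to keep only the hourly service totals and skip the per-API rows.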

The Main method simply parses the input arguments and starts the export.

static void Main(string[] args)
{
    if (args.Length < 4)
    {
        Console.WriteLine("Usage: MetricsReporter <service to search - blob|table|queue> <Start time in UTC for report> <End time in UTC for report> <file name to export to>.");
        Console.WriteLine("Example: MetricsReporter blob \"2011-06-26T20:30Z\" \"2011-06-28T22:00Z\" \"metrics.csv\"");
        return;
    }

    CloudStorageAccount account = CloudStorageAccount.Parse(ConnectionString);
    CloudTableClient tableClient = account.CreateCloudTableClient();

    DateTime startTimeOfSearch = DateTime.Parse(args[1]).ToUniversalTime();
    DateTime endTimeOfSearch = DateTime.Parse(args[2]).ToUniversalTime();

    //ListTableRows(tableClient, timeOfRequest);
    ExtractMetricsToReports(tableClient, args[0], startTimeOfSearch, endTimeOfSearch, args[3]);
}
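
One detail worth knowing about the parsing in Main: DateTime.Parse followed by ToUniversalTime treats a value without a "Z" or an offset as local time, so the same argument can produce different reports on machines in different time zones. The sketch below shows one possible alternative that always interprets such values as UTC. The class and method names (UtcArgumentParser, ParseUtc) are hypothetical; the DateTimeStyles flags are standard .NET.

```csharp
using System;
using System.Globalization;

static class UtcArgumentParser
{
    // Parses a timestamp and returns it as UTC.
    // Values without a "Z" or an explicit offset are assumed to already be in UTC.
    public static DateTime ParseUtc(string text)
    {
        return DateTime.Parse(text, CultureInfo.InvariantCulture,
            DateTimeStyles.AssumeUniversal | DateTimeStyles.AdjustToUniversal);
    }

    static void Main()
    {
        string[] inputs =
        {
            "2011-06-26T20:30:00Z",      // explicit UTC
            "2011-06-26T20:30:00",       // no zone: assumed to be UTC
            "2011-06-26T20:30:00+02:00"  // explicit offset: converted to UTC
        };

        foreach (string input in inputs)
        {
            DateTime utc = ParseUtc(input);
            Console.WriteLine("{0} -> {1:yyyy-MM-dd HH:mm} ({2})", input, utc, utc.Kind);
        }
    }
}
```

Using the invariant culture also keeps the parsing independent of the regional settings of the machine the tool runs on.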

Once we have the data exported as CSV files, one can use Excel to import the data and create the required charts.
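
For a quick daily trend without Excel, the exported file can also be summarized in a few lines of code. The following is a rough sketch that assumes the column layout written by ExportMetrics (Time, Category, Request Type, Total Ingress, Total Egress, Total Requests, ...) and sums Total Requests per day for the user-category summary rows. The class and method names (DailyRequestTotals, Summarize) and the sample rows are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

static class DailyRequestTotals
{
    // Sums "Total Requests" (column 5) per day for rows with category "user"
    // and request type "All". The first line is treated as the header and skipped.
    public static SortedDictionary<string, long> Summarize(IEnumerable<string> csvLines)
    {
        var totals = new SortedDictionary<string, long>(StringComparer.Ordinal);
        bool headerSkipped = false;

        foreach (string line in csvLines)
        {
            if (!headerSkipped) { headerSkipped = true; continue; }
            if (string.IsNullOrWhiteSpace(line)) continue;

            // Only the first six columns are used; they hold strings and integers,
            // so a plain split on ',' is enough for them.
            string[] cols = line.Split(',');
            if (cols.Length < 6) continue;

            string time = cols[0].Trim();
            if (time.Length < 8) continue;
            if (cols[1].Trim() != "user" || cols[2].Trim() != "All") continue;

            string day = time.Substring(0, 8); // "yyyyMMdd" part of the PartitionKey
            long requests = long.Parse(cols[5].Trim(), CultureInfo.InvariantCulture);

            long current;
            totals.TryGetValue(day, out current);
            totals[day] = current + requests;
        }

        return totals;
    }

    static void Main()
    {
        // Made-up rows in the same layout as the exported file.
        string[] lines =
        {
            "Time, Category, Request Type, Total Ingress, Total Egress, Total Requests",
            "20110626T2000, user, All, 100, 200, 5",
            "20110626T2100, user, All, 10, 20, 7",
            "20110626T2100, user, GetBlob, 10, 20, 7",
            "20110627T0000, user, All, 1, 2, 3"
        };

        foreach (KeyValuePair<string, long> day in Summarize(lines))
        {
            Console.WriteLine("{0}: {1} requests", day.Key, day.Value);
        }
    }
}
```

To run it against a real export, pass File.ReadLines(fileName) from System.IO instead of the in-memory array.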

For more information, please see the MSDN Documentation.

Jai Haridas, Monilee Atkinson, and Brad Calder

Comments

  • Anonymous
    August 23, 2011
    Hi, What are the chances of getting a TotalBillableEgress metric? This would be of great interest when attempting to forecast costs, similiar to TotalBillableRequests. All in all, the new Storage Analytics feature is great. See a description and link to my live implementation at oakleafblog.blogspot.com/.../oakleaf-systems-windows-azure-table.html. Thanks in advance and cheers, --rj

  • Anonymous
    August 15, 2012
    This is great information. How can I verify if Storage Analytics is enabled or not without turning it on or off?

  • Anonymous
    August 19, 2012
    Hi Prakash, You can verify the Storage Analytics settings by viewing the Storage Configuration page in the Portal. For specific steps and more information, please review this How To: www.windowsazure.com/.../how-to-monitor-a-storage-account

  • Anonymous
    January 30, 2014
    Is there a way to get the number of messages in a queue at a given time once monitoring/logging is turned on?  If so, how exactly?

  • Anonymous
    May 13, 2014
    The Azure account capacity is 200TB at present. If the Storage Analytics is turned on, will it be part of 200TB i.e. Data + Analytics.

  • Anonymous
    May 15, 2014
    @Lakshmi, the account limit is 500TB and it includes analytics capacity. See msdn.microsoft.com/.../dn249410.aspx for more details.