Custom query specification for commercial marketplace

Partners can use this query specification to easily formulate custom queries for extracting data from analytics tables. The queries can be used to select only the desired columns and metrics that match a certain criterion. At the heart of the language specification is the dataset definition on which a custom query can be written.

Datasets

In the same way that some queries are run against a database that has tables and columns, a custom query works on Datasets that have columns and metrics. The full list of available datasets for formulating a query can be obtained by using the datasets API.

Here's an example of a dataset shown as a JSON.

{
            "datasetName": "ISVUsage",
            "selectableColumns": [
                "MarketplaceSubscriptionId",
                "MonthStartDate",
                "OfferType",
                "AzureLicenseType",
                "MarketplaceLicenseType",
                "SKU",
                "CustomerCountry",
                "IsPreviewSKU",
                "SKUBillingType",
                "VMSize",
                "CloudInstanceName",
                "OfferName",
                "IsPrivateOffer",
                "DeploymentMethod",
                "CustomerName",
                "CustomerCompanyName",
                "UsageDate",
                "IsNewCustomer",
                "CoreSize",
                "TrialEndDate",
                "CustomerCurrencyCC",
                "PriceCC",
                "PayoutCurrencyPC",
                "EstimatedPricePC",
                "UsageReference",
                "UsageUnit",
                "CustomerId",
                "BillingAccountId",
                "MeterDimension",
                "MeterId",
                "PartnerCenterDetectedAnomaly",
                "PublisherMarkedAnomaly",
                "NewReportedUsage",
                "ActionTakenAt",
                "ActionTakenBy",
                "PlanId",
                "ReferenceId",
                "ListPriceUSD",
                "DiscountedPriceUSD",
                "IsPrivatePlan",
                "OfferId",
                "PrivateOfferId",
                "PrivateOfferName",
                "BillingId",
                "PlanType",
                "CustomerAccess",
                "PublisherAccess",
                "CustomerAdjustmentUSD",
                "MultiParty",
                "PartnerInfo",
                "SalesNotes",
                "IsCustomMeter",
                "AssetId",
                "IsMultisolution",
                "VMSubscription"
            ],
            "availableMetrics": [
                "NormalizedUsage",
                "MeteredUsage",
                "RawUsage",
                "EstimatedExtendedChargeCC",
                "EstimatedExtendedChargePC",
                "EstimatedFinancialImpactUSD"
            ],
            "availableDateRanges": [
                "LAST_MONTH",
                "LAST_3_MONTHS",
                "LAST_6_MONTHS",
                "LAST_1_YEAR",
                "LAST_3_YEARS"
            ],
            "minimumRecurrenceInterval": 1
        }

Parts of a dataset

  • A dataset name is like a database table name. For example, ISVUsage. A dataset has a list of columns that can be selected, such as MarketplaceSubscriptionId.
  • A dataset also has metrics, which are like aggregation functions in a database. For example, NormalizedUsage.
  • There are fixed time spans over which data can be exported.

Formulating a query on a dataset

These are some sample queries that show how to extract various types of data.

Query Description
SELECT MarketplaceSubscriptionId,CustomerId FROM ISVUsage TIMESPAN LAST_MONTH This query gets every MarketplaceSubscriptionId and its corresponding CustomerId in the last one month.
SELECT MarketplaceSubscriptionId, EstimatedExtendedChargeCC FROM ISVUsage ORDER BY EstimatedExtendedChargeCC LIMIT 10 This query gets the top 10 subscriptions in decreasing order of the number of licenses sold under each subscription.
SELECT CustomerId, NormalizedUsage, RawUsage FROM ISVUsage WHERE NormalizedUsage > 100000 ORDER BY NormalizedUsage TIMESPAN LAST_6_MONTHS This query gets the NormalizedUsage and RawUsage of all the Customers who have NormalizedUsage greater than 100,000.
SELECT MarketplaceSubscriptionId, MonthStartDate, NormalizedUsage FROM ISVUsage WHERE CustomerId IN ('aaaabbbb-0000-cccc-1111-dddd2222eeee', 'bbbbcccc-1111-dddd-2222-eeee3333ffff') This query gets the MarketplaceSubscriptionId and the normalized usage for every month by the two CustomerId values: aaaabbbb-0000-cccc-1111-dddd2222eeee and bbbbcccc-1111-dddd-2222-eeee3333ffff.

Query specification

This section describes the query definition and structure.

Grammar reference

This table describes the symbols used in queries.

Symbol Meaning
? Optional
* Zero or more
+ One or more
| Or/One of the list

Query definition

The query statement has the following clauses: SelectClause, FromClause, WhereClause?, OrderClause?, LimitClause?, and TimeSpan?.

  • SelectClause: SELECT ColumOrMetricName (, ColumOrMetricName)*
    • ColumOrMetricName: Columns and Metrics defined within the Dataset
  • FromClause: FROM DatasetName
    • DatasetName: Dataset name defined within the Dataset
  • WhereClause: WHERE FilterCondition (AND FilterCondition)*
    • FilterCondition: ColumOrMetricName Operator Value
      • Operator: = | > | < | >= | <= | != | LIKE | NOT LIKE | IN | NOT IN
      • Value: Number | StringLiteral | MultiNumberList | MultiStringList
        • Number: -? [0-9]+ (. [0-9] [0-9]*)?
        • StringLiteral: ' [a-zA-Z0-9_]*'
        • MultiNumberList: (Number (,Number)*)
        • MultiStringList: (StringLiteral (,StringLiteral)*)
  • OrderClause: ORDER BY OrderCondition (,OrderCondition)*
    • OrderCondition: ColumOrMetricName (ASC | DESC)*
  • LimitClause: LIMIT [0-9]+
  • TimeSpan: TIMESPAN ( TODAY | YESTERDAY | LAST_7_DAYS | LAST_14_DAYS | LAST_30_DAYS | LAST_90_DAYS | LAST_180_DAYS | LAST_365_DAYS | LAST_MONTH | LAST_3_MONTHS | LAST_6_MONTHS | LAST_1_YEAR)

Query structure

A Report query is made up of multiple parts:

  • SELECT
  • FROM
  • WHERE
  • ORDER BY
  • LIMIT
  • TIMESPAN

Each part is described below.

SELECT

This part of the query specifies the columns that get exported. The columns that can be selected are the fields listed in selectableColumns and availableMetrics sections of a dataset. If there's a metric column included in the selected field list, then metrics are calculated for every distinct combination of the nonmetric columns.

Example:

  • SELECT OfferName, NormalizedUsage

DISTINCT

Adding DISTINCT keyword after SELECT ensures the final exported data doesn't have any duplicate rows. DISTINCT keyword works irrespective of whether or not a metric column is selected.

Example:

  • SELECT DISTINCT MarketplaceSubscriptionId, OfferType

FROM

This part of the query indicates the dataset from which data needs to be exported. The dataset name given here needs to be a valid dataset name returned by the datasets API.

Example:

  • FROM ISVUsage
  • FROM ISVOrder

WHERE

This part of the query is used to specify filter conditions on the dataset. Only rows matching all the conditions listed in this clause are present in the final exported file. The filter condition can be on any of the columns listed in selectableColumns and availableMetrics. The values specified in the filter condition can be a list of numbers or a list of strings only when the operator is IN or NOT IN. The values can always be given as a literal string and they're converted to the native types of columns. Multiple filter conditions need to be separated with an AND operation.

Example:

  • MarketplaceSubscriptionId = 'aaaabbbb-0000-cccc-1111-dddd2222eeee'
  • CustomerName LIKE '%Contosso%'
  • CustomerId NOT IN (1000, 1001, 1002)
  • OrderQuantity=100
  • OrderQuantity='100'
  • MarketplaceSubscriptionId='bbbbcccc-1111-dddd-2222-eeee3333ffff' AND CustomerId='0f8b7fa0-eb83-a183-1225-ca153ef807aa'

ORDER BY

This part of the query specifies the ordering criteria for the exported rows. The columns on which ordering can be defined need to be from the selectableColumns and availableMetrics of the dataset. If there's no ordering direction specified, it is defaulted to DESC on the column. Ordering can be defined on multiple columns by separating the criteria with a comma.

Example:

  • ORDER BY NormalizedUsage ASC, EstimatedExtendedCharge(CC) DESC
  • ORDER BY CustomerName ASC, NormalizedUsage

LIMIT

This part of the query specifies the number of rows that are exported. The number you specify needs to be a positive nonzero integer.

TIMESPAN

This part of the query specifies the time duration for which the data needs to be exported. The possible values should be from the availableDateRanges field in the dataset definition.

Case sensitivity in query specification

The specification is completely case insensitive. Predefined keywords, column names and values can be specified using upper or lower case.

See also