SQL SSAS: Exists v.s Existing and Auto Exist in MDX
Let us Compare
In SQL SSAS, though Exists, existing and Auto-Exist are conceptually very simple, they are the root cause for much confusion in complex MDX queries. So let us make some simple comparisons.
Normal Behaviour |
Existing | |||
Theory | When two sets are cross-joined, they will produce all possible Cartesian combinations of their members. E.g., if we join two sets with 4 members each, then they will produce a new set with 16 members. |
It forces the natural hierarchy in the scenarios below. When two sets containing members of the same dimension are cross-joined, the resulting set is limited to those combinations of members that are actually observed in the dimension in question. When the WHERE clause contains a member of a dimension, sets along the axes containing members from that same dimension are limited as well. |
It forces the natural hierarchy without returning the forcing/second set. |
Auto-exist is not applicable to calculated members and if we want to force this behaviour on them, then we need the Existing keyword. |
When to use | When we want to display the values of measure(s) for the full Cartesian product of subsets of different dimensions. | When we want to display the values of measures for existing combinations of members in different levels but coming from same hierarchy. | Achieve the same auto-exists results but without displaying the second set (no Cartesian product will be shown). | Forces the local context in calculated members. |
Normal Behaviour
Run the following query.
SELECT [Measures].[Reseller Sales Amount] ON 0
, [Product].[Category].[Category].Members on 1
from [Adventure Works]
Based on the above query we have 4 Categories in the Product Categories hierarchy.
SELECT [Measures].[Reseller Sales Amount] ON 0
, [Geography].[Geography].[Country].members
on 1
from [Adventure Works]
Based on the above query we have 6 countries under the Geography hierarchy.
Let us Cross Join it
SELECT [Measures].[Reseller Sales Amount]
ON
0 ,[Product].[Category].[Category].Members * {[Geography].[Geography].[Country].Members }
on 1
from [Adventure Works] 1
So, along with the header row it has returned 25 rows, which is the direct cross-product of 4 products and 6 countries (24 data rows + 1 header row).
What other ways are there to achieve the cross join
You can use the cross-join function as shown below.
SELECT [Measures].[Reseller Sales Amount] ON 0
,crossjoin ([Product].[Category].[Category].Members
, [Geography].[Geography].[Country].Members
) on 1
from [Adventure Works]
Even if you remove the crossjoin keyword in the above query, it will still produce the same result.
SELECT [Measures].[Reseller Sales Amount] ON 0
,([Product].[Category].[Category].Members
, [Geography].[Geography].[Country].Members
) on 1
from [Adventure Works]
Auto-Exists
In the above example we have produced a cross-join between two different hierarchies from two different dimensions (Product.Category and Geography.Country). Let us do a cross-join between Category and Subcategory from the same Product hierarchy.
Let us find the count of all members in Subcategory first.
SELECT [Measures].[Reseller Sales Amount] ON 0
, [Product].[SubCategory].[SubCategory].Members on 1
from [Adventure Works]
Along with the header row we have 38 subcategories; let's cross join Subcategory and Category.
SELECT [Measures].[Reseller Sales Amount] ON 0 ,
[Product].[Category].[Category].Members * [Product].[Subcategory].[Subcategory].Members on 1
from [Adventure Works]
As you can see from the output, even though it has given the cross product, still it has the same 38 (37 rows of data + 1 header) rows in the output. The normal behaviour should have produced 4*38 (= 152) rows.
The reason for the behaviour is that, based on the Product.Category hierarchy, only valid Category and Subcategory combinations are produced. 'Valid combinations' in this context means 'combinations of members that actually exist in the dimension.'
What is Existing
Let us create a calculated member in the query below.
with member countofsubcategory as
count( [Product].[Subcategory].[Subcategory].Members)
SELECT {[Measures].[Reseller Sales Amount], countofsubcategory} ON 0
, [Product].[Category].[Category].Members *
[Product].[Subcategory].[Subcategory].Members
on 1
from [Adventure Works]
Though we are displaying each subcategory in the list, still we got the total count. This is because auto-exist doesn’t control calculated members.
By inserting the 'existing' keyword in the above query, we can produce the expected result (which is 1 per a subcategory).
with member countofsubcategory as count( existing [Product].[Subcategory].[Subcategory].Members)
SELECT {[Measures].[Reseller Sales Amount], countofsubcategory} ON 0
, [Product].[Category].[Category].Members * [Product].[Subcategory].[Subcategory].Members
on 1
from [Adventure Works]
What is the Exists function?
'Exists' uses the “auto-exist” functionality but avoids displaying the “second set” in the results.
SELECT [Measures].[Reseller Sales Amount] ON 0 ,
Exists( [Product].[Subcategory].[Subcategory].Members ,
[Product].[Category].[Clothing]
)
on 1
FROM [Adventure Works]
It only displays the subcategories which are related to the Clothing category without displaying the category itself in the result.
Usual confusions
Exists and “auto-exists” are nearly the same; the only difference is that the Exists function doesn’t display the second set.
No. Actually one sets the context and other doesn't.
SELECT [Measures].[Reseller Sales Amount] ON 0
, Exists( [Product].[Subcategory].[Subcategory].Members
,[Product].[Category].[Clothing]
)on 1
FROM [Adventure Works]
where [Product].[Category].[Bikes]
The above query will produce no results because we are slicing by two different members of the same hierarchy (Bikes & Clothing) at the same time. But the query allows you to use the same hierarchy's different members in both the row axis and the slicer (Exists doesn't set the context).
The query below will produce errors, because auto-exists does set the context.
SELECT [Measures].[Reseller Sales Amount] ON 0 , [Product].[Subcategory].[Subcategory].Members *
[Product].[Category].[Clothing]
on 1
FROM [Adventure Works]
where [Product].[Category].&[1]
http://lh6.ggpht.com/-Zlc4HeT-qd0/UmMIjLW6xzI/AAAAAAAADz8/JRn0W4BiCsE/image_thumb22_thumb.png?imgmax=800
The Existing keyword is just an extension of Exists; we can achieve similar results with Exists and the CurrentMember function.
This is true but the use of the Existing keyword will improve the maintainability of code.
Both the below queries will produce the same results.
with member countofsubcategory as count( existing [Product].[Subcategory].[Subcategory].Members)
SELECT {[Measures].[Reseller Sales Amount], countofsubcategory} ON 0
, [Product].[Category].[Category].Members
on 1
from [Adventure Works]
with member countofsubcategory as
count( exists( [Product].[Subcategory].[Subcategory].Members
,[Product].[Category].currentmember ))
SELECT {[Measures].[Reseller Sales Amount], countofsubcategory} ON 0
, [Product].[Category].[Category].Members
on 1
from [Adventure Works]
http://lh4.ggpht.com/-0GBbyKXPEaE/UmMIjyKJWdI/AAAAAAAAD0M/QaaBVcY2GpU/image_thumb1.png?imgmax=800
DO we have a Not-Exists function ?
No. But this can be simulated with the Except function. The below query produces 8 rows of data.
SELECT [Measures].[Reseller Sales Amount] ON 0,
Exists( [Product].[Subcategory].[Subcategory].Members ,
[Product].[Category].[Clothing]
)
on 1
FROM [Adventure Works]
http://lh5.ggpht.com/-PMptrEJIdHg/UmMIk8eKKtI/AAAAAAAAD0c/lMigem2bnDg/image_thumb2.png?imgmax=800
With the Except function in the query below it produces the complement set with 30 rows.
SELECT [Measures].[Reseller Sales Amount] ON 0
,EXCEPT([Product].[Subcategory].[Subcategory].Members,
Exists( [Product].[Subcategory].[Subcategory].Members ,
[Product].[Category].[Clothing]
)
)
on 1
FROM [Adventure Works]
http://lh6.ggpht.com/-oRURWMJuDw8/UmMImOGeOaI/AAAAAAAAD0s/i9KPgeLg3Qw/image_thumb24_thumb.png?imgmax=800
http://lh4.ggpht.com/-Wt-NtCigqG8/UmMInKJsUHI/AAAAAAAAD04/HrdVQVOJ4kQ/image_thumb26_thumb.png?imgmax=800
Generate function forces its own context against the Existing keyword.
Check the below query. It is clear that the Existing keyword uses the Generate function’s context and that is why we got 3 subcategories against all categories in the result set. with member countofsubcategory as
Generate ([Product].[Category].[Bikes]
, count( existing [Product].[Subcategory].[Subcategory].Members)
)
SELECT {[Measures].[Reseller Sales Amount], countofsubcategory} ON 0
, [Product].[Category].[Category].Members
on 1
from [Adventure Works]
http://lh3.ggpht.com/-Csok-VUVhMA/UmMIn3kIp1I/AAAAAAAAD1I/q7sBLlOXS5I/image_thumb28_thumb.png?imgmax=800
This is yet another proof that the Existing keyword is simply an extension of the Exists function.
Exist With Measures
Most of the examples on the Existing keyword focus on the effect of Exists on a single dimension. But what will happen if I use it against a measure?
Will that make any difference?
The Existing keyword sets the current context on a measure calculation, which is really invaluable.
Let us check the below query.
with member SalesAmtAustraliaCanada asAggregate({[Geography].[Geography].[Country].&[Australia],
[Geography].[Geography].[Country].&[Canada]
}
,[Measures].[Reseller Sales Amount]
)
select {[Measures].[Reseller Sales Amount],SalesAmtAustraliaCanada} on 0,
[Date].[Calendar].[Calendar Year].members on 1
from [Adventure Works]
http://lh3.ggpht.com/-2wWJ1iLITv4/Umb7KqwetuI/AAAAAAAAD2E/4mJi5lbjfEw/image_thumb%25255B1%25255D.png?imgmax=800
As expected, the calculated measure has summed up the sales related to Australia and Canada, which is way lower than the total amount against a year.
Let us execute the following query and get the sales against different countries.
with member SalesAmtAustraliaCanada as
Aggregate({[Geography].[Geography].[Country].&[Australia],
[Geography].[Geography].[Country].&[Canada]
}
,[Measures].[Reseller Sales Amount]
)
select {[Measures].[Reseller Sales Amount],SalesAmtAustraliaCanada} on 0,
[Geography].[Geography].[Country].members on 1
from [Adventure Works]
As you can see, it displays the same summed up value against each country. Though technically this is correct, we would ideally want to get the data only against Australia and Canada for this new measure.
So what is happening here?
Calculated members don’t get affected by the current query context. If we need the current context, then we need the Existing keyword as well.
with member SalesAmtAustraliaCanada as
Aggregate(existing {[Geography].[Geography].[Country].&[Australia],
[Geography].[Geography].[Country].&[Canada]
}
,[Measures].[Reseller Sales Amount]
)
select {[Measures].[Reseller Sales Amount],SalesAmtAustraliaCanada} on 0,
[Geography].[Geography].[Country].members on 1
from [Adventure Works]
http://lh6.ggpht.com/-GMyBJwZ9buQ/Umb7NYnWBMI/AAAAAAAAD2k/P-gkg06RGlY/image_thumb%25255B8%25255D.png?imgmax=800
What will happen if we revert the query to the Date dimension?
with member SalesAmtAustraliaCanada as
Aggregate(existing {[Geography].[Geography].[Country].&[Australia],
[Geography].[Geography].[Country].&[Canada]
}
,[Measures].[Reseller Sales Amount]
)
select {[Measures].[Reseller Sales Amount],SalesAmtAustraliaCanada} on 0,
[Date].[Calendar].[Calendar Year].members on 1
from [Adventure Works]
http://lh6.ggpht.com/-xY73YfxAdEg/Umb7OFGHO7I/AAAAAAAAD20/ynqhSwVcfyc/image_thumb%25255B10%25255D.png?imgmax=800
It produces the same result as above because in the above query the current context for the Geography dimension is “All Geographies”.
Let's confirm this with the query below.
with member SalesAmtAustraliaCanada as
Aggregate(existing {[Geography].[Geography].[Country].&[Australia],
[Geography].[Geography].[Country].&[Canada]
}
,[Measures].[Reseller Sales Amount]
)
member Geographycurrenthier as
[Geography].[Geography].membervalue
select {[Measures].[Reseller Sales Amount],SalesAmtAustraliaCanada,Geographycurrenthier} on 0,
[Date].[Calendar].[Calendar Year].members on 1
from [Adventure Works]
http://lh4.ggpht.com/-Ri-Iua-b0co/Umb7PA5R-MI/AAAAAAAAD3E/UhGRIgcqTS4/image_thumb%25255B12%25255D.png?imgmax=800
This is a very useful technique for end user reporting.
See Also