SQL Server Analysis Services Processing
Often time I been asked this quest - what will be the impact of processadd on indexes or aggregations, do I need to recreate aggregations or Indexes after creating Process Update or what is the right approach of handling processing, so thought of writing this article.
Here are the major Process Types in SQL 2005 / SQL 2008 / SQL 2008R2 Analysis Services:
ProcessFull - Simple to digest, applies all objects which means it discards the storage contents of the object and rebuilds them. It applies to descendants objects as well.
ProcessDefault - It does the bare minimum task to bring the objects to a fully processed state which means it build only storage contents which are not currently built. For eg, if you have created new project and just deployed to AS server then ProcessDefault on Database will do full processing or take another case where Partition has fact and aggregation data, then ProcessDefault will only build bitmap indexes.
But in case if you change the binding of an object(eg, change the structure of an object, like added attribute or removed attribute), in that case ProcessDefault on the object will discard all the storage contents and rebuild them.
ProcesData - Applies to dimension, cube, measure group and partition. It discards the storage contents and rebuilds only data.
Dimension it builds Hierarchy and Attributes
Partitions it builds the fact data.
ProcessIndexes - Applies to dimension, cube, measure group, and partition. It requires that the object must already have its "data" built; otherwise, it raises an error. ProcessIndexes preserves the data and rebuilds the "indexes". For dimensions, it builds the bitmap indexes. For partitions, it builds the aggregation data and bitmap indexes. ProcessIndexes is recursively applied to all descendants of the object as well.
ProcessUpdate - Now here is the most confusing part of all. It applies only to Dimension. It is the equivalent of incremental dimension processing in Analysis Services 2000. It sends SQL queries to read the entire dimension table and applies the changes—member updates, additions, deletions.
Since ProcessUpdate reads the entire dimension table, it begs the question, "How is it different from ProcessFull?" The difference is that ProcessUpdate does not discard the dimension storage contents. It applies the changes in a "smart" manner that preserves the fact data in dependent partitions. ProcessFull, on the other hand, does an implicit ProcessClear on all dependent partitions. ProcessUpdate is inherently slower than ProcessFull since it is doing additional work to apply the changes.
Depending on the nature of the changes in the dimension table, ProcessUpdate can affect dependent partitions. If only new members were added, then the partitions are not affected. But if members were deleted or if member relationships changed (e.g., a Customer moved from Redmond to Seattle), then some of the aggregation data and bitmap indexes on the partitions are dropped. The cube is still available for queries, albeit with lower performance.(with ProcessUpdate Flexible aggregations and indexes on related partitions will be dropped)
Which means after ProcessUpdate you need to Process Indexes on the Partitions. The simple approach of dealing with such problem is to do default processing on associated Cubes, default processing will create dropped Aggregations and Bitmap indexes for affected partitions.
If you want to check Aggregations which are dropped after Process Update then follow these task (Applies only to SQL 2008)
1) Take the output (Save it for later comparison) firing this DMV
SELECT *
FROM SYSTEMRESTRICTSCHEMA($SYSTEM.DISCOVER_PARTITION_STAT
, CUBE_NAME='Adventure Works'
, DATABASE_NAME = 'AdventureWorksDW2008'
, MEASURE_GROUP_NAME = 'Sales Orders'
, PARTITION_NAME='Total_Orders_2004'
)
2) Process Update Dimension which impacts this Cube
3) Again fire the same command as specified in step 1 you will find few aggregations will be dropped.
For checking Impacted Indexes follow these steps (Applies only to SQL 2008)
1) Take the output (Save it for later comparison) by firing this DMV
SELECT * FROM SYSTEMRESTRICTSCHEMA(
$SYSTEM.DISCOVER_PARTITION_DIMENSION_STAT
, CUBE_NAME = ‘Adventure Works’c
, DATABASE_NAME = ‘Adventure Works DW’
, MEASURE_GROUP_NAME = ‘Sales Orders’
, PARTITION_NAME=’Total_Orders_2004'
)
2) Process Update Dimension which impacts this cube
3) Again fire the same command as specified in step 1 and look for Attribute_Indexed Col, do comparisons with output of Step 1 you will find differences
ProcessAdd - Applies to Dimensions and Partitions, it didn't existing in SSAS 2000. It essentially optimizes ProcessUpdate in scenarios where you are adding only new records to Dimension and facts.
Before using this Processing Type you need to guarantee that only new rows are added. ProcessAdd never deletes or updates existing members. It only adds new members. The users can restrict the dimension table so that ProcessAdd reads only the new rows.
Good News unlike ProcessUpdate - ProcessAdd doesn't drop aggregations and indexes.
This option is not available in GUI you need to fire it using only XMLA (ProcessADD)
ProcessClear - Discards the storage of the object, it applies to all objects and descendants objects as well.