Notes on Deciding whether to implement a Tabular or Multidimensional Model
Within the Business Intelligence community, when the discussion comes to Microsoft SQL Server Analysis Services, there are a couple
of questions that typically arise. One of those questions is whether it’s possible to easily (read that as programmatically) convert from one model to
another. Typically, the other is whether to implement a solution that uses a Multidimensional model or a solution that is based on a Tabular model.
The answer to the first question is easy and relatively straight forward. Switching from one model to another essentially requires
starting from scratch. I know it’s not necessarily a pleasant thought, but there are a number of significant differences between the xVelocity engine used for
Tabular models and the OLAP engine that is used for Multidimensional models that make programmatic conversions, at least as of this writing, untenable. Consequently,
converting from one model to another essentially requires another development effort.
Because there are costs associated with development of any BI model (I’m going to include PowerPivot since it uses the same engine as
Tabular models), selection of the most appropriate model has significant financial implications to the business for which the model is being built. There
have been a number of blog posts and publications discussing the advantages and disadvantages of both Multidimensional and Tabular models. The vast majority of
those publications focus primarily on the technical decision points related to selection of the most appropriate model for project development. Greg Winch
discussed rationales for selection of models in the eGroup blog Tabular vs. Multidimensional Model: what’s right for your business.
Melissa Coates’ (aka SQL Chick) discussed features of PowerPivot, Tabular models and Multidimensional models in her blog Decisions: PowerPivot,
SSAS Tabular, or SSAS Multidimensional Model in SQL Server 2012. Liz Vitt and Scott Cameron (Hitachi Consulting) provide a nice primer that covers the
BI Semantic Model (BISM), Multidimensional modeling and Tabular Modeling (Choosing a Tabular or Multidimensional Modeling
Experience in SQL Server 2012 Analysis Services.) James Serra points out a number of “missing features” in his blog Tabular model: Not
ready for prime time?
The real answer as to which model is the most appropriate is (as the case with most things BI) “it depends.” Before delving
into a lengthy discussion of the factors that should be considered in selection of the model to be used for building a solution, it may be beneficial to engage
in a brief history of the factors that resulted in creation of PowerPivot and the Tabular model. It shouldn’t be surprising that the evolution of these
models is rooted in economics.
There are a number of costs associated with development efforts involving Multidimensional models. These include heavy time
and economic investments in data and dimensional modeling efforts, as well as costs associated with pre-aggregating data, persisting calculations, data storage
and data processing. (For what it’s worth, finding consultants with significant expertise in dimensional modeling is a whole lot easier said than done). Add to
that the fact that mistakes in dimensional models frequently result in direct costs for additional engineering efforts, solutions that don’t meet user needs
or worse yet return incorrect results resulting in business decisions with potentially disastrous financial consequences.
In the meantime, memory costs have continued to decline while hardware performance continually improves. With access to larger
amounts of memory on faster machines, aggregation of data in memory is significantly faster than reading aggregated data from disk. Those factors make
tabular and in-memory models more appealing. Additionally, tabular models are based on more familiar concepts of rows, columns, tables and relationships.
This isn’t to say that mistakes in Tabular modeling aren’t just as costly as mistakes in dimensional modeling. Just imagine the joy of painstakingly
crafting a Tabular model only to find that the most commonly and important query executed against the model consistently takes in excess of 50 minutes to
return a result or that the most crucial DAX calculation returns a value that is incorrect. Business Considerations
I have long lived by the mantra that “the primary purpose of a Business Intelligence solution is
to deliver positive value to the business.” Anything short of that runs a serious risk of becoming a business liability. From the perspective of a
business, that involves providing business people with both the information and the tools necessary for making sound operational and strategic
business decisions.
Feature Set
Selection of a model, therefore, is very much dependent on which features are necessary to deliver information in such a way that it
delivers the desired business value. Multidimensional models, in the current form, have been around for about 11 years and have a very rich feature set. Tabular models, by
way of contrast, were first introduced as PowerPivot 5 years ago. Therefore, it should come as no surprise that Multidimensional models provide a much richer
feature set. The table below displays the feature sets of both Multidimensional and Tabular models.
Feature comparison between Multidimensional, Tabular and PowerPivot Models.
Feature |
Multidimensional |
Tabular |
PowerPivot |
Actions |
Yes |
No |
No |
Aggregations |
Yes |
No |
No |
Calculated Measures |
Yes |
Yes |
Yes |
Custom Assemblies |
Yes |
No |
No |
Custom Rollups |
Yes |
No |
No |
Data Mining |
Yes |
No |
No |
Distinct Count |
Yes |
Yes (via DAX) |
Yes (using DAX) |
Drill through |
Yes |
Yes |
Yes |
Hierarchies |
Yes |
Yes |
Yes |
KPIs |
Yes |
Yes |
Yes |
Linked objects |
Yes |
No |
Yes (links to Excel tables in the hosting workbook) |
Many-to-many relationships |
Yes |
No (can be modeled using DAX functions) |
No (can be modeled using DAX functions) |
Parent-child Hierarchies |
Yes |
Yes (can be modeled using the DAX PATH, PATHLENGTH, LOOKUPVALUE, and PATHITEM functions) |
Yes (model using DAX PATH, PATHLEHGTH, LOOKUPVALUE, and PATHITEM functions) |
Partitions |
Yes |
Yes |
No |
Perspectives |
Yes |
Yes |
Yes |
Semi-additive Measures |
Yes |
Yes |
Yes |
Translations |
Yes |
Using DAX |
Using DAX |
User-defined Hierarchies |
Yes |
Yes |
Yes |
Write back |
Yes |
No |
No |
Consider the functionalities and features that are not available in Tabular/PowerPivot models. Actions, for instance, are useful when
it is desirable to do something like launch a Reporting Services Report, Provide Custom Drillthrough behavior, or direct to a URL. Custom Assemblies can
be used to implement custom security or functionalities that are not natively present in Analysis Services. Custom Rollups and Unary operators are frequently
used with charts of accounts. Writeback functionality is a very highly desirable feature for performing “What-if” types of analyses.
Available skillset
For a number of reasons, the available skillset is a significant consideration. Part of that is related to
the learning curves that are associated with the two approaches to database design. Amir Netz has compared mastery of the skills associated with
Multidimensional database design and Tabular Database design to the differences between mastering C++ and Visual Basic programming languages.
Multidimensional database design is closely bound to principles of Dimensional Design, involving Dimensions,
Hierarchies, Attributes, Relationships and Measures. Multidimensional database design also makes extensive use of the MDX language. The principles underlying
dimensional modeling differ significantly from those of entity relationship design and resources with dimensional modeling skills tend to be difficult to
secure. Another consideration is that mistakes in dimensional modeling, frequently have significant direct costs due to the need for design changes and
indirect costs resulting from business decisions based on inaccurate or misleading data.
Tabular models, on the other hand, are based on more familiar concepts of Tables, Rows, Columns and
Relationships. Those are concepts are very familiar to DBAs and most Excel users. Add to that, the DAX language is modeled after Excel functions and is
therefore very familiar to Excel users. Multidimensional models, on the other hand, are based on dimensional modeling concepts and Multidimensional
Expressions (MDX). Conversely, selecting the incorrect model for the complexity of the business questions can also be expensive. In either case, it is
important that the architect/consultant/developer have a good grasp of:
- Features
- Scripting Language to be used
- Limitations
- Formula Engine
- Storage Engine
Complexity of the Business Process being Modeled
The complexity of the model is very closely related to the other considerations for selecting a model. TABULAR
models are relatively easy to implement because they’re based on tables, rows, columns and relationships but have a limited feature set. Multidimensional
models are built around dimensional models and are much more complex to implement but they also provide a richer model because of the feature set. In
short, Tabular models are great for models that will fit into volatile memory and aren’t highly complex while Multidimensional models are more suitable for
highly complex business models and data volumes that won’t fit entirely into memory.
Technical Considerations
Volume of Data
Another significant consideration is data volumes. Tabular models are entirely in-memory and to get decent
performance, the server will need to have at least (2.6 * the size of the data files on disk) to allow for inflation of the files in memory, processing
operations, disaster recovery, etc. Multidimensional models load data into memory on a partition by partition basis and periodically clean unused
partitions from memory, so it’s possible to server larger volumes of data (read that as multiple Terabytes) on commodity hardware.
Feature Set
As indicated previously, Multidimensional models have been around quite a bit longer than Tabular
models. Commonly used features of Multidimensional models that aren’t available in Tabular models include Actions, Aggregations, Custom Assemblies, Custom
Rollups, Writeback functionality. Tabular models also don’t directly Many-to-Many dimensions, or Parent Child dimensions although it’s possible to model either
using DAX expressions. While Multidimensional has the familiar MOLAP, ROLAP and HOLAP storage modes, Tabular models have two separate storage modes (In-Memory
and Direct Query). In-Memory loads the entire data set into memory when the metadata for a database is requested (usually on startup). As of this writing, Direct
Query is only available when using SQL Server as the data source and executes queries against SQL Server.
Needs for Query and Scripting Support
Analysis Services includes MDX, DMX, DAX, XML/A, and ASSL. Support for these languages varies slightly by model type. If query and
scripting language requirements are a consideration, it becomes important to know which scripting languages are supported by the models. Very briefly:
PowerPivot
Uses DAX for calculations and queries
Supports XMLA
Tabular Models
Support DAX Calculations
Support DAX Queries
Support MDX Queries
Supports XMLA
PowerShell
Traditional Analysis Services
Support MDX Calculations
Support MDX Queries
Support XMLA
PowerShell
Security
All Analysis Services solutions can be secured at the database level. More granular security options vary by mode. With PowerPivot,
security is essentially file based. Using either Tabular or Multidimensional models, security is role based although there are differences in the security
implementations. Tabular models use roles to implement row-level security that is based on DAX formulas. Multidimensional models use roles to implement dimension
and cell-level security.
Need for Extensibility
Since Tabular models don’t support Custom Assemblies, it’s simply not possible to add functionalities
using a custom .NET Assembly or COM application. Multidimensional models, conversely, do support custom assemblies which have been used to implement
custom security solutions, as well as add functionalities that aren’t native to MDX. There are a number of examples of such assemblies on the web as well as
CodePlex.
Data Sources
Multidimensional models support a limited set of data sources, which currently include MS Access, SQL Server, SQL
Azure, Parallel Data Warehouse (PDW)/Analytics Platform System (APS), Oracle, DB2, Informix, Teradata and Sybase. With Tabular models, as long as there is an
OLE DB Provider, ODBC Driver, or REST data source, it’s possible to access the data for processing.