What's New in SQL Server 2008 Data Mining

 

Engine and algorithm improvements: To improve the accuracy and stability of some predictions in time series models, a second algorithm has been added to the Microsoft Time Series algorithm. Based on the well-known ARIMA algorithm, it provides better long-term predictions than ARTxp, the algorithm that Analysis Services has used until now. (ARTxp is an autoregressive tree algorithm that is optimized for either a single time slice or short-term predictions.)

Enhanced mining structures: When creating a mining structure, you can now divide the data in the mining structure into permanent training and testing sets. The definition of the partition is stored with the structure, and you can reuse the test set with any mining models that are based on that structure.
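
To illustrate the idea of a stored, reusable partition, here is a minimal Python sketch. It is not how Analysis Services implements the feature; the function name split_holdout and its parameters holdout_percent and seed are hypothetical, chosen only for this example. The point is that fixing the partition definition (here, a percentage and a random seed) yields the same test set every time, so every model built on the same data is evaluated against the same cases.

```python
import random

def split_holdout(cases, holdout_percent=30, seed=42):
    """Split cases into (training, testing) lists in a reproducible way.

    holdout_percent and seed are hypothetical names for this sketch;
    together they play the role of the stored partition definition.
    """
    rng = random.Random(seed)          # fixed seed -> same shuffle every run
    indices = list(range(len(cases)))
    rng.shuffle(indices)
    n_test = len(cases) * holdout_percent // 100
    test_idx = set(indices[:n_test])
    # Preserve the original case order inside each partition.
    training = [c for i, c in enumerate(cases) if i not in test_idx]
    testing = [c for i, c in enumerate(cases) if i in test_idx]
    return training, testing

cases = list(range(100))               # stand-in for 100 structure cases
train_a, test_a = split_holdout(cases)
train_b, test_b = split_holdout(cases) # same definition, same partition
```

Because the split depends only on the data and the stored definition, test_a and test_b contain exactly the same cases, which is what makes comparisons between models fair.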

You can now query the data cached in a mining structure, much as you could already query case details from a model.

You can now attach filters to a mining model and use the filter during both training and testing. Applying a filter to the model lets you control the data that is used to train the model and lets you more easily assess the performance of the model on subsets of the data.
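
As a rough sketch of what a model filter does conceptually, the Python below applies one predicate to both the training and the testing cases. Everything here is an assumption made for illustration: the helper names apply_filter, train_rate_model, and over_30, the sample cases, and the toy "model" (a simple purchase rate) are all hypothetical and are not part of Analysis Services.

```python
def apply_filter(cases, predicate):
    """Keep only the cases for which the filter predicate is true."""
    return [case for case in cases if predicate(case)]

def train_rate_model(training_cases):
    """Toy 'model': the fraction of training cases where bought is True."""
    return sum(c["bought"] for c in training_cases) / len(training_cases)

def over_30(case):
    """Hypothetical filter condition: customers older than 30."""
    return case["age"] > 30

training = [
    {"age": 25, "bought": False},
    {"age": 41, "bought": True},
    {"age": 37, "bought": True},
    {"age": 29, "bought": False},
    {"age": 52, "bought": False},
]
testing = [
    {"age": 33, "bought": True},
    {"age": 22, "bought": False},
    {"age": 60, "bought": True},
]

# The same filter is used for training and for testing, so the model is
# both built on and assessed against the same subset of the data.
filtered_training = apply_filter(training, over_30)
filtered_testing = apply_filter(testing, over_30)
purchase_rate = train_rate_model(filtered_training)
```

Only the over-30 cases reach the toy model, and only the over-30 test cases are used to assess it.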

Cross-validation is an established method of assessing the accuracy of data mining models. In cross-validation, you iteratively partition the mining structure data into subsets, build models on the subsets, and then measure the accuracy of the model for each partition. By reviewing the returned statistics, you can determine how reliable the mining model is and more easily compare models that are based on the same structure.
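
The procedure described above can be sketched in a few lines of Python. This is a generic k-fold illustration, not the Analysis Services implementation: the function names are hypothetical, the folds are simple contiguous slices, and the "model" is a deliberately trivial majority-class predictor so that the mechanics stay visible.

```python
from collections import Counter

def k_fold_indices(n_cases, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    fold_sizes = [n_cases // k + (1 if i < n_cases % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_cases) if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size

def train_majority(labels):
    """Toy 'model': always predict the most common training label."""
    return Counter(labels).most_common(1)[0][0]

def cross_validate(labels, k):
    """Return the accuracy measured on each of the k held-out folds."""
    accuracies = []
    for train_idx, test_idx in k_fold_indices(len(labels), k):
        prediction = train_majority([labels[i] for i in train_idx])
        correct = sum(1 for i in test_idx if labels[i] == prediction)
        accuracies.append(correct / len(test_idx))
    return accuracies

labels = ["yes", "no", "yes", "yes", "no", "yes", "yes", "no", "yes", "yes"]
fold_accuracies = cross_validate(labels, k=5)
mean_accuracy = sum(fold_accuracies) / len(fold_accuracies)
```

A large spread between the per-fold accuracies suggests the model is sensitive to which cases it was trained on; similar values across folds suggest it is more reliable.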

Data Mining add-ins for Office 2007: The Market Basket Analysis and Prediction Calculator tools have been added.

For more info:

What's New (Analysis Services - Data Mining): https://msdn2.microsoft.com/en-us/library/bb510513(SQL.100).aspx