Data Mining Tools
Microsoft SQL Server Analysis Services provides the following tools that you can use to create data mining solutions:
The Data Mining Wizard in SQL Server Data Tools (SSDT) makes it easy to create mining structures and mining models, using either relational data sources or multidimensional data in cubes.
In the wizard, you choose data to use, and then apply specific data mining techniques, such as clustering, neural networks, or time series modeling.
Model viewers are provided in both SQL Server Management Studio and SQL Server Data Tools (SSDT), for exploring your mining models after they are created. You can browse models using viewers tailored to each algorithm, or go deeper into analysis by using the model content viewer.
The Prediction Query Builder is provided in both SQL Server Management Studio and SQL Server Data Tools (SSDT) to help you create prediction queries. You can also test the accuracy of models against a holdout data set or external data, or use cross-validation to assess the quality of your data set.
SQL Server Management Studio is the interface where you manage existing data mining solutions that have been deployed to an instance of Analysis Services. You can reprocess structures and models to update the data in them.
SQL Server Integration Services contains tools that you can use to clean data, to automate tasks such as creating predictions and updating models, and to create text mining solutions.
The following sections provide more information about the data mining tools in SQL Server.
Data Mining Wizard
Use the Data Mining Wizard to get started creating data mining solutions. The wizard guides you through creating a data mining structure and an initial related mining model. Along the way, you select an algorithm type and a data source, and define the case data used for analysis.
For More Information: Data Mining Wizard (Analysis Services - Data Mining)
Data Mining Designer
After you have created a mining structure and mining model by using the Data Mining Wizard, you can use the Data Mining Designer from either SQL Server Data Tools (SSDT) or SQL Server Management Studio to work with existing models and structures.
The designer includes tools for these tasks:
Modify the properties of mining structures, add columns and create column aliases, or change the binning method or expected distribution of values.
Add new models to an existing structure; copy models, change model properties or metadata, or define filters on a mining model.
Browse the patterns and rules within the model; explore associations or decision trees. Get detailed statistics about
Custom viewers are provided for each type of model, to help you analyze your data and explore the patterns revealed by data mining.
Validate models by creating lift charts, or analyzing the profit curve for models. Compare models using classification matrices, or validate a data set and its models by using cross-validation.
Create predictions and content queries against existing mining models. Build one-off queries, or set up queries to generate predictions for entire tables of external data.
For More Information: Data Mining Designer
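To make the validation tasks listed above more concrete, the following Python sketch shows what a classification matrix and a single lift value measure. It is an illustration only, not the designer's implementation: the function names, the holdout labels, and the scores are all hypothetical, and the designer computes these charts for you from the model and its test data.

```python
from collections import Counter


def classification_matrix(actual, predicted):
    """Count how often each (actual, predicted) label pair occurs."""
    return Counter(zip(actual, predicted))


def cumulative_lift(actual, scores, target, fraction):
    """Lift for the top `fraction` of cases ranked by score.

    Lift is the target rate among the top-ranked cases divided by the
    target rate in the whole data set.
    """
    overall_hits = sum(1 for label in actual if label == target)
    if overall_hits == 0:
        raise ValueError("target value does not occur in the actual labels")
    ranked = sorted(zip(scores, actual), key=lambda pair: pair[0], reverse=True)
    top_n = max(1, round(len(ranked) * fraction))
    top_hits = sum(1 for _, label in ranked[:top_n] if label == target)
    return (top_hits / top_n) / (overall_hits / len(actual))


# Hypothetical holdout set: actual outcomes and model scores for "yes".
actual = ["yes", "no", "yes", "no", "no", "yes", "no", "no"]
scores = [0.9, 0.2, 0.8, 0.4, 0.1, 0.7, 0.6, 0.3]
predicted = ["yes" if score >= 0.5 else "no" for score in scores]

matrix = classification_matrix(actual, predicted)
lift_top_quarter = cumulative_lift(actual, scores, "yes", 0.25)

for (actual_label, predicted_label), count in sorted(matrix.items()):
    print(f"actual={actual_label} predicted={predicted_label}: {count}")
print(f"lift in top 25% of cases: {lift_top_quarter:.2f}")
```

A lift above 1 for the top-ranked cases means the model concentrates the target outcome better than random selection would, which is the comparison a lift chart draws across all population fractions.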
SQL Server Management Studio
After you create and deploy mining models to a server, you can use SQL Server Management Studio to manage the Analysis Services database that hosts the data mining objects. You can also continue to perform tasks that use the model, such as exploring the models, processing new data, and creating predictions.
Management Studio also contains query editors that you can use to design and execute Data Mining Extensions (DMX) queries, or to work with data mining objects by using XMLA.
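As a rough idea of what a DMX query looks like, the Python sketch below assembles the text of a singleton prediction query that could be pasted into a DMX query editor. The helper names are hypothetical, and so are the model name (TM Decision Tree), the predictable column (Bike Buyer), and the input columns; substitute the names from your own mining model. The sketch only builds a string and does not connect to a server.

```python
def dmx_name(name):
    """Wrap an identifier in brackets; reject names that would break the quoting."""
    if "[" in name or "]" in name:
        raise ValueError(f"unsupported character in identifier: {name!r}")
    return f"[{name}]"


def dmx_literal(value):
    """Render a number as-is and a string in single quotes."""
    if isinstance(value, str):
        if "'" in value:
            raise ValueError(f"unsupported character in string value: {value!r}")
        return f"'{value}'"
    return str(value)


def build_singleton_prediction(model, target, inputs):
    """Build a DMX singleton query that predicts `target` for one input case."""
    columns = ", ".join(
        f"{dmx_literal(value)} AS {dmx_name(column)}"
        for column, value in inputs.items()
    )
    return (
        f"SELECT Predict({dmx_name(target)}), "
        f"PredictProbability({dmx_name(target)})\n"
        f"FROM {dmx_name(model)}\n"
        "NATURAL PREDICTION JOIN\n"
        f"(SELECT {columns}) AS t"
    )


# Hypothetical model and column names, for illustration only.
query = build_singleton_prediction(
    "TM Decision Tree", "Bike Buyer", {"Age": 35, "Gender": "M"}
)
print(query)
```

NATURAL PREDICTION JOIN maps the input columns to model columns by name, which keeps a one-off query short; a query against a table of external data would instead use PREDICTION JOIN with an explicit ON clause.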
Integration Services Data Mining Tasks and Transformations
SQL Server Integration Services provides many components that support data mining.
Some tools in Integration Services are designed to help automate common data mining tasks, including prediction, model building, and processing. For example:
Create an Integration Services package that automatically updates the model every time the dataset is updated with new customers.
Perform custom segmentation or custom sampling of case records.
Automatically generate models based on parameters.
However, you can also use data mining in a package workflow, as an input to other processes. For example:
Use probability values generated by the model to weight scores for text mining or other classification tasks.
Automatically generate predictions based on prior data and use those values to assess the validity of new data.
Use logistic regression to segment incoming customers by risk.
For More Information: Related Projects for Data Mining Solutions
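The risk segmentation scenario above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how Integration Services implements it: the coefficients, feature names, and thresholds are invented, and in a real package the probability would come from a logistic regression mining model queried inside the data flow rather than from hand-written weights.

```python
import math


def logistic(z):
    """Map a linear score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def risk_probability(features, weights, intercept):
    """Probability of the risk outcome for one customer record."""
    z = intercept + sum(weights[name] * value for name, value in features.items())
    return logistic(z)


def risk_segment(probability, medium=0.3, high=0.7):
    """Assign a segment label from a probability (thresholds are assumptions)."""
    if probability >= high:
        return "high"
    if probability >= medium:
        return "medium"
    return "low"


# Hypothetical coefficients, standing in for a trained model.
weights = {"late_payments": 0.8, "years_as_customer": -0.3}
intercept = -1.0

# Hypothetical incoming customer records.
customers = {
    "A": {"late_payments": 4, "years_as_customer": 1},
    "B": {"late_payments": 0, "years_as_customer": 5},
    "C": {"late_payments": 1, "years_as_customer": 1},
}

segments = {
    customer_id: risk_segment(risk_probability(features, weights, intercept))
    for customer_id, features in customers.items()
}
print(segments)
```

Routing rows by a segment label like this is the kind of decision a package workflow can make downstream, for example by sending high-risk customers to a separate review table.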
See Also
Data Mining Extensions (DMX) Reference
Mining Model Tasks and How-tos
Mining Model Viewer Tasks and How-tos
Data Mining Solutions