Creating a Solution and Data Source (Intermediate Data Mining Tutorial)
To work with data mining, you must first create a project in SQL Server Data Tools (SSDT) using the Analysis Services Multidimensional and Data Mining Project template. When you open the template, it loads into the designer all the schemas that you might need for data mining: data sources, mining structures and mining models, and even cubes if your mining structure uses multidimensional data.
When you create the project, your solution is stored as a local file until the solution is deployed. When you deploy the solution, Analysis Services looks for the Analysis Services server specified in the project properties, and creates a new Analysis Services database with the same name as the project. By default, Analysis Services uses the **localhost** instance for new projects. If you are using a named instance, or if you specified a different name for the default instance, you must change the Server deployment property of the project to the instance where you want to create your data mining objects.
For more information about Analysis Services projects, see Create an Analysis Services Project (SSDT).
To create a new Analysis Services project for this tutorial
Open SQL Server Data Tools (SSDT).
On the File menu, point to New, and then click Project.
Select Analysis Services Multidimensional and Data Mining Project from the Installed Templates pane.
In the Name box, name the new project DM Intermediate.
Click OK.
To change the instance where data mining objects are stored (optional)
In SQL Server Data Tools (SSDT), on the Project menu, click Properties.
On the left side of the Property Pages pane, click Deployment.
Verify that the Server name is localhost. If you are using a different instance, type the name of the instance. If you are using a named instance of Analysis Services, type the machine name and then the instance name, separated by a backslash (MachineName\InstanceName). Click OK.
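The naming rule in the last step can be sketched in a few lines of Python. This is only an illustration of how the Server value is composed; the function name `deployment_server_name` and the sample names `DMSERVER` and `SSAS2` are hypothetical and are not part of SSDT or Analysis Services.

```python
def deployment_server_name(machine="localhost", instance=None):
    """Return the text to type in the Server box on the Deployment page.

    A default instance is addressed by machine name alone; a named
    instance is addressed as machine name, backslash, instance name.
    """
    if not machine:
        raise ValueError("machine name must not be empty")
    if instance:
        return f"{machine}\\{instance}"
    return machine


# Default instance on the local computer.
print(deployment_server_name())                     # localhost
# Hypothetical named instance SSAS2 on a computer named DMSERVER.
print(deployment_server_name("DMSERVER", "SSAS2"))  # DMSERVER\SSAS2
```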
To change the deployment properties for a project (optional)
In Solution Explorer, right-click the project, and then select Properties.
-- or --
In SQL Server Data Tools (SSDT), on the Project menu, select Properties.
On the left side of the Property Pages pane, click Deployment.
In the Options pane, select Deployment Mode. Set it to Deploy All to overwrite the objects on the server, or to Deploy Changes Only to update existing objects or add new ones.
Creating a Data Source
In the Basic Data Mining Tutorial, you created a data source that stores connection information for the AdventureWorksDW2012 database. Follow the same steps to create the AdventureWorksDW2012 data source in this solution.
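The connection information that a data source stores amounts to a connection string. The Python sketch below shows roughly what such a string might look like for this tutorial. The helper name `build_connection_string` is hypothetical, and the provider value `SQLNCLI11.1` and the use of Windows integrated security are assumptions; the string that the Data Source Wizard generates on your computer may differ.

```python
def build_connection_string(server="localhost",
                            database="AdventureWorksDW2012",
                            provider="SQLNCLI11.1"):
    """Assemble a semicolon-delimited connection string.

    The provider default and the integrated-security setting are
    assumptions made for illustration only.
    """
    parts = {
        "Provider": provider,
        "Data Source": server,
        "Integrated Security": "SSPI",
        "Initial Catalog": database,
    }
    return ";".join(f"{key}={value}" for key, value in parts.items())


print(build_connection_string())
```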
To create a data source
A single data source can support multiple data source views, and each data source view can have multiple tables. However, because the data source and data source view are deployed to your Microsoft SQL Server Analysis Services database together with the data mining models that you create, as a best practice you should include in each data source view only those tables that are required for each data mining model or group of models.
In the following lessons, you will add data source views to support each of the new scenarios. Only the market basket and sequence clustering lessons use the same data source view; otherwise, each scenario uses a different data source view, so the lessons are independent of each other and can be completed separately.
| Scenario | Data included in the data source view |
|---|---|
| Lesson 2: Building a Forecasting Scenario (Intermediate Data Mining Tutorial) | Monthly sales reports for bicycle models in different regions, collected as a single view. |
| Lesson 3: Building a Market Basket Scenario (Intermediate Data Mining Tutorial) | A table containing a list of customer orders, and a nested table showing the individual purchases for each customer. |
| Lesson 4: Building a Sequence Clustering Scenario (Intermediate Data Mining Tutorial) | The same data that is used for the market basket analysis, with the addition of an identifier that shows the order in which items were purchased. |
| Lesson 5: Building Neural Network and Logistic Regression Models (Intermediate Data Mining Tutorial) | A single table containing some preliminary performance tracking data from a call center. |
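To make the difference between the market basket and sequence clustering data more concrete, the following Python sketch models an order table with a nested table of purchases. All order numbers, model names, and field names here are hypothetical placeholders, not rows or columns from AdventureWorksDW2012.

```python
# Hypothetical sample data: each order (case) holds a nested list of purchases.
orders = [
    {"order": "SO-1", "items": [
        {"line": 2, "model": "Model B"},
        {"line": 1, "model": "Model A"},
    ]},
    {"order": "SO-2", "items": [
        {"line": 1, "model": "Model B"},
    ]},
]


def basket(order):
    """Market basket view: the set of purchased items, ignoring order."""
    return {item["model"] for item in order["items"]}


def sequence(order):
    """Sequence view: items sorted by the identifier that records purchase order."""
    ordered = sorted(order["items"], key=lambda item: item["line"])
    return [item["model"] for item in ordered]


print(sorted(basket(orders[0])))  # ['Model A', 'Model B']
print(sequence(orders[0]))        # ['Model A', 'Model B']
```

Both views read the same nested rows; the sequence view depends on the extra `line` identifier, while the basket view does not.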
Next Lesson
Lesson 2: Building a Forecasting Scenario (Intermediate Data Mining Tutorial)