System Stored Procedures for Decision Tree and Linear Regression Models (Data Mining)
This article is a continuation of the following article that introduces the system stored procedures for SQL Server Data Mining.
The following stored procedures can be used with models that use the Microsoft Decision Trees algorithm. You can also use them with linear regression models, because regression models are based on the Microsoft Decision Trees algorithm.
DTAddNodes
Adds new nodes to an existing graph and returns a complete graph of existing and new nodes.
Fully qualified procedure name
System.Microsoft.AnalysisServices.System.DataMining.DecisionTrees.DTAddNodes
Parameters
Model name as string
List of new node IDs as string
List of existing node IDs as string
Separate node IDs by using a semi-colon.
Example
This procedure returns a table similar to the NODE_TYPE=2 section of DTGetNodeGraph, but without the NODE_TYPE column.
The rows in the result set contain all the edges between the nodes added by the second argument, and all of the edges between the added nodes and the nodes already in the graph, as specified in the third argument.
CALL System.DTAddNodes ([TM Decision Tree',’1;3’,’2;4’)
Sample results
NODE_UNIQUE_NAME_1 |
NODE_UNIQUE_NAME_1 |
MSOLAP_NODE_SCORE |
2 |
1 |
2 |
4 |
1 |
4 |
3 |
1 |
1 |
Notes
This stored procedure is used by the dependency network viewer.
See also
No additional resources
CalculateTreeDepths
Returns the depth of all trees in the model. A decision tree model generates a separate tree for each predictable output.
Fully qualified procedure name
System.CalculateTreeDepths
Parameters
Model name as string
Example
This example returns the depth of the tree for the sample model, which has only one predictable attribute, Bike Buyer.
The number for tree depth includes the (All) node for the model, but does not include the model root node.
CALL System.CalculateTreeDepths('TM_Decision_Tree')
Sample results
ATTRIBUTE_NAME |
NODE_UNIQUE_NAMe |
TREE_DEPTH |
Bike Buyer |
000000001 |
11 |
Notes
To find all leaf nodes in a model, you can search for nodes that have CHILDREN_CARDINALITY equal to 0.
The length of the node unique IDs is also an indicator of the node’s depth in the tree, since the IDs are created in such a way that you can identify the parent node based on the child node ID value. For example, compare these node IDs in the decision tree model. Note that the ID of the parent is repeated in each child node.
Node name or attribute |
ID |
(All) node |
000000001 |
Age < 38 First child of (All) node |
00000000100 |
Age >= 38 and < 62 Second child of (All) node
|
00000000101 |
Number of Cars = 0 |
0000000010000 |
Number of Cars =! 0 |
0000000010001 |
See also
No additional resources
DTGetNodeGraph
Returns a list of nodes that can be used to construct a dependency network graph.
Fully qualified procedure name
System.Microsoft.AnalysisServices.System.DataMining.DecisionTrees.DTGetNodeGraph
Parameters
Model name as string
Number of nodes to display as integer
Example
CALL System.DTGetNodeGraph ('[TM Decision Tree]',5)
Sample results
NODE_TYPE |
NODE_UNIQUE_NAME_1 |
NODE_UNIQUE_NAME_2 |
MS_OLAP_NODE_SCORE |
1 |
6 |
Number Cars Owned |
|
1 |
1 |
Bike Buyer |
|
1 |
0 |
Age |
|
2 |
6 |
1 |
9 |
2 |
0 |
1 |
10 |
The columns returned by this procedure have the following content:
NODE_TYPE:
Has either the value 1 or 2. Depending on this valuetype, the content of the remaining columns will be different. In general, when NODE_TYPE=1, the remaining columns describe the node contents. When NODE_TYPE=2, the remaining columns describe the graph.NODE_UNIQUE_NAME1:
The contents of this column depend on the value type for NODE_TYPE:
If the value is 1, NODE_UNIQUE_NAME1 represents a unique identifier for the nodes, which is only for use within the graph, and does not necessarily match the NODE_UNIQUE_NAME in the model content.
If the value is 2, both NODE_UNIQUE_NAME1 and NODE_UNIQUE_NAME2 represent the endpoints of a directed edge in the dependency network graph.
NODE_UNIQUE_NAME2:
If NODE_TYPE=2, contains information from the node caption.
If NODE_TYPE=2, both NODE_UNIQUE_NAME1 and NODE_UNIQUE_NAME2 contain the endpoints of a directed edge in the dependency network graph.
MSOLAP_NODE_SCORE:
If NODE_TYPE=1, the column is blank.
If NODE_TYPE=2, the column MSOLAP_NODE_SCORE contains the relative weight of the edge that is used to show and hide edges when the slider is moved in the dependency network viewer.
Notes
This stored procedure is used to populate the dependency network graph.
If there are more nodes in the model for both input and output attributes than the number specified in the arguments, the stored procedure balances the result set so that a fair number of both inputs and outputs are represented and an attractive graph can be rendered.
See also
No additional resources
DTGetNodes
Returns a row for all potential nodes in the dependency network representation of a decision tree model.
Fully qualified procedure name
System.Microsoft.AnalysisServices.System.DataMining.DecisionTrees.DTGetNodes
Parameters
Model name as string
Example
CALL System.DTGetNodes ('[TM_Decision_Tree]’)
Sample results
NODE-UNIQUE_NAME |
NODE_CAPTION |
0 |
Age |
1 |
Bike Buyer |
4 |
English Occupation |
3 |
English Education |
The columns returned by this procedure are defined as follows:
NODE_UNIQUE_NAME1:
An identifier that is unique for the dependency network. The value is not necessarily related to the NODE_UNIQUE_NAME in the model content.NODE_CAPTION:
The name of the node.
Notes
DTGetNodes is used by the decision tree dependency network viewer when you click the Add Nodes button.
See also
No additional resources
GetTreeScores
Takes the name of a decision tree model and returns a table containing a row for every tree in the model. A tree represents an analysis of the factors that affect a single target variable.
Fully qualified procedure name
System.Microsoft.AnalysisServices.System.DataMining.DecisionTrees.GetTreeScores
Parameters
Model name as string
Example
CALL System.GetTreeScores('TM_DecisionTree')
Sample results
ATTRIBUTE_NAME |
NODE_UNIQUE_NAME |
MS_OLAP_NOODE_SCORE |
Bike Buyer |
000000001 |
0.9 |
The columns returned by this procedure are defined as follows:
ATTRIBUTE_NAME:
The name of the tree. For example, in the AdventureWorks sample decision trees model, the tree name is [Bike buyer], after the target attribute.NODE_UNIQUE_NAME:
The content node representing the root of the tree. This information is useful in complex models with many outcomes.MSOLAP_NODE_SCORE:
A number representing the amount of information in the tree. This number is roughly related to the number of nodes in each tree.
Notes
The decision tree viewer uses this procedure to populate the drop-down tree selector.
See also
The following article describes a scenario where this stored procedure might be useful.
Rendering a Decision Tree using Reporting Services