TreeOptions Class

Definition

Options for tree trainers.

public abstract class TreeOptions : Microsoft.ML.Trainers.TrainerInputBaseWithGroupId
type TreeOptions = class
    inherit TrainerInputBaseWithGroupId
Public MustInherit Class TreeOptions
Inherits TrainerInputBaseWithGroupId
Inheritance
Object → TrainerInputBase → TrainerInputBaseWithLabel → TrainerInputBaseWithWeight → TrainerInputBaseWithGroupId → TreeOptions

Derived
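
Because TreeOptions is abstract, it is configured through one of the concrete options types that derive from it, such as the options classes of the FastTree and FastForest trainers. The C# sketch below is a minimal, illustrative example, not a prescribed usage: it assumes the Microsoft.ML and Microsoft.ML.FastTree packages, and the column names and values are placeholders.

using Microsoft.ML;
using Microsoft.ML.Trainers.FastTree;

// Configure a derived options type (FastTreeBinaryTrainer.Options ultimately
// inherits from TreeOptions) and pass it to the corresponding trainer factory.
var mlContext = new MLContext(seed: 0);

var options = new FastTreeBinaryTrainer.Options
{
    LabelColumnName = "Label",          // inherited from TrainerInputBaseWithLabel
    FeatureColumnName = "Features",     // inherited from TrainerInputBase
    NumberOfTrees = 100,                // total trees in the ensemble
    NumberOfLeaves = 20,                // maximum leaves per regression tree
    MinimumExampleCountPerLeaf = 10     // minimal data points per new leaf
};

var trainer = mlContext.BinaryClassification.Trainers.FastTree(options);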

Constructors

TreeOptions()

Fields

AllowEmptyTrees

When a root split is impossible, allow training to proceed.

BaggingExampleFraction

The fraction of training examples used in each bag. Default is 0.7 (70%).

BaggingSize

Number of trees in each bag (0 for disabling bagging).

Bias

Bias for calculating gradient for each feature bin for a categorical feature.

Bundling

Bundle low-population bins. Bundle.None (0): no bundling; Bundle.AggregateLowPopulation (1): bundle low-population bins together; Bundle.Adjacent (2): bundle adjacent low-population bins.

CategoricalSplit

Whether to split based on multiple categorical feature values.

CompressEnsemble

Compress the tree ensemble.

DiskTranspose

Whether to utilize the disk or the data's native transposition facilities (where applicable) when performing the transpose.

EntropyCoefficient

The entropy (regularization) coefficient between 0 and 1.

ExampleWeightColumnName

Column to use for example weight.

(Inherited from TrainerInputBaseWithWeight)
ExecutionTime

Print execution time breakdown to ML.NET channel.

FeatureColumnName

Column to use for features.

(Inherited from TrainerInputBase)
FeatureFirstUsePenalty

The feature first use penalty coefficient.

FeatureFlocks

Whether to collectivize features during dataset preparation to speed up training.

FeatureFraction

The fraction of features (chosen randomly) to use in each iteration. For example, use 0.9 to train on 90% of the features. Lower values help reduce over-fitting.

FeatureFractionPerSplit

The fraction of features (chosen randomly) to use for each split. If its value is 0.9, 90% of all features are used at each split, in expectation.

FeatureReusePenalty

The feature re-use penalty (regularization) coefficient.

FeatureSelectionSeed

The seed of the active feature selection.

GainConfidenceLevel

Tree fitting gain confidence requirement. Only consider a gain if its likelihood versus a random choice gain is above this value.

HistogramPoolSize

The number of histograms in the pool (between 2 and numLeaves).

LabelColumnName

Column to use for labels.

(Inherited from TrainerInputBaseWithLabel)
MaximumBinCountPerFeature

Maximum number of distinct values (bins) per feature.

MaximumCategoricalGroupCountPerNode

Maximum categorical split groups to consider when splitting on a categorical feature. Split groups are a collection of split points. This is used to reduce overfitting when there are many categorical features.

MaximumCategoricalSplitPointCount

Maximum categorical split points to consider when splitting on a categorical feature.

MemoryStatistics

Print memory statistics to ML.NET channel.

MinimumExampleCountPerLeaf

The minimal number of data points required to form a new tree leaf.

MinimumExampleFractionForCategoricalSplit

Minimum categorical example percentage in a bin to consider for a split. Default is 0.1% of all training examples.

MinimumExamplesForCategoricalSplit

Minimum categorical example count in a bin to consider for a split.

NumberOfLeaves

The maximum number of leaves in each regression tree.

NumberOfThreads

The number of threads to use.

NumberOfTrees

Total number of decision trees to create in the ensemble.

RowGroupColumnName

Column to use for example groupId.

(Inherited from TrainerInputBaseWithGroupId)
Seed

The seed of the random number generator.

Smoothing

Smoothing parameter for tree regularization.

SoftmaxTemperature

The temperature of the randomized softmax distribution for choosing the feature.

SparsifyThreshold

Sparsity level needed to use sparse feature representation.

TestFrequency

Calculate metric values for train/valid/test every k rounds.
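
To show how the fields above combine in practice, the following C# sketch sets the randomization-related options (bagging and feature sub-sampling) on FastForestRegressionTrainer.Options, one of the derived options types. The values are illustrative only, not recommendations.

using Microsoft.ML;
using Microsoft.ML.Trainers.FastTree;

var mlContext = new MLContext(seed: 1);

// Randomization-related TreeOptions fields set on a concrete derived type.
var options = new FastForestRegressionTrainer.Options
{
    NumberOfTrees = 200,
    NumberOfLeaves = 32,
    BaggingSize = 1,                  // trees per bag; 0 disables bagging
    BaggingExampleFraction = 0.7,     // each bag draws 70% of the training examples
    FeatureFraction = 0.9,            // use 90% of the features in each iteration
    FeatureFractionPerSplit = 0.9,    // use 90% of the features at each split
    Seed = 123                        // seed of the random number generator
};

var trainer = mlContext.Regression.Trainers.FastForest(options);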
