TreeEnsembleFeaturizationEstimatorBase Class

Definition

This class encapsulates the common behavior of all tree-based featurizers, such as FastTreeBinaryFeaturizationEstimator, FastForestBinaryFeaturizationEstimator, FastTreeRegressionFeaturizationEstimator, FastForestRegressionFeaturizationEstimator, and PretrainedTreeFeaturizationEstimator. All tree-based featurizers share the same output schema, computed by GetOutputSchema(SchemaShape). All tree-based featurizers require an input feature column name and a suffix for all output columns. The ITransformer returned by Fit(IDataView) produces three columns: (1) the prediction values of all trees, (2) the IDs of the leaves that the input feature vector falls into, and (3) a binary vector that encodes the paths to those destination leaves.

public abstract class TreeEnsembleFeaturizationEstimatorBase : Microsoft.ML.IEstimator<Microsoft.ML.Trainers.FastTree.TreeEnsembleFeaturizationTransformer>
type TreeEnsembleFeaturizationEstimatorBase = class
    interface IEstimator<TreeEnsembleFeaturizationTransformer>
Public MustInherit Class TreeEnsembleFeaturizationEstimatorBase
Implements IEstimator(Of TreeEnsembleFeaturizationTransformer)
Inheritance
TreeEnsembleFeaturizationEstimatorBase
Derived
Implements
IEstimator<TreeEnsembleFeaturizationTransformer>

Methods

Fit(IDataView)

Produces a TreeEnsembleFeaturizationTransformer that maps the column named InputColumnName in the input data to three output columns.
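
A minimal sketch of calling Fit(IDataView) through one concrete featurizer and reading the three output columns back. It assumes the Microsoft.ML and Microsoft.ML.FastTree NuGet packages are installed. The DataPoint and FeaturizedPoint classes, the synthetic data, and the output column names Trees, Leaves, and Paths are hypothetical choices for illustration. The option names (InputColumnName, TreesColumnName, LeavesColumnName, PathsColumnName, TrainerOptions) are given as I understand the package; check them against the version you have installed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;
using Microsoft.ML.Trainers.FastTree;

public static class FitSketch
{
    // Hypothetical input row: a binary label plus a 3-element feature vector.
    private class DataPoint
    {
        public bool Label { get; set; }

        [VectorType(3)]
        public float[] Features { get; set; }
    }

    // Hypothetical output row that also exposes the three added columns.
    private class FeaturizedPoint : DataPoint
    {
        public float[] Trees { get; set; }
        public float[] Leaves { get; set; }
        public float[] Paths { get; set; }
    }

    // Synthetic data: positive rows get slightly larger feature values.
    private static IEnumerable<DataPoint> MakeData(int count)
    {
        var random = new Random(0);
        for (int i = 0; i < count; i++)
        {
            bool label = random.NextDouble() > 0.5;
            float shift = label ? 0.3f : 0.0f;
            yield return new DataPoint
            {
                Label = label,
                Features = Enumerable.Range(0, 3)
                    .Select(_ => (float)random.NextDouble() + shift)
                    .ToArray()
            };
        }
    }

    public static void Main()
    {
        var mlContext = new MLContext(seed: 0);
        IDataView data = mlContext.Data.LoadFromEnumerable(MakeData(100));

        // Assumed option names; the output column names are arbitrary.
        var options = new FastTreeBinaryFeaturizationEstimator.Options
        {
            InputColumnName = nameof(DataPoint.Features),
            TreesColumnName = nameof(FeaturizedPoint.Trees),
            LeavesColumnName = nameof(FeaturizedPoint.Leaves),
            PathsColumnName = nameof(FeaturizedPoint.Paths),
            TrainerOptions = new FastTreeBinaryTrainer.Options
            {
                NumberOfTrees = 2,
                NumberOfLeaves = 4,
                NumberOfThreads = 1,
                LabelColumnName = nameof(DataPoint.Label),
                FeatureColumnName = nameof(DataPoint.Features)
            }
        };

        // Fit trains the underlying tree ensemble and returns the typed transformer.
        TreeEnsembleFeaturizationTransformer transformer =
            mlContext.Transforms.FeaturizeByFastTreeBinary(options).Fit(data);
        IDataView transformed = transformer.Transform(data);

        // Print the three added columns for the first few rows.
        var rows = mlContext.Data.CreateEnumerable<FeaturizedPoint>(
            transformed, reuseRowObject: false);
        foreach (var row in rows.Take(3))
        {
            Console.WriteLine($"Trees:  [{string.Join(", ", row.Trees)}]");
            Console.WriteLine($"Leaves: [{string.Join(", ", row.Leaves)}]");
            Console.WriteLine($"Paths:  [{string.Join(", ", row.Paths)}]");
        }
    }
}
```

The sketch keeps the ensemble tiny (two trees, four leaves each) so that the printed vectors stay short enough to inspect by eye.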

GetOutputSchema(SchemaShape)

PretrainedTreeFeaturizationEstimator adds three float-vector columns to inputSchema. Given a feature vector column, the added columns are the prediction values of all trees, the IDs of the leaves the feature vector falls into, and the paths to those leaves.

Extension Methods

AppendCacheCheckpoint<TTrans>(IEstimator<TTrans>, IHostEnvironment)

Append a 'caching checkpoint' to the estimator chain. This will ensure that the downstream estimators will be trained against cached data. It is helpful to have a caching checkpoint before trainers that take multiple data passes.
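
A minimal sketch of placing a caching checkpoint between a tree featurizer and a multi-pass trainer, so that the trainer iterates over cached featurized rows. It assumes the Microsoft.ML and Microsoft.ML.FastTree NuGet packages. The DataPoint class, the synthetic data, the output column names, and the choice of SdcaLogisticRegression as the downstream trainer are illustrative assumptions, not something this page prescribes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;
using Microsoft.ML.Trainers.FastTree;

public static class CacheCheckpointSketch
{
    // Hypothetical input row: a binary label plus a 3-element feature vector.
    private class DataPoint
    {
        public bool Label { get; set; }

        [VectorType(3)]
        public float[] Features { get; set; }
    }

    // Synthetic data: positive rows get slightly larger feature values.
    private static IEnumerable<DataPoint> MakeData(int count)
    {
        var random = new Random(0);
        for (int i = 0; i < count; i++)
        {
            bool label = random.NextDouble() > 0.5;
            float shift = label ? 0.3f : 0.0f;
            yield return new DataPoint
            {
                Label = label,
                Features = Enumerable.Range(0, 3)
                    .Select(_ => (float)random.NextDouble() + shift)
                    .ToArray()
            };
        }
    }

    public static void Main()
    {
        var mlContext = new MLContext(seed: 0);
        IDataView data = mlContext.Data.LoadFromEnumerable(MakeData(100));

        // Assumed option names; the output column names are arbitrary.
        var options = new FastTreeBinaryFeaturizationEstimator.Options
        {
            InputColumnName = "Features",
            TreesColumnName = "Trees",
            LeavesColumnName = "Leaves",
            PathsColumnName = "Paths",
            TrainerOptions = new FastTreeBinaryTrainer.Options
            {
                NumberOfTrees = 2,
                NumberOfLeaves = 4,
                NumberOfThreads = 1,
                LabelColumnName = "Label",
                FeatureColumnName = "Features"
            }
        };

        // Featurize, cache the featurized rows, then train a linear model
        // on the per-tree prediction values.
        var pipeline = mlContext.Transforms.FeaturizeByFastTreeBinary(options)
            .AppendCacheCheckpoint(mlContext)
            .Append(mlContext.BinaryClassification.Trainers.SdcaLogisticRegression(
                labelColumnName: "Label", featureColumnName: "Trees"));

        var model = pipeline.Fit(data);
        var metrics = mlContext.BinaryClassification.Evaluate(model.Transform(data));
        Console.WriteLine($"Training-set accuracy: {metrics.Accuracy:F2}");
    }
}
```

The checkpoint only affects training: it caches the data flowing out of the featurizer during Fit and adds no transformation of its own.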

WithOnFitDelegate<TTransformer>(IEstimator<TTransformer>, Action<TTransformer>)

Given an estimator, return a wrapping object that will call a delegate once Fit(IDataView) is called. It is often important for an estimator to return information about what was fit, which is why the Fit(IDataView) method returns a specifically typed object rather than just a general ITransformer. At the same time, IEstimator<TTransformer> instances are often combined into pipelines with many objects, so we may need to build a chain of estimators via EstimatorChain<TLastTransformer> in which the estimator whose transformer we want is buried somewhere in the chain. For that scenario, this method lets us attach a delegate that will be called once Fit is called.
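
A minimal sketch of capturing the typed transformer of a tree featurizer that sits in the middle of a pipeline. It assumes the Microsoft.ML and Microsoft.ML.FastTree NuGet packages. The DataPoint class, the synthetic data, the output column names, and the downstream SdcaLogisticRegression trainer are hypothetical choices for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;
using Microsoft.ML.Trainers.FastTree;

public static class OnFitDelegateSketch
{
    // Hypothetical input row: a binary label plus a 3-element feature vector.
    private class DataPoint
    {
        public bool Label { get; set; }

        [VectorType(3)]
        public float[] Features { get; set; }
    }

    // Synthetic data: positive rows get slightly larger feature values.
    private static IEnumerable<DataPoint> MakeData(int count)
    {
        var random = new Random(0);
        for (int i = 0; i < count; i++)
        {
            bool label = random.NextDouble() > 0.5;
            float shift = label ? 0.3f : 0.0f;
            yield return new DataPoint
            {
                Label = label,
                Features = Enumerable.Range(0, 3)
                    .Select(_ => (float)random.NextDouble() + shift)
                    .ToArray()
            };
        }
    }

    public static void Main()
    {
        var mlContext = new MLContext(seed: 0);
        IDataView data = mlContext.Data.LoadFromEnumerable(MakeData(100));

        // Assumed option names; the output column names are arbitrary.
        var options = new FastTreeBinaryFeaturizationEstimator.Options
        {
            InputColumnName = "Features",
            TreesColumnName = "Trees",
            LeavesColumnName = "Leaves",
            PathsColumnName = "Paths",
            TrainerOptions = new FastTreeBinaryTrainer.Options
            {
                NumberOfTrees = 2,
                NumberOfLeaves = 4,
                NumberOfThreads = 1,
                LabelColumnName = "Label",
                FeatureColumnName = "Features"
            }
        };

        // The delegate stores the featurizer's transformer when the chain is fit.
        TreeEnsembleFeaturizationTransformer captured = null;
        var pipeline = mlContext.Transforms.FeaturizeByFastTreeBinary(options)
            .WithOnFitDelegate(t => captured = t)
            .Append(mlContext.BinaryClassification.Trainers.SdcaLogisticRegression(
                labelColumnName: "Label", featureColumnName: "Trees"));

        Console.WriteLine($"Captured before Fit: {captured != null}");
        pipeline.Fit(data);
        Console.WriteLine($"Captured after Fit: {captured != null}");

        // The captured transformer can be reused on its own, without the trainer.
        IDataView featurized = captured.Transform(data);
        Console.WriteLine($"Output column count: {featurized.Schema.Count}");
    }
}
```

Without the delegate, the fitted chain would expose only the last transformer's type, and the featurizer's transformer would have to be recovered from the chain by position and cast.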
