RecommendationCatalog.CrossValidate Method
Definition
Important
Some information relates to prerelease product that may be substantially modified before it’s released. Microsoft makes no warranties, express or implied, with respect to the information provided here.
Run cross-validation over numberOfFolds folds of data, by fitting estimator, and respecting samplingKeyColumnName if provided. Then evaluate each sub-model against labelColumnName and return metrics.
public System.Collections.Generic.IReadOnlyList<Microsoft.ML.TrainCatalogBase.CrossValidationResult<Microsoft.ML.Data.RegressionMetrics>> CrossValidate (Microsoft.ML.IDataView data, Microsoft.ML.IEstimator<Microsoft.ML.ITransformer> estimator, int numberOfFolds = 5, string labelColumnName = "Label", string samplingKeyColumnName = default, int? seed = default);
member this.CrossValidate : Microsoft.ML.IDataView * Microsoft.ML.IEstimator<Microsoft.ML.ITransformer> * int * string * string * Nullable<int> -> System.Collections.Generic.IReadOnlyList<Microsoft.ML.TrainCatalogBase.CrossValidationResult<Microsoft.ML.Data.RegressionMetrics>>
Public Function CrossValidate (data As IDataView, estimator As IEstimator(Of ITransformer), Optional numberOfFolds As Integer = 5, Optional labelColumnName As String = "Label", Optional samplingKeyColumnName As String = Nothing, Optional seed As Nullable(Of Integer) = Nothing) As IReadOnlyList(Of TrainCatalogBase.CrossValidationResult(Of RegressionMetrics))
Parameters
- data
- IDataView
The data to run cross-validation on.
- estimator
- IEstimator<ITransformer>
The estimator to fit.
- numberOfFolds
- Int32
Number of cross-validation folds.
- labelColumnName
- String
The label column (for evaluation).
- samplingKeyColumnName
- String
Optional name of the column to use as a stratification column. If two examples share the same value of samplingKeyColumnName (if provided), they are guaranteed to appear in the same subset (train or test). Use this to make sure there is no label leakage from the train set to the test set. If this optional parameter is not provided, a stratification column is generated, and its values are random numbers.
- seed
- Nullable<Int32>
Optional seed used in combination with samplingKeyColumnName. If samplingKeyColumnName is not provided, the random numbers generated to create the stratification column use this seed. If the seed is not provided either, the default value is used.
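The interaction between samplingKeyColumnName and seed can be sketched as follows. This is a minimal illustration, not an official sample: the RatingRow type, its SessionId column, and the synthetic data are hypothetical, and it assumes the Microsoft.ML and Microsoft.ML.Recommender packages are referenced. The key mapping is fitted once on the full data so that only the matrix factorization trainer is cross-validated.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.ML;

// Hypothetical row type, used only for this illustration.
public class RatingRow
{
    public string UserId { get; set; }
    public string ItemId { get; set; }
    public string SessionId { get; set; } // rows sharing a session should stay in one fold
    public float Label { get; set; }
}

public static class Program
{
    public static void Main()
    {
        var mlContext = new MLContext(seed: 1);

        // Synthetic ratings: 20 users x 10 items, two sessions per user.
        var rows = new List<RatingRow>();
        for (int u = 0; u < 20; u++)
            for (int i = 0; i < 10; i++)
                rows.Add(new RatingRow
                {
                    UserId = $"u{u}",
                    ItemId = $"i{i}",
                    SessionId = $"u{u}-s{i % 2}",
                    Label = 1 + (u + i) % 5
                });
        IDataView data = mlContext.Data.LoadFromEnumerable(rows);

        // Matrix factorization expects key-typed index columns,
        // so map the string IDs to keys before cross-validating.
        var toKeys = mlContext.Transforms.Conversion
            .MapValueToKey("UserKey", nameof(RatingRow.UserId))
            .Append(mlContext.Transforms.Conversion
                .MapValueToKey("ItemKey", nameof(RatingRow.ItemId)));
        IDataView keyed = toKeys.Fit(data).Transform(data);

        var trainer = mlContext.Recommendation().Trainers.MatrixFactorization(
            labelColumnName: "Label",
            matrixColumnIndexColumnName: "UserKey",
            matrixRowIndexColumnName: "ItemKey");

        // Grouped folds: rows with the same SessionId land in the same subset.
        var grouped = mlContext.Recommendation().CrossValidate(
            keyed, trainer, numberOfFolds: 5, labelColumnName: "Label",
            samplingKeyColumnName: nameof(RatingRow.SessionId));

        // No sampling key: a random stratification column is generated,
        // and the seed is passed so the split can be repeated.
        var seeded = mlContext.Recommendation().CrossValidate(
            keyed, trainer, numberOfFolds: 5, seed: 42);

        Console.WriteLine($"{grouped.Count} grouped folds, {seeded.Count} seeded folds");
    }
}
```

Grouping by user would leave every test user unseen during training, which is rarely useful for matrix factorization, so a coarser hypothetical key (a session) is used here instead.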
Returns
Per-fold results: metrics, models, scored datasets.
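One way to consume the returned list is sketched below. The CrossValidationReport class and its Summarize method are hypothetical helpers introduced for illustration; the Fold, Metrics, Model, and ScoredHoldOutDataSet members are read from each CrossValidationResult<RegressionMetrics>. Choosing the model with the lowest RMSE is only one possible selection rule.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical helper, not part of ML.NET.
public static class CrossValidationReport
{
    // Prints per-fold regression metrics and returns the model of the
    // fold with the lowest root mean squared error.
    public static ITransformer Summarize(
        IReadOnlyList<TrainCatalogBase.CrossValidationResult<RegressionMetrics>> results)
    {
        foreach (var result in results)
        {
            // GetRowCount may return null when the count is not known cheaply.
            long? heldOutRows = result.ScoredHoldOutDataSet.GetRowCount();

            Console.WriteLine(
                $"Fold {result.Fold}: " +
                $"RMSE={result.Metrics.RootMeanSquaredError:F3}, " +
                $"MAE={result.Metrics.MeanAbsoluteError:F3}, " +
                $"R^2={result.Metrics.RSquared:F3}, " +
                $"held-out rows={(heldOutRows.HasValue ? heldOutRows.Value.ToString() : "unknown")}");
        }

        double meanRmse = results.Average(r => r.Metrics.RootMeanSquaredError);
        Console.WriteLine($"Mean RMSE across {results.Count} folds: {meanRmse:F3}");

        return results
            .OrderBy(r => r.Metrics.RootMeanSquaredError)
            .First()
            .Model;
    }
}
```

A call such as CrossValidationReport.Summarize(results) would take the value returned by CrossValidate directly. Averaging a metric across folds usually gives a steadier estimate than any single fold.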