WordTokenizingEstimator Class
Definition
Important
Some information relates to prerelease product that may be substantially modified before it’s released. Microsoft makes no warranties, express or implied, with respect to the information provided here.
Tokenizes input text using specified delimiters.
public sealed class WordTokenizingEstimator : Microsoft.ML.Data.TrivialEstimator<Microsoft.ML.Transforms.Text.WordTokenizingTransformer>
type WordTokenizingEstimator = class
inherit TrivialEstimator<WordTokenizingTransformer>
Public NotInheritable Class WordTokenizingEstimator
Inherits TrivialEstimator(Of WordTokenizingTransformer)
Inheritance: Object -> TrivialEstimator<WordTokenizingTransformer> -> WordTokenizingEstimator
Remarks
Estimator Characteristics
Does this estimator need to look at the data to train its parameters? | No |
Input column data type | Scalar or Vector of Text |
Output column data type | Variable-size vector of Text |
Exportable to ONNX | Yes |
The resulting WordTokenizingTransformer creates a new column, named by the output column name parameter, in which each input string is mapped to a vector of substrings obtained by splitting the input string on the user-defined delimiters. The space character is the default delimiter.
Empty strings and strings containing only spaces are dropped.
Check the See Also section for links to usage examples.
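The behavior described in these remarks can be illustrated with a minimal sketch. It assumes the Microsoft.ML NuGet package is referenced and creates the estimator through the `TokenizeIntoWords` extension method on the text transforms catalog. The `Input` and `Output` row classes, the `WordTokenizingSketch` class, and the column names "Text" and "Words" are hypothetical names chosen for this illustration only.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;

public static class WordTokenizingSketch
{
    // Hypothetical row types used only for this illustration.
    public sealed class Input { public string Text { get; set; } }
    public sealed class Output { public string[] Words { get; set; } }

    // Tokenizes each input string and returns one token array per row.
    // Passing null for separators falls back to the default (space) delimiter.
    public static string[][] Tokenize(IEnumerable<string> texts, char[] separators = null)
    {
        var mlContext = new MLContext();

        // Wrap the in-memory strings in an IDataView with a single "Text" column.
        var data = mlContext.Data.LoadFromEnumerable(
            texts.Select(t => new Input { Text = t }));

        // Create the estimator; it maps "Text" to a variable-size vector column "Words".
        var estimator = mlContext.Transforms.Text.TokenizeIntoWords(
            "Words", "Text", separators);

        // The estimator does not learn parameters from the data,
        // but Fit is still the step that produces the transformer.
        var transformed = estimator.Fit(data).Transform(data);

        return mlContext.Data
            .CreateEnumerable<Output>(transformed, reuseRowObject: false)
            .Select(row => row.Words)
            .ToArray();
    }
}
```

A call such as `WordTokenizingSketch.Tokenize(new[] { "red,green blue" }, new[] { ' ', ',' })` splits on both spaces and commas. Omitting the second argument splits on spaces only.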
Methods
Fit(IDataView) | (Inherited from TrivialEstimator<TTransformer>) |
GetOutputSchema(SchemaShape) | Returns the SchemaShape of the schema which will be produced by the transformer. Used for schema propagation and verification in a pipeline. |
Extension Methods
AppendCacheCheckpoint<TTrans>(IEstimator<TTrans>, IHostEnvironment) | Append a 'caching checkpoint' to the estimator chain. This will ensure that the downstream estimators will be trained against cached data. It is helpful to have a caching checkpoint before trainers that take multiple data passes. |
WithOnFitDelegate<TTransformer>(IEstimator<TTransformer>, Action<TTransformer>) | Given an estimator, returns a wrapping object that calls a delegate once Fit(IDataView) is called. It is often important for an estimator to return information about what was fit, which is why the Fit(IDataView) method returns a specifically typed object rather than just a general ITransformer. However, IEstimator<TTransformer> objects are often formed into pipelines with many objects, so we may need to build a chain of estimators via EstimatorChain<TLastTransformer> in which the estimator whose transformer we want is buried somewhere in the chain. For that scenario, this method lets us attach a delegate that is called once Fit is called. |
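As a rough sketch of how the two extension methods above might be combined, the following example attaches an on-fit delegate to the tokenizing estimator, adds a caching checkpoint, and then appends a later step. It assumes the Microsoft.ML NuGet package is referenced. The `Input` class, the `ExtensionMethodsSketch` class, the column names, and the choice of `MapValueToKey` as the downstream step are illustrative assumptions, not part of this class's documentation.

```csharp
using Microsoft.ML;
using Microsoft.ML.Transforms.Text;

public static class ExtensionMethodsSketch
{
    // Hypothetical row type used only for this illustration.
    public sealed class Input { public string Text { get; set; } }

    // Fits a small pipeline and returns the tokenizing transformer
    // captured from the middle of the chain.
    public static WordTokenizingTransformer FitAndCapture()
    {
        var mlContext = new MLContext();
        var data = mlContext.Data.LoadFromEnumerable(new[]
        {
            new Input { Text = "the quick brown fox" },
            new Input { Text = "jumps over the lazy dog" },
        });

        WordTokenizingTransformer captured = null;

        var pipeline = mlContext.Transforms.Text
            .TokenizeIntoWords("Words", "Text")
            // Called once the tokenizing step has been fit as part of the chain.
            .WithOnFitDelegate(transformer => captured = transformer)
            // Downstream estimators are trained against cached data.
            .AppendCacheCheckpoint(mlContext)
            // Assumed downstream step: map each token to a key value.
            .Append(mlContext.Transforms.Conversion.MapValueToKey("Keys", "Words"));

        // Fitting the whole chain triggers the delegate for the wrapped estimator.
        pipeline.Fit(data);

        return captured;
    }
}
```

The delegate gives access to the strongly typed WordTokenizingTransformer even though fitting the chain as a whole returns a transformer chain rather than the individual transformer.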