PreTokenizer.PreTokenize(String) Method

Definition

Splits the given string into multiple substrings at word boundaries, keeping track of each substring's offset in the original string.

public abstract System.Collections.Generic.IReadOnlyList<Microsoft.ML.Tokenizers.Split> PreTokenize (string sentence);
abstract member PreTokenize : string -> System.Collections.Generic.IReadOnlyList<Microsoft.ML.Tokenizers.Split>
Public MustOverride Function PreTokenize (sentence As String) As IReadOnlyList(Of Split)

Parameters

sentence
String

The string to split into tokens.

Returns

A list of splits, each containing a token and that token's offset in the original string.
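
Example

The following is a minimal, self-contained sketch of the contract described above: split a string into substrings and record where each one sits in the original. It does not use or derive from `Microsoft.ML.Tokenizers`. The `SimpleSplit` type and the `WhitespaceSplitter` class are hypothetical names introduced only for illustration, and splitting on whitespace is a simplifying assumption about what counts as a word boundary. The actual `Split` type may expose different members. The sketch assumes C# 10 or later for `record struct`.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a split: the token text plus its start offset
// and length in the original string. This is NOT the library's Split type.
public readonly record struct SimpleSplit(string Token, int Start, int Length);

public static class WhitespaceSplitter
{
    // Splits on whitespace (an assumed word boundary) and records where
    // each token begins in the input string.
    public static IReadOnlyList<SimpleSplit> PreTokenize(string sentence)
    {
        if (sentence is null) throw new ArgumentNullException(nameof(sentence));

        var splits = new List<SimpleSplit>();
        int i = 0;
        while (i < sentence.Length)
        {
            // Skip the whitespace between tokens.
            while (i < sentence.Length && char.IsWhiteSpace(sentence[i])) i++;
            int start = i;

            // Consume characters up to the next whitespace.
            while (i < sentence.Length && !char.IsWhiteSpace(sentence[i])) i++;

            if (i > start)
            {
                splits.Add(new SimpleSplit(sentence.Substring(start, i - start), start, i - start));
            }
        }
        return splits;
    }
}

public static class Program
{
    public static void Main()
    {
        // Note the double space: offsets refer to the original string,
        // so "big" starts at index 7, not 6.
        foreach (var split in WhitespaceSplitter.PreTokenize("Hello  big world"))
        {
            Console.WriteLine($"{split.Token} @ {split.Start} (+{split.Length})");
        }
        // Prints:
        // Hello @ 0 (+5)
        // big @ 7 (+3)
        // world @ 11 (+5)
    }
}
```

Because each split carries its offset, a caller can map any token back to its exact position in the input, even when the separators between tokens vary in width.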

Applies to