

RobertaPreTokenizer.PreTokenize Method

Definition

Overloads

PreTokenize(ReadOnlySpan<Char>)

Get the offsets and lengths of the tokens relative to the text.

PreTokenize(String)

Get the offsets and lengths of the tokens relative to the text.

PreTokenize(ReadOnlySpan<Char>)

Source:
RobertaPreTokenizer.cs

Get the offsets and lengths of the tokens relative to the text.

public override System.Collections.Generic.IEnumerable<(int Offset, int Length)> PreTokenize(ReadOnlySpan<char> text);
override this.PreTokenize : ReadOnlySpan<char> -> seq<ValueTuple<int, int>>
Public Overrides Function PreTokenize (text As ReadOnlySpan(Of Char)) As IEnumerable(Of ValueTuple(Of Integer, Integer))

Parameters

text
ReadOnlySpan<Char>

The span of characters to split into tokens.

Returns

A sequence of (Offset, Length) pairs, one per token, each measured relative to the start of text.

Applies to

PreTokenize(String)

Source:
RobertaPreTokenizer.cs

Get the offsets and lengths of the tokens relative to the text.

public override System.Collections.Generic.IEnumerable<(int Offset, int Length)> PreTokenize(string text);
public override System.Collections.Generic.IReadOnlyList<Microsoft.ML.Tokenizers.Split> PreTokenize(string? sentence);
override this.PreTokenize : string -> seq<ValueTuple<int, int>>
override this.PreTokenize : string -> System.Collections.Generic.IReadOnlyList<Microsoft.ML.Tokenizers.Split>
Public Overrides Function PreTokenize (text As String) As IEnumerable(Of ValueTuple(Of Integer, Integer))
Public Overrides Function PreTokenize (sentence As String) As IReadOnlyList(Of Split)

Parameters

text / sentence
String

The string to split into tokens.

Returns

The offsets and lengths of the tokens, as (Offset, Length) pairs relative to the original string.
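
To illustrate how offset and length pairs of this shape map back onto the input, here is a minimal sketch. It does not call RobertaPreTokenizer: `SplitOnSpaces` is a hypothetical stand-in that only mimics the `IEnumerable<(int Offset, int Length)>` return type and splits on single spaces, which is not the RoBERTa splitting logic. The class name `OffsetPairsDemo` is likewise an assumption made for this example.

```csharp
using System;
using System.Collections.Generic;

public static class OffsetPairsDemo
{
    // Hypothetical stand-in for a pre-tokenizer. It splits on single spaces
    // only and is NOT the RoBERTa splitting logic; it exists to produce
    // (Offset, Length) pairs with the same shape as the documented return type.
    public static IEnumerable<(int Offset, int Length)> SplitOnSpaces(string text)
    {
        int start = 0;
        for (int i = 0; i <= text.Length; i++)
        {
            // A token ends at a space or at the end of the string.
            if (i == text.Length || text[i] == ' ')
            {
                if (i > start)
                {
                    yield return (start, i - start);
                }
                start = i + 1;
            }
        }
    }

    public static void Main()
    {
        string text = "Hello brave world";
        foreach ((int offset, int length) in SplitOnSpaces(text))
        {
            // Each pair indexes into the original string, so Substring
            // recovers the token text without any extra bookkeeping.
            Console.WriteLine($"{offset} {length} {text.Substring(offset, length)}");
        }
    }
}
```

Because the pairs refer to positions in the original input rather than carrying copies of the token text, a caller can slice the input lazily and only for the tokens it needs.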

Applies to