Bpe Constructors

Definition

Overloads

Bpe()

Construct a new Bpe model object with no tokenization vocabulary. This constructor is useful only in the training scenario.

Bpe(String, String, String, String, String)

Construct a new Bpe model object to use for sentence tokenization and tokenizer training.

Bpe()

Construct a new Bpe model object with no tokenization vocabulary. This constructor is useful only in the training scenario.

public Bpe ();
Public Sub New ()


Bpe(String, String, String, String, String)

Construct a new Bpe model object to use for sentence tokenization and tokenizer training.

public Bpe (string vocabFile, string? mergesFile, string? unknownToken = default, string? continuingSubwordPrefix = default, string? endOfWordSuffix = default);
new Microsoft.ML.Tokenizers.Bpe : string * string * string * string * string -> Microsoft.ML.Tokenizers.Bpe
Public Sub New (vocabFile As String, mergesFile As String, Optional unknownToken As String = Nothing, Optional continuingSubwordPrefix As String = Nothing, Optional endOfWordSuffix As String = Nothing)

Parameters

vocabFile
String

The path to the JSON file containing the dictionary that maps token strings to their ids.

mergesFile
String

The path to the file containing the list of token pairs to merge.

unknownToken
String

The unknown token to be used by the model.

continuingSubwordPrefix
String

The prefix to attach to sub-word units that do not represent the beginning of a word.

endOfWordSuffix
String

The suffix to attach to sub-word units that represent the end of a word.
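Examples

The following is a minimal sketch of calling this overload, not an official sample. It assumes the preview version of the Microsoft.ML.Tokenizers package that this page documents, in which a Bpe model can be wrapped in a Tokenizer and the result of Encode exposes a Tokens property. The file names, the toy vocabulary, the "[UNK]" token, and the one-pair-per-line merges format are illustrative assumptions.

```csharp
using System;
using System.IO;
using Microsoft.ML.Tokenizers;

// Hypothetical toy files, written to a temp directory so the sketch is self-contained.
string dir = Path.Combine(Path.GetTempPath(), "bpe-sketch");
Directory.CreateDirectory(dir);
string vocabFile = Path.Combine(dir, "vocab.json");
string mergesFile = Path.Combine(dir, "merges.txt");

// vocabFile: a JSON dictionary of token strings and their ids.
File.WriteAllText(vocabFile, "{\"[UNK]\": 0, \"a\": 1, \"b\": 2, \"ab\": 3}");

// mergesFile: one space-separated token pair per line (assumed format).
// The merged token "ab" is also present in the vocabulary above.
File.WriteAllText(mergesFile, "a b\n");

// continuingSubwordPrefix and endOfWordSuffix are left at their defaults.
Bpe model = new Bpe(vocabFile, mergesFile, unknownToken: "[UNK]");

// Assumption: Tokenizer accepts the model directly and uses its default pre-tokenizer.
Tokenizer tokenizer = new Tokenizer(model);
var result = tokenizer.Encode("ab");

Console.WriteLine(string.Join(" ", result.Tokens));
```

In practice, vocabFile and mergesFile would point to the files published alongside a pretrained BPE model rather than to toy files like these.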
