KnownTokenizerNames enum

參考

套件:: @azure/search-documents

服務接受的 LexicalTokenizerName 已知值。

欄位

Classic	適用于處理大部分歐洲語言檔的文法型 Tokenizer。請參閱 http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/standard/ClassicTokenizer.html
EdgeNGram	將邊緣的輸入權杖化為指定大小 (s) 的 n-gram。請參閱 https://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/ngram/EdgeNGramTokenizer.html
Keyword	以單一語彙基元的形式發出整個輸入。請參閱 http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/core/KeywordTokenizer.html
Letter	在非字母的位置上分割文字。請參閱 http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/core/LetterTokenizer.html
Lowercase	在非字母的位置分割文字，並將其轉換成小寫。請參閱 http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/core/LowerCaseTokenizer.html
MicrosoftLanguageStemmingTokenizer	使用語言特有的規則來分割文字，並將字組縮減到其基本形式。
MicrosoftLanguageTokenizer	使用語言特有的規則分割文字。
NGram	將輸入 Token 化到指定的 n-gram 大小。請參閱 http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/ngram/NGramTokenizer.html
PathHierarchy	路徑類階層的 Token 化工具。請參閱 http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/path/PathHierarchyTokenizer.html
Pattern	使用 RegEx 模式比對來建構不同權杖的 Tokenizer。請參閱 http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/pattern/PatternTokenizer.html
Standard	標準 Lucene 分析器;由標準 Tokenizer、小寫篩選和停止篩選所組成。請參閱 http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/standard/StandardTokenizer.html
UaxUrlEmail	將 URL 和電子郵件 Token 化為一個語彙基元。請參閱 http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/standard/UAX29URLEmailTokenizer.html
Whitespace	在空白字元處分割文字。請參閱 http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/core/WhitespaceTokenizer.html