TextCatalog Class

Reference

Definition

Namespace: Microsoft.ML

Assembly: Microsoft.ML.Transforms.dll

Package: Microsoft.ML (v1.0.0, v1.1.0, v1.2.0, v1.3.1, v1.4.0, v1.5.5, v1.6.0, v1.7.0, v2.0.0, v3.0.1)

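Important

Some information relates to prerelease product that may be substantially modified before it's released. Microsoft makes no warranties, express or implied, with respect to the information provided here.

Collection of extension methods for the TransformsCatalog.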

public static class TextCatalog

type TextCatalog = class

Public Module TextCatalog

Inheritance: Object → TextCatalog

Methods

ApplyWordEmbedding(TransformsCatalog+TextTransforms, String, String, String)	Creates a WordEmbeddingEstimator, a text featurizer that converts vectors of text into numerical vectors using a pretrained embedding model.
ApplyWordEmbedding(TransformsCatalog+TextTransforms, String, String, WordEmbeddingEstimator+PretrainedModelKind)	Creates a WordEmbeddingEstimator, a text featurizer that converts vectors of text into numerical vectors using a pretrained embedding model.
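FeaturizeText(TransformsCatalog+TextTransforms, String, String)	Creates a TextFeaturizingEstimator, which transforms a text column into a featurized vector of Single that represents normalized counts of n-grams and char-grams.
FeaturizeText(TransformsCatalog+TextTransforms, String, TextFeaturizingEstimator+Options, String[])	Creates a TextFeaturizingEstimator, which transforms a text column into a featurized vector of Single that represents normalized counts of n-grams and char-grams.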
LatentDirichletAllocation(TransformsCatalog+TextTransforms, String, String, Int32, Single, Single, Int32, Int32, Int32, Int32, Int32, Int32, Int32, Boolean)	Creates a LatentDirichletAllocationEstimator, which uses LightLDA to transform text (represented as a vector of floats) into a vector of Single indicating the similarity of the text to each identified topic.
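NormalizeText(TransformsCatalog+TextTransforms, String, String, TextNormalizingEstimator+CaseMode, Boolean, Boolean, Boolean)	Creates a TextNormalizingEstimator, which normalizes the incoming text in `inputColumnName` by optionally changing case and removing diacritical marks, punctuation marks, and numbers, and outputs the new text as `outputColumnName`.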
ProduceHashedNgrams(TransformsCatalog+TextTransforms, String, String, Int32, Int32, Int32, Boolean, UInt32, Boolean, Int32, Boolean)	Creates an NgramHashingEstimator, which copies the data from the column specified in `inputColumnName` to a new column, `outputColumnName`, and produces a vector of counts of hashed n-grams.
ProduceHashedNgrams(TransformsCatalog+TextTransforms, String, String[], Int32, Int32, Int32, Boolean, UInt32, Boolean, Int32, Boolean)	Creates an NgramHashingEstimator, which takes the data from the multiple columns specified in `inputColumnNames` into a new column, `outputColumnName`, and produces a vector of counts of hashed n-grams.
ProduceHashedWordBags(TransformsCatalog+TextTransforms, String, String, Int32, Int32, Int32, Boolean, UInt32, Boolean, Int32)	Creates a WordHashBagEstimator, which maps the column specified in `inputColumnName` to a vector of counts of hashed n-grams in a new column named `outputColumnName`.
ProduceHashedWordBags(TransformsCatalog+TextTransforms, String, String[], Int32, Int32, Int32, Boolean, UInt32, Boolean, Int32)	Creates a WordHashBagEstimator, which maps the multiple columns specified in `inputColumnNames` to a vector of counts of hashed n-grams in a new column named `outputColumnName`.
ProduceNgrams(TransformsCatalog+TextTransforms, String, String, Int32, Int32, Boolean, Int32, NgramExtractingEstimator+WeightingCriteria)	Creates an NgramExtractingEstimator, which produces a vector of counts of n-grams (sequences of consecutive words) encountered in the input text.
ProduceWordBags(TransformsCatalog+TextTransforms, String, Char, Char, String, Int32)	Creates a WordBagEstimator, which maps the column specified in `inputColumnName` to a vector of n-gram counts in a new column named `outputColumnName`.
ProduceWordBags(TransformsCatalog+TextTransforms, String, String, Int32, Int32, Boolean, Int32, NgramExtractingEstimator+WeightingCriteria)	Creates a WordBagEstimator, which maps the column specified in `inputColumnName` to a vector of n-gram counts in a new column named `outputColumnName`.
ProduceWordBags(TransformsCatalog+TextTransforms, String, String[], Int32, Int32, Boolean, Int32, NgramExtractingEstimator+WeightingCriteria)	Creates a WordBagEstimator, which maps the multiple columns specified in `inputColumnNames` to a vector of n-gram counts in a new column named `outputColumnName`.
RemoveDefaultStopWords(TransformsCatalog+TextTransforms, String, String, StopWordsRemovingEstimator+Language)	Creates a StopWordsRemovingEstimator, which copies the data from the column specified in `inputColumnName` to a new column, `outputColumnName`, and removes from it the predefined set of stop words for the given `language`.
RemoveStopWords(TransformsCatalog+TextTransforms, String, String, String[])	Creates a CustomStopWordsRemovingEstimator, which copies the data from the column specified in `inputColumnName` to a new column, `outputColumnName`, and removes from it the words specified in `stopwords`.
TokenizeIntoCharactersAsKeys(TransformsCatalog+TextTransforms, String, String, Boolean)	Creates a TokenizingByCharactersEstimator, which tokenizes by splitting text into sequences of characters using a sliding window.
TokenizeIntoWords(TransformsCatalog+TextTransforms, String, String, Char[])	Creates a WordTokenizingEstimator, which tokenizes input text using `separators` as the delimiters.
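
The methods above are extension methods reached through `mlContext.Transforms.Text`, and the estimators they return can be chained. The following is a minimal sketch, not an official sample: it assumes the Microsoft.ML NuGet package is referenced, and the `InputRow` and `OutputRow` types, the sample sentences, and the column names (`Normalized`, `Tokens`, `FilteredTokens`, `Features`) are hypothetical names chosen for illustration. It normalizes text, splits it into words, removes the default English stop words, and separately featurizes the raw text with FeaturizeText.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.ML;
using Microsoft.ML.Transforms.Text;

public static class TextCatalogSketch
{
    // Hypothetical input row type; the property name becomes the column name "Text".
    private sealed class InputRow
    {
        public string Text { get; set; }
    }

    // Hypothetical output row type; property names must match the output column names.
    private sealed class OutputRow
    {
        public string[] FilteredTokens { get; set; }
        public float[] Features { get; set; }
    }

    public static void Main()
    {
        var mlContext = new MLContext(seed: 0);

        // Illustrative sample data (an assumption, not taken from the reference page).
        var rows = new List<InputRow>
        {
            new InputRow { Text = "ML.NET makes text featurization straightforward." },
            new InputRow { Text = "The quick brown fox jumps over the lazy dog!" },
        };
        IDataView data = mlContext.Data.LoadFromEnumerable(rows);

        var text = mlContext.Transforms.Text;

        // Each call takes the output column name first, then the input column name.
        var pipeline = text
            .NormalizeText("Normalized", "Text",
                caseMode: TextNormalizingEstimator.CaseMode.Lower,
                keepPunctuations: false)
            .Append(text.TokenizeIntoWords("Tokens", "Normalized",
                separators: new[] { ' ' }))
            .Append(text.RemoveDefaultStopWords("FilteredTokens", "Tokens",
                language: StopWordsRemovingEstimator.Language.English))
            // FeaturizeText works on the raw text column and runs its own internal steps.
            .Append(text.FeaturizeText("Features", "Text"));

        ITransformer model = pipeline.Fit(data);
        IDataView transformed = model.Transform(data);

        // Read back only the columns declared on OutputRow.
        foreach (var row in mlContext.Data.CreateEnumerable<OutputRow>(
                     transformed, reuseRowObject: false))
        {
            Console.WriteLine($"Tokens: {string.Join(", ", row.FilteredTokens)}");
            Console.WriteLine($"Feature vector length: {row.Features.Length}");
        }
    }
}
```

The explicit NormalizeText, TokenizeIntoWords, and RemoveDefaultStopWords steps are shown only to make the individual transforms visible. When a single numeric feature vector is all that is needed, FeaturizeText alone is usually the simpler choice.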
