PcaCatalog.RandomizedPca Method
Definition
Overloads
RandomizedPca(AnomalyDetectionCatalog+AnomalyDetectionTrainers, RandomizedPcaTrainer+Options) | Creates a RandomizedPcaTrainer with advanced options. The trainer fits an approximate principal component analysis (PCA) model using a randomized singular value decomposition (SVD) algorithm.
RandomizedPca(AnomalyDetectionCatalog+AnomalyDetectionTrainers, String, String, Int32, Int32, Boolean, Nullable<Int32>) | Creates a RandomizedPcaTrainer, which fits an approximate principal component analysis (PCA) model using a randomized singular value decomposition (SVD) algorithm.
RandomizedPca(AnomalyDetectionCatalog+AnomalyDetectionTrainers, RandomizedPcaTrainer+Options)
Creates a RandomizedPcaTrainer with advanced options. The trainer fits an approximate principal component analysis (PCA) model using a randomized singular value decomposition (SVD) algorithm.
public static Microsoft.ML.Trainers.RandomizedPcaTrainer RandomizedPca (this Microsoft.ML.AnomalyDetectionCatalog.AnomalyDetectionTrainers catalog, Microsoft.ML.Trainers.RandomizedPcaTrainer.Options options);
static member RandomizedPca : Microsoft.ML.AnomalyDetectionCatalog.AnomalyDetectionTrainers * Microsoft.ML.Trainers.RandomizedPcaTrainer.Options -> Microsoft.ML.Trainers.RandomizedPcaTrainer
<Extension()>
Public Function RandomizedPca (catalog As AnomalyDetectionCatalog.AnomalyDetectionTrainers, options As RandomizedPcaTrainer.Options) As RandomizedPcaTrainer
Parameters
- catalog
- AnomalyDetectionCatalog.AnomalyDetectionTrainers
The anomaly detection catalog trainer object.
- options
- RandomizedPcaTrainer.Options
Advanced options for the algorithm.
Returns
- RandomizedPcaTrainer
Examples
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

namespace Samples.Dynamic.Trainers.AnomalyDetection
{
    public static class RandomizedPcaSampleWithOptions
    {
        public static void Example()
        {
            // Create a new context for ML.NET operations. It can be used for
            // exception tracking and logging, as a catalog of available operations
            // and as the source of randomness. Setting the seed to a fixed number
            // in this example to make outputs deterministic.
            var mlContext = new MLContext(seed: 0);

            // Training data.
            var samples = new List<DataPoint>()
            {
                new DataPoint(){ Features = new float[3] {0, 2, 1} },
                new DataPoint(){ Features = new float[3] {0, 2, 3} },
                new DataPoint(){ Features = new float[3] {0, 2, 4} },
                new DataPoint(){ Features = new float[3] {0, 2, 1} },
                new DataPoint(){ Features = new float[3] {0, 2, 2} },
                new DataPoint(){ Features = new float[3] {0, 2, 3} },
                new DataPoint(){ Features = new float[3] {0, 2, 4} },
                new DataPoint(){ Features = new float[3] {1, 0, 0} }
            };

            // Convert the List<DataPoint> to IDataView, a consumable format to
            // ML.NET functions.
            var data = mlContext.Data.LoadFromEnumerable(samples);

            var options = new Microsoft.ML.Trainers.RandomizedPcaTrainer.Options()
            {
                FeatureColumnName = nameof(DataPoint.Features),
                Rank = 1,
                Seed = 10,
            };

            // Create an anomaly detector. Its underlying algorithm is randomized
            // PCA.
            var pipeline = mlContext.AnomalyDetection.Trainers.RandomizedPca(
                options);

            // Train the anomaly detector.
            var model = pipeline.Fit(data);

            // Apply the trained model on the training data.
            var transformed = model.Transform(data);

            // Read ML.NET predictions into IEnumerable<Result>.
            var results = mlContext.Data.CreateEnumerable<Result>(transformed,
                reuseRowObject: false).ToList();

            // Let's go through all predictions.
            for (int i = 0; i < samples.Count; ++i)
            {
                // The i-th example's prediction result.
                var result = results[i];

                // The i-th example's feature vector in text format.
                var featuresInText = string.Join(',', samples[i].Features);

                if (result.PredictedLabel)
                    // The i-th sample is predicted as an outlier.
                    Console.WriteLine("The {0}-th example with features [{1}] is " +
                        "an outlier with a score of being outlier {2}", i,
                        featuresInText, result.Score);
                else
                    // The i-th sample is predicted as an inlier.
                    Console.WriteLine("The {0}-th example with features [{1}] is " +
                        "an inlier with a score of being outlier {2}",
                        i, featuresInText, result.Score);
            }

            // Lines printed out should be
            // The 0-th example with features [0,2,1] is an inlier with a score of being outlier 0.2264826
            // The 1-th example with features [0,2,3] is an inlier with a score of being outlier 0.1739471
            // The 2-th example with features [0,2,4] is an inlier with a score of being outlier 0.05711612
            // The 3-th example with features [0,2,1] is an inlier with a score of being outlier 0.2264826
            // The 4-th example with features [0,2,2] is an inlier with a score of being outlier 0.3868995
            // The 5-th example with features [0,2,3] is an inlier with a score of being outlier 0.1739471
            // The 6-th example with features [0,2,4] is an inlier with a score of being outlier 0.05711612
            // The 7-th example with features [1,0,0] is an outlier with a score of being outlier 0.6260795
        }

        // Example with 3 feature values. A training data set is a collection of
        // such examples.
        private class DataPoint
        {
            [VectorType(3)]
            public float[] Features { get; set; }
        }

        // Class used to capture prediction of DataPoint.
        private class Result
        {
            // Outlier gets true while inlier has false.
            public bool PredictedLabel { get; set; }

            // Inlier gets smaller score. Score is between 0 and 1.
            public float Score { get; set; }
        }
    }
}
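Remarks
By default, the threshold used to determine the label of a data point from its predicted score is 0.5. Scores range from 0 to 1. A data point with a predicted score higher than 0.5 is considered an outlier. Use ChangeModelThreshold<TModel>(AnomalyPredictionTransformer<TModel>, Single) to change this threshold.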
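The threshold change mentioned in these remarks can be sketched as follows. This is a minimal illustration, not the library's reference usage: it assumes ChangeModelThreshold is called through mlContext.AnomalyDetection and that the Microsoft.ML package is referenced. The class name ChangeThresholdSketch, the helper PredictLabels, and the threshold value 0.3 are hypothetical choices made for this sketch; the data points are taken from the sample on this page.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

public static class ChangeThresholdSketch
{
    // Same shape as the DataPoint class used in the sample on this page.
    public class DataPoint
    {
        [VectorType(3)]
        public float[] Features { get; set; }
    }

    // Captures the prediction columns produced by the anomaly detector.
    public class Result
    {
        public bool PredictedLabel { get; set; }
        public float Score { get; set; }
    }

    // Hypothetical helper: trains on a small data set and returns the
    // predicted labels after the decision threshold has been changed.
    public static List<bool> PredictLabels(float threshold)
    {
        var mlContext = new MLContext(seed: 0);

        var samples = new List<DataPoint>
        {
            new DataPoint { Features = new float[] { 0, 2, 1 } },
            new DataPoint { Features = new float[] { 0, 2, 3 } },
            new DataPoint { Features = new float[] { 0, 2, 4 } },
            new DataPoint { Features = new float[] { 0, 2, 2 } },
            new DataPoint { Features = new float[] { 1, 0, 0 } }
        };
        var data = mlContext.Data.LoadFromEnumerable(samples);

        // Train with the simple overload; rank 1 is an illustrative choice.
        var model = mlContext.AnomalyDetection.Trainers
            .RandomizedPca(featureColumnName: nameof(DataPoint.Features), rank: 1)
            .Fit(data);

        // Replace the default threshold of 0.5 with the requested value.
        var adjusted = mlContext.AnomalyDetection
            .ChangeModelThreshold(model, threshold);

        return mlContext.Data
            .CreateEnumerable<Result>(adjusted.Transform(data), reuseRowObject: false)
            .Select(r => r.PredictedLabel)
            .ToList();
    }

    public static void Main()
    {
        // Compare the default threshold with a lower, stricter one.
        foreach (var threshold in new[] { 0.5f, 0.3f })
        {
            var outliers = PredictLabels(threshold).Count(isOutlier => isOutlier);
            Console.WriteLine("threshold {0}: {1} outlier(s)", threshold, outliers);
        }
    }
}
```

Because the scores themselves do not depend on the threshold, lowering it can only keep or increase the number of data points labeled as outliers.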
RandomizedPca(AnomalyDetectionCatalog+AnomalyDetectionTrainers, String, String, Int32, Int32, Boolean, Nullable<Int32>)
Creates a RandomizedPcaTrainer, which fits an approximate principal component analysis (PCA) model using a randomized singular value decomposition (SVD) algorithm.
public static Microsoft.ML.Trainers.RandomizedPcaTrainer RandomizedPca (this Microsoft.ML.AnomalyDetectionCatalog.AnomalyDetectionTrainers catalog, string featureColumnName = "Features", string exampleWeightColumnName = default, int rank = 20, int oversampling = 20, bool ensureZeroMean = true, int? seed = default);
static member RandomizedPca : Microsoft.ML.AnomalyDetectionCatalog.AnomalyDetectionTrainers * string * string * int * int * bool * Nullable<int> -> Microsoft.ML.Trainers.RandomizedPcaTrainer
<Extension()>
Public Function RandomizedPca (catalog As AnomalyDetectionCatalog.AnomalyDetectionTrainers, Optional featureColumnName As String = "Features", Optional exampleWeightColumnName As String = Nothing, Optional rank As Integer = 20, Optional oversampling As Integer = 20, Optional ensureZeroMean As Boolean = true, Optional seed As Nullable(Of Integer) = Nothing) As RandomizedPcaTrainer
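Parameters
- catalog
- AnomalyDetectionCatalog.AnomalyDetectionTrainers
The anomaly detection catalog trainer object.
- featureColumnName
- String
The name of the feature column. Defaults to "Features".
- exampleWeightColumnName
- String
The name of the example weight column (optional).
- rank
- Int32
The number of components in the PCA.
- oversampling
- Int32
The oversampling parameter for randomized PCA training.
- ensureZeroMean
- Boolean
If enabled, the data is centered to have zero mean.
- seed
- Nullable<Int32>
The seed for random number generation.
Returns
- RandomizedPcaTrainer
Examples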
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

namespace Samples.Dynamic.Trainers.AnomalyDetection
{
    public static class RandomizedPcaSample
    {
        public static void Example()
        {
            // Create a new context for ML.NET operations. It can be used for
            // exception tracking and logging, as a catalog of available operations
            // and as the source of randomness. Setting the seed to a fixed number
            // in this example to make outputs deterministic.
            var mlContext = new MLContext(seed: 0);

            // Training data.
            var samples = new List<DataPoint>()
            {
                new DataPoint(){ Features = new float[3] {0, 2, 1} },
                new DataPoint(){ Features = new float[3] {0, 2, 1} },
                new DataPoint(){ Features = new float[3] {0, 2, 1} },
                new DataPoint(){ Features = new float[3] {0, 1, 2} },
                new DataPoint(){ Features = new float[3] {0, 2, 1} },
                new DataPoint(){ Features = new float[3] {2, 0, 0} }
            };

            // Convert the List<DataPoint> to IDataView, a consumable format to
            // ML.NET functions.
            var data = mlContext.Data.LoadFromEnumerable(samples);

            // Create an anomaly detector. Its underlying algorithm is randomized
            // PCA.
            var pipeline = mlContext.AnomalyDetection.Trainers.RandomizedPca(
                featureColumnName: nameof(DataPoint.Features), rank: 1,
                ensureZeroMean: false);

            // Train the anomaly detector.
            var model = pipeline.Fit(data);

            // Apply the trained model on the training data.
            var transformed = model.Transform(data);

            // Read ML.NET predictions into IEnumerable<Result>.
            var results = mlContext.Data.CreateEnumerable<Result>(transformed,
                reuseRowObject: false).ToList();

            // Let's go through all predictions.
            for (int i = 0; i < samples.Count; ++i)
            {
                // The i-th example's prediction result.
                var result = results[i];

                // The i-th example's feature vector in text format.
                var featuresInText = string.Join(',', samples[i].Features);

                if (result.PredictedLabel)
                    // The i-th sample is predicted as an outlier.
                    Console.WriteLine("The {0}-th example with features [{1}] is " +
                        "an outlier with a score of being outlier {2}", i,
                        featuresInText, result.Score);
                else
                    // The i-th sample is predicted as an inlier.
                    Console.WriteLine("The {0}-th example with features [{1}] is " +
                        "an inlier with a score of being outlier {2}", i,
                        featuresInText, result.Score);
            }

            // Lines printed out should be
            // The 0-th example with features [0,2,1] is an inlier with a score of being outlier 0.1101028
            // The 1-th example with features [0,2,1] is an inlier with a score of being outlier 0.1101028
            // The 2-th example with features [0,2,1] is an inlier with a score of being outlier 0.1101028
            // The 3-th example with features [0,1,2] is an outlier with a score of being outlier 0.5082728
            // The 4-th example with features [0,2,1] is an inlier with a score of being outlier 0.1101028
            // The 5-th example with features [2,0,0] is an outlier with a score of being outlier 1
        }

        // Example with 3 feature values. A training data set is a collection of
        // such examples.
        private class DataPoint
        {
            [VectorType(3)]
            public float[] Features { get; set; }
        }

        // Class used to capture prediction of DataPoint.
        private class Result
        {
            // Outlier gets true while inlier has false.
            public bool PredictedLabel { get; set; }

            // Inlier gets smaller score. Score is between 0 and 1.
            public float Score { get; set; }
        }
    }
}
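Remarks
By default, the threshold used to determine the label of a data point from its predicted score is 0.5. Scores range from 0 to 1. A data point with a predicted score higher than 0.5 is considered an outlier. Use ChangeModelThreshold<TModel>(AnomalyPredictionTransformer<TModel>, Single) to change this threshold.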
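To make the labeling rule concrete, the following sketch applies it to the three distinct scores listed in the expected output of the example above. It does not call ML.NET: the class ThresholdSketch and the helper IsOutlier are hypothetical names, and the strict "greater than" comparison is an assumption based on the wording of these remarks.

```csharp
using System;
using System.Linq;

public static class ThresholdSketch
{
    // Hypothetical helper (not part of ML.NET): a score is labeled an
    // outlier when it is strictly greater than the threshold.
    public static bool IsOutlier(float score, float threshold = 0.5f)
        => score > threshold;

    public static void Main()
    {
        // Distinct scores from the expected output of the example above.
        float[] scores = { 0.1101028f, 0.5082728f, 1f };

        // Label the same scores under the default and a higher threshold.
        foreach (var threshold in new[] { 0.5f, 0.6f })
        {
            var labels = scores.Select(s => IsOutlier(s, threshold));
            Console.WriteLine("threshold {0}: {1}",
                threshold, string.Join(", ", labels));
        }
    }
}
```

With the default threshold of 0.5, the scores 0.5082728 and 1 are labeled as outliers. Raising the threshold to 0.6 leaves only the score of 1 labeled as an outlier.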