

NormalizationCatalog.NormalizeBinning Method

Definition

Overloads

NormalizeBinning(TransformsCatalog, InputOutputColumnPair[], Int64, Boolean, Int32)

Create a NormalizingEstimator, which normalizes by assigning the data into bins with equal density.

NormalizeBinning(TransformsCatalog, String, String, Int64, Boolean, Int32)

Create a NormalizingEstimator, which normalizes by assigning the data into bins with equal density.

NormalizeBinning(TransformsCatalog, InputOutputColumnPair[], Int64, Boolean, Int32)

Create a NormalizingEstimator, which normalizes by assigning the data into bins with equal density.

public static Microsoft.ML.Transforms.NormalizingEstimator NormalizeBinning (this Microsoft.ML.TransformsCatalog catalog, Microsoft.ML.InputOutputColumnPair[] columns, long maximumExampleCount = 1000000000, bool fixZero = true, int maximumBinCount = 1024);
static member NormalizeBinning : Microsoft.ML.TransformsCatalog * Microsoft.ML.InputOutputColumnPair[] * int64 * bool * int -> Microsoft.ML.Transforms.NormalizingEstimator
<Extension()>
Public Function NormalizeBinning (catalog As TransformsCatalog, columns As InputOutputColumnPair(), Optional maximumExampleCount As Long = 1000000000, Optional fixZero As Boolean = true, Optional maximumBinCount As Integer = 1024) As NormalizingEstimator

Parameters

catalog
TransformsCatalog

The transform catalog.

columns
InputOutputColumnPair[]

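The pairs of input and output columns. The input columns must be of data type Single, Double, or a known-sized vector of those types. The data type of each output column is the same as that of its associated input column.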

maximumExampleCount
Int64

Maximum number of examples used to train the normalizer.

fixZero
Boolean

Whether to map zero to zero, which preserves sparsity.

maximumBinCount
Int32

Maximum number of bins (a power of 2 is recommended).

Returns

NormalizingEstimator

Examples

using System;
using System.Collections.Generic;
using System.Collections.Immutable;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;
using static Microsoft.ML.Transforms.NormalizingTransformer;

namespace Samples.Dynamic
{
    public class NormalizeBinningMulticolumn
    {
        public static void Example()
        {
            // Create a new ML context, for ML.NET operations. It can be used for
            // exception tracking and logging, as well as the source of randomness.
            var mlContext = new MLContext();
            var samples = new List<DataPoint>()
            {
                new DataPoint(){ Features = new float[4] { 8, 1, 3, 0},
                    Features2 = 1 },

                new DataPoint(){ Features = new float[4] { 6, 2, 2, 0},
                    Features2 = 4 },

                new DataPoint(){ Features = new float[4] { 4, 0, 1, 0},
                    Features2 = 1 },

                new DataPoint(){ Features = new float[4] { 2,-1,-1, 1},
                    Features2 = 2 }
            };
            // Convert training data to IDataView, the general data type used in
            // ML.NET.
            var data = mlContext.Data.LoadFromEnumerable(samples);
            // NormalizeBinning normalizes the data by constructing equidensity bins
            // and produces output based on the bin to which each value belongs.
            var normalize = mlContext.Transforms.NormalizeBinning(new[]{
                new InputOutputColumnPair("Features"),
                new InputOutputColumnPair("Features2"),
                },
                maximumBinCount: 4, fixZero: false);

            // Now we can transform the data and look at the output to confirm the
            // behavior of the estimator. This operation doesn't actually evaluate
            // data until we read the data below.
            var normalizeTransform = normalize.Fit(data);
            var transformedData = normalizeTransform.Transform(data);
            var column = transformedData.GetColumn<float[]>("Features").ToArray();
            var column2 = transformedData.GetColumn<float>("Features2").ToArray();

            for (int i = 0; i < column.Length; i++)
                Console.WriteLine(string.Join(", ", column[i].Select(x => x
                .ToString("f4"))) + "\t\t" + column2[i]);
            // Expected output:
            //
            //  Features                            Features2
            //  1.0000, 0.6667, 1.0000, 0.0000          0
            //  0.6667, 1.0000, 0.6667, 0.0000          1
            //  0.3333, 0.3333, 0.3333, 0.0000          0
            //  0.0000, 0.0000, 0.0000, 1.0000          0.5
        }

        private class DataPoint
        {
            [VectorType(4)]
            public float[] Features { get; set; }

            public float Features2 { get; set; }
        }
    }
}
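
The Features2 values in the expected output above can be reproduced by hand. The following is only a sketch under a stated assumption, not ML.NET's actual bin-finding algorithm: when a column has fewer distinct values than maximumBinCount, each distinct value is given its own bin, with upper bounds at the midpoints between neighboring values. The names BinningSketch, MidpointUpperBounds, BinIndex, and Normalize are hypothetical, and the handling of values that fall exactly on a bound is also an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class BinningSketch
{
    // Hypothetical helper (assumption): one bin per distinct value, each upper
    // bound halfway to the next distinct value, +infinity for the last bin.
    public static float[] MidpointUpperBounds(IEnumerable<float> values)
    {
        var distinct = values.Distinct().OrderBy(v => v).ToArray();
        if (distinct.Length == 0)
            return new float[0];

        var bounds = new float[distinct.Length];
        for (int i = 0; i < distinct.Length - 1; i++)
            bounds[i] = (distinct[i] + distinct[i + 1]) / 2;
        bounds[distinct.Length - 1] = float.PositiveInfinity;
        return bounds;
    }

    // Index of the first bin whose upper bound is not below x. Placing a value
    // that equals a bound into the lower bin is an assumption of this sketch.
    public static int BinIndex(float[] upperBounds, float x)
    {
        int index = 0;
        while (index < upperBounds.Length - 1 && x > upperBounds[index])
            index++;
        return index;
    }

    // Maps each value to Index(x) / (binCount - 1), so the lowest bin becomes 0
    // and the highest bin becomes 1 (the fixZero: false case).
    public static float[] Normalize(float[] values)
    {
        var bounds = MidpointUpperBounds(values);
        float density = Math.Max(1, bounds.Length - 1);
        return values.Select(v => BinIndex(bounds, v) / density).ToArray();
    }

    public static void Main()
    {
        // The Features2 column from the sample above.
        var features2 = new float[] { 1, 4, 1, 2 };
        var normalized = Normalize(features2)
            .Select(v => v.ToString(CultureInfo.InvariantCulture));
        Console.WriteLine(string.Join(", ", normalized));
        // Prints: 0, 1, 0, 0.5
    }
}
```

With three distinct values the sketch builds three bins, so the divisor is 2 and the values 1, 2, and 4 map to 0, 0.5, and 1, which agrees with the Features2 column in the expected output.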

Applies to

NormalizeBinning(TransformsCatalog, String, String, Int64, Boolean, Int32)

Create a NormalizingEstimator, which normalizes by assigning the data into bins with equal density.

public static Microsoft.ML.Transforms.NormalizingEstimator NormalizeBinning (this Microsoft.ML.TransformsCatalog catalog, string outputColumnName, string inputColumnName = default, long maximumExampleCount = 1000000000, bool fixZero = true, int maximumBinCount = 1024);
static member NormalizeBinning : Microsoft.ML.TransformsCatalog * string * string * int64 * bool * int -> Microsoft.ML.Transforms.NormalizingEstimator
<Extension()>
Public Function NormalizeBinning (catalog As TransformsCatalog, outputColumnName As String, Optional inputColumnName As String = Nothing, Optional maximumExampleCount As Long = 1000000000, Optional fixZero As Boolean = true, Optional maximumBinCount As Integer = 1024) As NormalizingEstimator

Parameters

catalog
TransformsCatalog

The transform catalog.

outputColumnName
String

Name of the column resulting from the transformation of inputColumnName. The data type of this column is the same as that of the input column.

inputColumnName
String

Name of the column to transform. If set to null, the value of outputColumnName is used as the source. The data type of this column must be Single, Double, or a known-sized vector of those types.

maximumExampleCount
Int64

Maximum number of examples used to train the normalizer.

fixZero
Boolean

Whether to map zero to zero, which preserves sparsity.

maximumBinCount
Int32

Maximum number of bins (a power of 2 is recommended).

Returns

NormalizingEstimator

Examples

using System;
using System.Collections.Generic;
using System.Collections.Immutable;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;
using static Microsoft.ML.Transforms.NormalizingTransformer;

namespace Samples.Dynamic
{
    public class NormalizeBinning
    {
        public static void Example()
        {
            // Create a new ML context, for ML.NET operations. It can be used for
            // exception tracking and logging, 
            // as well as the source of randomness.
            var mlContext = new MLContext();
            var samples = new List<DataPoint>()
            {
                new DataPoint(){ Features = new float[4] { 8, 1, 3, 0} },
                new DataPoint(){ Features = new float[4] { 6, 2, 2, 0} },
                new DataPoint(){ Features = new float[4] { 4, 0, 1, 0} },
                new DataPoint(){ Features = new float[4] { 2,-1,-1, 1} }
            };
            // Convert training data to IDataView, the general data type used in
            // ML.NET.
            var data = mlContext.Data.LoadFromEnumerable(samples);
            // NormalizeBinning normalizes the data by constructing equidensity bins
            // and produces output based on the bin to which each value belongs.
            var normalize = mlContext.Transforms.NormalizeBinning("Features",
                maximumBinCount: 4, fixZero: false);

            // With fixZero: true, NormalizeBinning still constructs equidensity
            // bins and produces output based on the bin to which each value
            // belongs, but it also ensures that zero values remain zero after
            // normalization. This helps preserve sparsity.
            var normalizeFixZero = mlContext.Transforms.NormalizeBinning("Features",
                maximumBinCount: 4, fixZero: true);

            // Now we can transform the data and look at the output to confirm the
            // behavior of the estimator. This operation doesn't actually evaluate
            // data until we read the data below.
            var normalizeTransform = normalize.Fit(data);
            var transformedData = normalizeTransform.Transform(data);
            var normalizeFixZeroTransform = normalizeFixZero.Fit(data);
            var fixZeroData = normalizeFixZeroTransform.Transform(data);
            var column = transformedData.GetColumn<float[]>("Features").ToArray();
            foreach (var row in column)
                Console.WriteLine(string.Join(", ", row.Select(x => x.ToString(
                    "f4"))));
            // Expected output:
            //  1.0000, 0.6667, 1.0000, 0.0000
            //  0.6667, 1.0000, 0.6667, 0.0000
            //  0.3333, 0.3333, 0.3333, 0.0000
            //  0.0000, 0.0000, 0.0000, 1.0000

            var columnFixZero = fixZeroData.GetColumn<float[]>("Features")
                .ToArray();

            foreach (var row in columnFixZero)
                Console.WriteLine(string.Join(", ", row.Select(x => x.ToString(
                    "f4"))));
            // Expected output:
            //  1.0000, 0.3333, 1.0000, 0.0000
            //  0.6667, 0.6667, 0.6667, 0.0000
            //  0.3333, 0.0000, 0.3333, 0.0000
            //  0.0000, -0.3333, 0.0000, 1.0000

            // Let's get the transformation parameters. Since we work with only one
            // column, we pass 0 to GetNormalizerModelParameters. With multiple
            // column transformations, we would pass the index of the
            // corresponding InputOutputColumnPair instead.
            var transformParams = normalizeTransform.GetNormalizerModelParameters(0)
                as BinNormalizerModelParameters<ImmutableArray<float>>;

            var density = transformParams.Density[0];
            var offset = (transformParams.Offset.Length == 0 ? 0 : transformParams
                .Offset[0]);

            Console.WriteLine($"The 0-index value in the resulting array would be " +
                $"produced by: y = (Index(x) / {density}) - {offset}");

            Console.WriteLine("Where Index(x) is the index of the bin to which " +
                "x belongs");

            Console.WriteLine("Bins upper bounds are: " + string.Join(" ",
                transformParams.UpperBounds[0]));
            // Expected output:
            //  The 0-index value in the resulting array would be produced by: y = (Index(x) / 3) - 0
            //  Where Index(x) is the index of the bin to which x belongs
            //  Bins upper bounds are: 3 5 7 ∞

            var fixZeroParams = (normalizeFixZeroTransform
                .GetNormalizerModelParameters(0) as BinNormalizerModelParameters<
                ImmutableArray<float>>);

            density = fixZeroParams.Density[1];
            offset = (fixZeroParams.Offset.Length == 0 ? 0 : fixZeroParams
                .Offset[1]);

            // The density, offset, and upper bounds read above belong to index 1
            // of the vector, so the message refers to the 1-index value.
            Console.WriteLine($"The 1-index value in the resulting array would be " +
                $"produced by: y = (Index(x) / {density}) - {offset}");

            Console.WriteLine("Where Index(x) is the index of the bin to which x " +
                "belongs");

            Console.WriteLine("Bins upper bounds are: " + string.Join(" ",
                fixZeroParams.UpperBounds[1]));
            // Expected output:
            //  The 1-index value in the resulting array would be produced by: y = (Index(x) / 3) - 0.3333333
            //  Where Index(x) is the index of the bin to which x belongs
            //  Bins upper bounds are: -0.5 0.5 1.5 ∞
        }

        private class DataPoint
        {
            [VectorType(4)]
            public float[] Features { get; set; }
        }
    }
}
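
The formula printed by the sample, y = (Index(x) / density) - offset, can be checked by hand against the expected output. The sketch below does not call ML.NET. It hard-codes the upper bounds, density, and offset values shown in the expected output above and applies the formula to the raw input values. The names BinFormulaSketch, BinIndex, and Apply are hypothetical, and the choice to place a value that equals a bound in the lower bin is an assumption that the sample data does not exercise.

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class BinFormulaSketch
{
    // Index of the first bin whose upper bound is not below x.
    public static int BinIndex(float[] upperBounds, float x)
    {
        int index = 0;
        while (index < upperBounds.Length - 1 && x > upperBounds[index])
            index++;
        return index;
    }

    // y = (Index(x) / density) - offset, the formula printed by the sample.
    public static float Apply(float[] upperBounds, float density, float offset,
        float x)
    {
        return BinIndex(upperBounds, x) / density - offset;
    }

    private static string Format(float[] values)
    {
        return string.Join(", ",
            values.Select(v => v.ToString("F4", CultureInfo.InvariantCulture)));
    }

    public static void Main()
    {
        float inf = float.PositiveInfinity;

        // Index 0 of Features with fixZero: false.
        // Bounds 3 5 7 ∞, density 3, offset 0 (values taken from the output above).
        var bounds0 = new[] { 3f, 5f, 7f, inf };
        var slot0 = new float[] { 8, 6, 4, 2 }
            .Select(x => Apply(bounds0, 3f, 0f, x)).ToArray();
        Console.WriteLine(Format(slot0));
        // Prints: 1.0000, 0.6667, 0.3333, 0.0000

        // Index 1 of Features with fixZero: true.
        // Bounds -0.5 0.5 1.5 ∞, density 3, offset 1/3 (shown as 0.3333333 above).
        var bounds1 = new[] { -0.5f, 0.5f, 1.5f, inf };
        var slot1 = new float[] { 1, 2, 0, -1 }
            .Select(x => Apply(bounds1, 3f, 1f / 3f, x)).ToArray();
        Console.WriteLine(Format(slot1));
        // Prints: 0.3333, 0.6667, 0.0000, -0.3333
    }
}
```

In the second case the input 0 falls into bin 1, so Index(0) / density is 1/3, which equals the offset. Subtracting the offset is what maps zero back to zero when fixZero is true, at the cost of allowing negative outputs such as -0.3333.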

Applies to