DML_DYNAMIC_QUANTIZE_LINEAR_OPERATOR_DESC structure (directml.h)

Calculates the quantization scale and zero point values necessary to quantize the InputTensor, then applies that quantization, writing the result to OutputTensor.

This operator uses the following equations to compute the quantization parameters and the quantized output.

InputMax = Max(InputTensor)
InputMin = Min(InputTensor)

AValue = (A - AZeroPoint) * AScale
BValue = (B - BZeroPoint) * BScale

// For uint8 output, Min = 0, Max = 255
// For int8 output, Min = -128, Max = 127
OutputScale = (InputMax - InputMin) / (Max - Min)

OutputZeroPoint = Min - InputMin / OutputScale

// InputValue is an element of InputTensor; the equation is applied to every element
OutputTensor = clamp(round(InputValue / OutputScale) + OutputZeroPoint, Min, Max)
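The InputMin/InputMax, OutputScale, OutputZeroPoint, and OutputTensor equations above can be sketched on the CPU as follows. This is a minimal illustration for uint8 output, not the DirectML implementation: `DynamicQuantizeResult` and `DynamicQuantizeLinearU8` are hypothetical names, the rounding mode (round to nearest via `std::nearbyint`) and the rounding and clamping of the zero point are assumptions, and the input is assumed to be non-empty with InputMax greater than InputMin.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical result bundle for illustration; not a DirectML type.
struct DynamicQuantizeResult {
    std::vector<uint8_t> output;  // plays the role of OutputTensor
    float scale;                  // plays the role of OutputScaleTensor (1 element)
    uint8_t zeroPoint;            // plays the role of OutputZeroPointTensor (1 element)
};

// CPU sketch of the equations for uint8 output (Min = 0, Max = 255).
// Assumes input is non-empty and InputMax > InputMin, so the scale is non-zero.
DynamicQuantizeResult DynamicQuantizeLinearU8(const std::vector<float>& input) {
    const float qMin = 0.0f;
    const float qMax = 255.0f;

    // InputMin = Min(InputTensor), InputMax = Max(InputTensor)
    const auto minmax = std::minmax_element(input.begin(), input.end());
    const float inputMin = *minmax.first;
    const float inputMax = *minmax.second;

    // OutputScale = (InputMax - InputMin) / (Max - Min)
    const float scale = (inputMax - inputMin) / (qMax - qMin);

    // OutputZeroPoint = Min - InputMin / OutputScale
    // Assumption: rounded to nearest and clamped before storing as an integer.
    const float zeroPoint =
        std::min(std::max(std::nearbyint(qMin - inputMin / scale), qMin), qMax);

    DynamicQuantizeResult result;
    result.scale = scale;
    result.zeroPoint = static_cast<uint8_t>(zeroPoint);
    result.output.reserve(input.size());

    // OutputTensor = clamp(round(InputValue / OutputScale) + OutputZeroPoint, Min, Max)
    for (float x : input) {
        const float q =
            std::min(std::max(std::nearbyint(x / scale) + zeroPoint, qMin), qMax);
        result.output.push_back(static_cast<uint8_t>(q));
    }
    return result;
}
```

For example, calling `DynamicQuantizeLinearU8({-64.0f, 0.0f, 63.5f})` gives InputMin = -64 and InputMax = 63.5, so OutputScale = 127.5 / 255 = 0.5 and OutputZeroPoint = 0 - (-64 / 0.5) = 128. The three elements quantize to 0, 128, and 255.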

Syntax

struct DML_DYNAMIC_QUANTIZE_LINEAR_OPERATOR_DESC {
  const DML_TENSOR_DESC *InputTensor;
  const DML_TENSOR_DESC *OutputTensor;
  const DML_TENSOR_DESC *OutputScaleTensor;
  const DML_TENSOR_DESC *OutputZeroPointTensor;
};

Members

InputTensor

Type: const DML_TENSOR_DESC*

The tensor containing the inputs.

OutputTensor

Type: const DML_TENSOR_DESC*

The output tensor to write the results to.

OutputScaleTensor

Type: const DML_TENSOR_DESC*

The output tensor to write the scale factor for OutputTensor to. OutputScaleTensor is expected to contain exactly 1 element.

OutputZeroPointTensor

Type: const DML_TENSOR_DESC*

The output tensor to write the zero point for OutputTensor to. OutputZeroPointTensor is expected to contain exactly 1 element.

Remarks

Availability

This operator was introduced in DML_FEATURE_LEVEL_4_0.

Tensor constraints

  • InputTensor, OutputScaleTensor, OutputTensor, and OutputZeroPointTensor must have the same DimensionCount.
  • OutputTensor and OutputZeroPointTensor must have the same DataType.
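The two constraints above can be checked before creating the operator. The sketch below uses hypothetical stand-in types (`ElemType`, `TensorInfo`) rather than the real DML_TENSOR_DESC structures, so it only illustrates the logic of the check. It does not validate the supported data types or dimension counts listed under Tensor support.

```cpp
#include <cstdint>

// Hypothetical stand-ins for illustration; these are not DirectML types.
enum class ElemType { Float32, Float16, Int8, UInt8 };

struct TensorInfo {
    uint32_t dimensionCount;
    ElemType dataType;
};

// Returns true when the four tensors satisfy the two listed constraints:
// all share one DimensionCount, and OutputTensor and OutputZeroPointTensor
// share one DataType.
bool SatisfiesTensorConstraints(const TensorInfo& input,
                                const TensorInfo& output,
                                const TensorInfo& outputScale,
                                const TensorInfo& outputZeroPoint) {
    const bool sameDimensionCount =
        input.dimensionCount == output.dimensionCount &&
        input.dimensionCount == outputScale.dimensionCount &&
        input.dimensionCount == outputZeroPoint.dimensionCount;
    const bool sameQuantizedType = output.dataType == outputZeroPoint.dataType;
    return sameDimensionCount && sameQuantizedType;
}
```

Because all four tensors must share one DimensionCount, a single-element scale or zero point tensor for a 4-dimensional input would be described with 4 dimensions as well.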

Tensor support

| Tensor | Kind | Supported dimension counts | Supported data types |
| --- | --- | --- | --- |
| InputTensor | Input | 1 to 8 | FLOAT32, FLOAT16 |
| OutputTensor | Output | 1 to 8 | INT8, UINT8 |
| OutputScaleTensor | Output | 1 to 8 | FLOAT32 |
| OutputZeroPointTensor | Output | 1 to 8 | INT8, UINT8 |

Requirements

| Requirement | Value |
| --- | --- |
| Minimum supported client | Windows Build 22000 |
| Minimum supported server | Windows Build 22000 |
| Header | directml.h |
