DML_MATRIX_MULTIPLY_INTEGER_TO_FLOAT_OPERATOR_DESC structure (directml.h)
Performs a matrix multiplication function on integer input tensor data, and produces floating point output.
This operator requires the matrix multiply input tensors to be 4D, which are formatted as { BatchCount, ChannelCount, Height, Width }
. The matrix multiply operator will perform BatchCount * ChannelCount number of independent matrix multiplications.
For example, if ATensor has Sizes of { BatchCount, ChannelCount, M, K }
, and BTensor has Sizes of { BatchCount, ChannelCount, K, N }
, and OutputTensor has Sizes of { BatchCount, ChannelCount, M, N }
, then the matrix multiply operator will perform BatchCount * ChannelCount independent matrix multiplications of dimensions {M,K} x {K,N} = {M,N}.
Important
This API is available as part of the DirectML standalone redistributable package (see Microsoft.AI.DirectML version 1.13 and later. Also see DirectML version history.
Syntax
struct DML_MATRIX_MULTIPLY_INTEGER_TO_FLOAT_OPERATOR_DESC
{
const DML_TENSOR_DESC* ATensor;
const DML_TENSOR_DESC* AScaleTensor;
_Maybenull_ const DML_TENSOR_DESC* AZeroPointTensor;
const DML_TENSOR_DESC* BTensor;
const DML_TENSOR_DESC* BScaleTensor;
_Maybenull_ const DML_TENSOR_DESC* BZeroPointTensor;
_Maybenull_ const DML_TENSOR_DESC* BiasTensor;
const DML_TENSOR_DESC* OutputTensor;
};
Members
ATensor
Type: const DML_TENSOR_DESC*
A tensor containing the A data. This tensor's dimensions should be { BatchCount, ChannelCount, M, K }
.
AScaleTensor
Type: const DML_TENSOR_DESC*
A tensor containing the ATensor scale data. The expected dimensions of AScaleTensor are { 1, 1, 1, 1 }
if per-tensor quantization is required, or { 1, 1, M, 1 }
if per-row quantization is required. These scale values are used for dequantizing the ATensor values.
AZeroPointTensor
Type: _Maybenull_ const DML_TENSOR_DESC*
An optional tensor containing the ATensor zero point data. The expected dimensions of AZeroPointTensor are { 1, 1, 1, 1 }
if per-tensor quantization is required, or { 1, 1, M, 1 }
if per-row quantization is required. These zero point values are used for dequantizing the ATensor values.
BTensor
Type: const DML_TENSOR_DESC*
A tensor containing the B data. This tensor's dimensions should be { BatchCount, ChannelCount, K, N }
.
BScaleTensor
Type: const DML_TENSOR_DESC*
A tensor containing the BTensor scale data. The expected dimensions of BScaleTensor are { 1, 1, 1, 1 }
if per-tensor quantization is required, or { 1, 1, 1, N }
if per-column quantization is required. These scale values are used for dequantizing the BTensor values.
BZeroPointTensor
Type: _Maybenull_ const DML_TENSOR_DESC*
An optional tensor containing the BTensor zero point data. The expected dimensions of BZeroPointTensor are { 1, 1, 1, 1 }
if per-tensor quantization is required, or { 1, 1, 1, N }
if per-column quantization is required. These zero point values are used for dequantizing the BTensor values.
OutputScaleTensor
Type: const DML_TENSOR_DESC*
A tensor containing the OutputTensor scale data. The expected dimensions of OutputScaleTensor are { 1, 1, 1, 1 }
if per-tensor quantization is required, or { 1, 1, M, 1 }
if per-row quantization is required. This scale value is used for dequantizing the OutputTensor values.
OutputZeroPointTensor
Type: _Maybenull_ const DML_TENSOR_DESC*
An optional tensor containing the OutputTensor zero point data. The expected dimensions of OutputZeroPointTensor are { 1, 1, 1, 1 }
if per-tensor quantization is required, or { 1, 1, M, 1 }
if per-row quantization is required. This zero point value is used for dequantizing the OutputTensor values.
BiasTensor
Type: _Maybenull_ const DML_TENSOR_DESC*
An optional tensor containing the bias data. If provided, this tensor's Sizes should match output size { BatchCount, ChannelCount, M, N }
.
OutputTensor
Type: const DML_TENSOR_DESC*
A tensor with which to write the results to. This tensor's dimensions are { BatchCount, ChannelCount, M, N }
.
Availability
This operator was introduced in DML_FEATURE_LEVEL_6_2.
Tensor constraints
- AScaleTensor, AZeroPointTensor, BScaleTensor, and BZeroPointTensor must have the same DimensionCount.
- ATensor, BiasTensor, BTensor, and OutputTensor must have the same DimensionCount.
- BiasTensor and OutputTensor must have the same Sizes.
- ATensor, AZeroPointTensor, BTensor, and BZeroPointTensor must have the same DataType.
- AScaleTensor, BiasTensor, BScaleTensor, and OutputTensor must have the same DataType.
Tensor support
Tensor | Kind | Dimensions | Supported dimension counts | Supported data types |
---|---|---|---|---|
ATensor | Input | { [BatchCount], [ChannelCount], M, K } | 2 to 4 | INT32, INT16, INT8, UINT32, UINT16, UINT8 |
AScaleTensor | Input | { AScaleDimensions[] } | 1 to 4 | FLOAT32, FLOAT16 |
AZeroPointTensor | Optional input | { [1], [1], AZeroPointCount, [1] } | 1 to 4 | INT32, INT16, INT8, UINT32, UINT16, UINT8 |
BTensor | Input | { [BatchCount], [ChannelCount], K, N } | 2 to 4 | INT32, INT16, INT8, UINT32, UINT16, UINT8 |
BScaleTensor | Input | { BScaleDimensions[] } | 1 to 4 | FLOAT32, FLOAT16 |
BZeroPointTensor | Optional input | { [1], [1], [1], BZeroPointCount } | 1 to 4 | INT32, INT16, INT8, UINT32, UINT16, UINT8 |
BiasTensor | Optional input | { [BatchCount], [ChannelCount], M, N } | 2 to 4 | FLOAT32, FLOAT16 |
OutputTensor | Output | { [BatchCount], [ChannelCount], M, N } | 2 to 4 | FLOAT32, FLOAT16 |
Requirements
Header | directml.h |