DML_MATRIX_MULTIPLY_INTEGER_TO_FLOAT_OPERATOR_DESC structure (directml.h)

02/10/2025

Performs a matrix multiplication on integer input tensor data and produces floating-point output.

This operator requires the matrix multiply input tensors to be 4D, formatted as { BatchCount, ChannelCount, Height, Width }. It performs BatchCount * ChannelCount independent matrix multiplications.

For example, if ATensor has Sizes of { BatchCount, ChannelCount, M, K }, and BTensor has Sizes of { BatchCount, ChannelCount, K, N }, and OutputTensor has Sizes of { BatchCount, ChannelCount, M, N }, then the matrix multiply operator will perform BatchCount * ChannelCount independent matrix multiplications of dimensions {M,K} x {K,N} = {M,N}.
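The batching described above can be illustrated on the CPU. The following is a minimal sketch, not DirectML code: the helper name BatchedIntegerMatMul is hypothetical, the tensors are assumed to be packed in row-major order, and the 32-bit integer accumulator is an illustrative choice rather than a statement about how the operator is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of DirectML. Multiplies packed row-major
// tensors A { batch, channel, M, K } and B { batch, channel, K, N } and
// returns { batch, channel, M, N } as 32-bit integer accumulators.
std::vector<int32_t> BatchedIntegerMatMul(
    const std::vector<int32_t>& a,
    const std::vector<int32_t>& b,
    std::size_t batch, std::size_t channel,
    std::size_t m, std::size_t k, std::size_t n)
{
    assert(a.size() == batch * channel * m * k);
    assert(b.size() == batch * channel * k * n);

    std::vector<int32_t> out(batch * channel * m * n, 0);

    // One independent {M,K} x {K,N} product per (batch, channel) pair.
    for (std::size_t bc = 0; bc < batch * channel; ++bc)
    {
        const int32_t* aMat = a.data() + bc * m * k;
        const int32_t* bMat = b.data() + bc * k * n;
        int32_t* outMat = out.data() + bc * m * n;

        for (std::size_t row = 0; row < m; ++row)
            for (std::size_t col = 0; col < n; ++col)
                for (std::size_t inner = 0; inner < k; ++inner)
                    outMat[row * n + col] += aMat[row * k + inner] * bMat[inner * n + col];
    }
    return out;
}
```

With BatchCount = 1 and ChannelCount = 2, the loop runs two separate products, and no value from one (batch, channel) slice contributes to another.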

Important

This API is available as part of the DirectML standalone redistributable package (see Microsoft.AI.DirectML version 1.13 and later). Also see DirectML version history.

Syntax

struct DML_MATRIX_MULTIPLY_INTEGER_TO_FLOAT_OPERATOR_DESC
{
    const DML_TENSOR_DESC* ATensor;
    const DML_TENSOR_DESC* AScaleTensor;
    _Maybenull_ const DML_TENSOR_DESC* AZeroPointTensor;
    const DML_TENSOR_DESC* BTensor;
    const DML_TENSOR_DESC* BScaleTensor;
    _Maybenull_ const DML_TENSOR_DESC* BZeroPointTensor;
    _Maybenull_ const DML_TENSOR_DESC* BiasTensor;
    const DML_TENSOR_DESC* OutputTensor;
};

Members

ATensor

Type: const DML_TENSOR_DESC*

A tensor containing the A data. This tensor's dimensions should be { BatchCount, ChannelCount, M, K }.

AScaleTensor

Type: const DML_TENSOR_DESC*

A tensor containing the ATensor scale data. The expected dimensions of AScaleTensor are { 1, 1, 1, 1 } if per-tensor quantization is required, or { 1, 1, M, 1 } if per-row quantization is required. These scale values are used for dequantizing the ATensor values.

AZeroPointTensor

Type: _Maybenull_ const DML_TENSOR_DESC*

An optional tensor containing the ATensor zero point data. The expected dimensions of AZeroPointTensor are { 1, 1, 1, 1 } if per-tensor quantization is required, or { 1, 1, M, 1 } if per-row quantization is required. These zero point values are used for dequantizing the ATensor values.
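To make the per-tensor and per-row cases concrete, here is a small CPU sketch of dequantizing one { M, K } slice of ATensor. It is not a DirectML API. It assumes the common linear formula real = (quantized - zeroPoint) * scale and assumes that an omitted zero point behaves like zero; this page states neither explicitly. The helper name DequantizeRows and the use of UINT8 data are illustrative choices.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not a DirectML API. Dequantizes one { M, K } matrix.
// Assumed formula: real = (quantized - zeroPoint) * scale.
std::vector<float> DequantizeRows(
    const std::vector<uint8_t>& a, std::size_t m, std::size_t k,
    const std::vector<float>& scale,        // 1 element (per-tensor) or M (per-row)
    const std::vector<uint8_t>& zeroPoint)  // empty (omitted), 1, or M elements
{
    assert(a.size() == m * k);
    assert(scale.size() == 1 || scale.size() == m);
    assert(zeroPoint.empty() || zeroPoint.size() == 1 || zeroPoint.size() == m);

    std::vector<float> out(m * k);
    for (std::size_t row = 0; row < m; ++row)
    {
        const float rowScale = scale.size() == 1 ? scale[0] : scale[row];

        // Assumption: an omitted zero point is treated as zero.
        int32_t rowZero = 0;
        if (!zeroPoint.empty())
            rowZero = zeroPoint.size() == 1 ? zeroPoint[0] : zeroPoint[row];

        for (std::size_t col = 0; col < k; ++col)
        {
            const int32_t centered = static_cast<int32_t>(a[row * k + col]) - rowZero;
            out[row * k + col] = static_cast<float>(centered) * rowScale;
        }
    }
    return out;
}
```

Passing a single scale corresponds to the { 1, 1, 1, 1 } shape, while passing M scales corresponds to { 1, 1, M, 1 }.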

BTensor

Type: const DML_TENSOR_DESC*

A tensor containing the B data. This tensor's dimensions should be { BatchCount, ChannelCount, K, N }.

BScaleTensor

Type: const DML_TENSOR_DESC*

A tensor containing the BTensor scale data. The expected dimensions of BScaleTensor are { 1, 1, 1, 1 } if per-tensor quantization is required, or { 1, 1, 1, N } if per-column quantization is required. These scale values are used for dequantizing the BTensor values.

BZeroPointTensor

Type: _Maybenull_ const DML_TENSOR_DESC*

An optional tensor containing the BTensor zero point data. The expected dimensions of BZeroPointTensor are { 1, 1, 1, 1 } if per-tensor quantization is required, or { 1, 1, 1, N } if per-column quantization is required. These zero point values are used for dequantizing the BTensor values.

BiasTensor

Type: _Maybenull_ const DML_TENSOR_DESC*

An optional tensor containing the bias data. If provided, this tensor's Sizes should match the output size { BatchCount, ChannelCount, M, N }.

OutputTensor

Type: const DML_TENSOR_DESC*

The tensor to write the results to. This tensor's dimensions are { BatchCount, ChannelCount, M, N }.
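Putting the members together, the computation for a single { M, K } x { K, N } product might be sketched on the CPU as follows. This is an illustrative reference, not DirectML's implementation. It assumes linear dequantization, (value - zeroPoint) * scale, applied to both inputs, with the bias added to the floating-point result, and it assumes omitted zero points behave like zero. The names ReferenceMatMulIntegerToFloat and Broadcast are hypothetical, and INT8 inputs are used only as an example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns the single per-tensor element, or element i
// of a per-row / per-column vector.
template <typename T>
T Broadcast(const std::vector<T>& values, std::size_t i)
{
    return values.size() == 1 ? values[0] : values[i];
}

// Hypothetical CPU reference for one { M, K } x { K, N } product.
// Not a DirectML API. Empty zero-point or bias vectors mean "omitted".
std::vector<float> ReferenceMatMulIntegerToFloat(
    const std::vector<int8_t>& a, const std::vector<float>& aScale,
    const std::vector<int8_t>& aZeroPoint,
    const std::vector<int8_t>& b, const std::vector<float>& bScale,
    const std::vector<int8_t>& bZeroPoint,
    const std::vector<float>& bias,
    std::size_t m, std::size_t k, std::size_t n)
{
    assert(a.size() == m * k && b.size() == k * n);
    assert(aScale.size() == 1 || aScale.size() == m);
    assert(bScale.size() == 1 || bScale.size() == n);
    assert(bias.empty() || bias.size() == m * n);

    std::vector<float> out(m * n);
    for (std::size_t row = 0; row < m; ++row)
    {
        for (std::size_t col = 0; col < n; ++col)
        {
            // Assumption: an omitted zero point behaves like zero.
            const int32_t aZero = aZeroPoint.empty() ? 0 : Broadcast(aZeroPoint, row);
            const int32_t bZero = bZeroPoint.empty() ? 0 : Broadcast(bZeroPoint, col);

            // Accumulate the zero-point-corrected integer products.
            int32_t acc = 0;
            for (std::size_t inner = 0; inner < k; ++inner)
                acc += (a[row * k + inner] - aZero) * (b[inner * n + col] - bZero);

            // Apply the A (per-row) and B (per-column) scales, then the bias.
            float value = static_cast<float>(acc) * Broadcast(aScale, row) * Broadcast(bScale, col);
            if (!bias.empty())
                value += bias[row * n + col];
            out[row * n + col] = value;
        }
    }
    return out;
}
```

Because the A scale is constant along a row and the B scale is constant along a column, scaling the integer accumulator once gives the same result in exact arithmetic as dequantizing every element before multiplying, which is why the sketch defers the scaling.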

Availability

This operator was introduced in DML_FEATURE_LEVEL_6_2.

Tensor constraints

AScaleTensor, AZeroPointTensor, BScaleTensor, and BZeroPointTensor must have the same DimensionCount.
ATensor, BiasTensor, BTensor, and OutputTensor must have the same DimensionCount.
BiasTensor and OutputTensor must have the same Sizes.
ATensor, AZeroPointTensor, BTensor, and BZeroPointTensor must have the same DataType.
AScaleTensor, BiasTensor, BScaleTensor, and OutputTensor must have the same DataType.
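The five constraints above can be checked mechanically before creating the operator. The sketch below uses simplified stand-in types (TensorInfo, OperatorInfo, and DataType are hypothetical, not DirectML types) that keep only the data type and sizes of each tensor. It assumes that an omitted optional tensor is simply excluded from each comparison.

```cpp
#include <cstdint>
#include <initializer_list>
#include <vector>

// Hypothetical stand-ins for DML_TENSOR_DATA_TYPE and DML_TENSOR_DESC,
// reduced to the two properties the constraints mention.
enum class DataType { Float32, Float16, Int32, Int16, Int8, UInt32, UInt16, UInt8 };

struct TensorInfo
{
    DataType dataType;
    std::vector<uint32_t> sizes; // sizes.size() plays the role of DimensionCount
};

// Mirrors the operator's members; optional tensors are null when omitted.
struct OperatorInfo
{
    const TensorInfo* a;
    const TensorInfo* aScale;
    const TensorInfo* aZeroPoint; // optional
    const TensorInfo* b;
    const TensorInfo* bScale;
    const TensorInfo* bZeroPoint; // optional
    const TensorInfo* bias;       // optional
    const TensorInfo* output;
};

// True when every non-null tensor in the group agrees on the projected property.
template <typename Projection>
bool AllAgree(std::initializer_list<const TensorInfo*> group, Projection project)
{
    const TensorInfo* first = nullptr;
    for (const TensorInfo* tensor : group)
    {
        if (tensor == nullptr) continue; // assumption: omitted tensors are skipped
        if (first == nullptr) { first = tensor; continue; }
        if (!(project(*tensor) == project(*first))) return false;
    }
    return true;
}

bool SatisfiesTensorConstraints(const OperatorInfo& op)
{
    auto dimensionCount = [](const TensorInfo& t) { return t.sizes.size(); };
    auto dataType = [](const TensorInfo& t) { return t.dataType; };
    auto sizes = [](const TensorInfo& t) { return t.sizes; };

    return AllAgree({op.aScale, op.aZeroPoint, op.bScale, op.bZeroPoint}, dimensionCount)
        && AllAgree({op.a, op.bias, op.b, op.output}, dimensionCount)
        && AllAgree({op.bias, op.output}, sizes)
        && AllAgree({op.a, op.aZeroPoint, op.b, op.bZeroPoint}, dataType)
        && AllAgree({op.aScale, op.bias, op.bScale, op.output}, dataType);
}
```

A check like this only mirrors the list above. It does not replace the validation that DirectML itself performs.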

Tensor support

Tensor	Kind	Dimensions	Supported dimension counts	Supported data types
ATensor	Input	{ [BatchCount], [ChannelCount], M, K }	2 to 4	INT32, INT16, INT8, UINT32, UINT16, UINT8
AScaleTensor	Input	{ AScaleDimensions[] }	1 to 4	FLOAT32, FLOAT16
AZeroPointTensor	Optional input	{ [1], [1], AZeroPointCount, [1] }	1 to 4	INT32, INT16, INT8, UINT32, UINT16, UINT8
BTensor	Input	{ [BatchCount], [ChannelCount], K, N }	2 to 4	INT32, INT16, INT8, UINT32, UINT16, UINT8
BScaleTensor	Input	{ BScaleDimensions[] }	1 to 4	FLOAT32, FLOAT16
BZeroPointTensor	Optional input	{ [1], [1], [1], BZeroPointCount }	1 to 4	INT32, INT16, INT8, UINT32, UINT16, UINT8
BiasTensor	Optional input	{ [BatchCount], [ChannelCount], M, N }	2 to 4	FLOAT32, FLOAT16
OutputTensor	Output	{ [BatchCount], [ChannelCount], M, N }	2 to 4	FLOAT32, FLOAT16
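The Dimensions column implies how the output shape follows from the input shapes. As a rough sketch, the hypothetical helper InferOutputSizes below derives the OutputTensor sizes from the ATensor and BTensor sizes for dimension counts 2 to 4. It assumes the leading BatchCount and ChannelCount dimensions must match exactly between A and B, as the shared names in the table suggest. The helper is not part of DirectML.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not a DirectML API. Given A { ..., M, K } and
// B { ..., K, N } with 2 to 4 dimensions, writes { ..., M, N } to outputSizes.
// Returns false if the shapes are inconsistent with the table.
bool InferOutputSizes(
    const std::vector<uint32_t>& aSizes,
    const std::vector<uint32_t>& bSizes,
    std::vector<uint32_t>& outputSizes)
{
    const std::size_t rank = aSizes.size();

    // ATensor and BTensor support 2 to 4 dimensions and must agree on the count.
    if (rank < 2 || rank > 4 || bSizes.size() != rank)
        return false;

    // Assumption: leading BatchCount / ChannelCount dimensions match exactly.
    for (std::size_t i = 0; i + 2 < rank; ++i)
        if (aSizes[i] != bSizes[i])
            return false;

    // The inner dimension K must agree: A is { M, K }, B is { K, N }.
    if (aSizes[rank - 1] != bSizes[rank - 2])
        return false;

    outputSizes = aSizes;                     // keeps the leading dims and M
    outputSizes[rank - 1] = bSizes[rank - 1]; // replaces K with N
    return true;
}
```

For example, A sizes { 2, 3, 4, 5 } and B sizes { 2, 3, 5, 6 } give output sizes { 2, 3, 4, 6 }, which is also the shape an optional BiasTensor would need.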

Requirements

Header	directml.h
