DML_ROI_ALIGN_GRAD_OPERATOR_DESC structure (directml.h)

Article
07/16/2024

Computes backpropagation gradients for ROI_ALIGN and ROI_ALIGN1.

Recall that DML_ROI_ALIGN1_OPERATOR_DESC crops and rescales subregions of an input tensor using either nearest-neighbor sampling or bilinear interpolation. Given an InputGradientTensor with the same sizes as the output of an equivalent DML_OPERATOR_ROI_ALIGN1, this operator produces an OutputGradientTensor with the same sizes as the input of DML_OPERATOR_ROI_ALIGN1.

As an example, consider a DML_OPERATOR_ROI_ALIGN1 that performs a nearest-neighbor scaling of 1.5x in the width and 0.5x in the height, for 4 non-overlapping crops of an input with dimensions [1, 1, 4, 4]:

ROITensor
[[0, 0, 2, 2],
 [2, 0, 4, 2],
 [0, 2, 2, 4],
 [2, 2, 4, 4]]

BatchIndicesTensor
[0, 0, 0, 0]

InputTensor
[[[[1,   2, |  3,  4],    RoiAlign1     [[[[ 1,  1,  2]]],
   [5,   6, |  7,  8],       -->         [[[ 3,  3,  4]]],
   ------------------                    [[[ 9,  9, 10]]],
   [9,  10, | 11, 12],                   [[[11, 11, 12]]]]
   [13, 14, | 15, 16]]]]

Notice that the 0th element of each region contributes to two elements in the output, the 1st element contributes to one element in the output, and the 2nd and 3rd elements contribute to no elements of the output.

The corresponding DML_OPERATOR_ROI_ALIGN_GRAD would perform the following:

InputGradientTensor                  OutputGradientTensor
[[[[ 1,  2,  3]]],    ROIAlignGrad   [[[[ 3,  3, |  9,  6],
 [[[ 4,  5,  6]]],         -->          [ 0,  0, |  0,  0],
 [[[ 7,  8,  9]]],                      ------------------
 [[[10, 11, 12]]]]                      [15,  9, | 21, 12],
                                        [ 0,  0, |  0,  0]]]]

In summary, DML_OPERATOR_ROI_ALIGN_GRAD behaves similarly to a DML_OPERATOR_RESAMPLE_GRAD performed on each batch of the InputGradientTensor when regions don't overlap.
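
The worked example can be reproduced on the CPU with a minimal sketch. It assumes a single batch, a single channel, integer ROI corners, and a simplified floor-based nearest-neighbor mapping chosen only to match the example; the real operator's sampling also depends on InputPixelOffset, OutputPixelOffset, and the other members described in this topic. `Roi` and `RoiAlignGradNearest` are hypothetical names, not part of DirectML.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one row of ROITensor: [x1, y1, x2, y2].
struct Roi { int x1, y1, x2, y2; };

// Hypothetical helper, not a DirectML API. Scatters each ROI's incoming
// gradient back onto the pixels that a simplified nearest-neighbor forward
// pass would have read. Assumes one batch and one channel.
std::vector<std::vector<float>> RoiAlignGradNearest(
    const std::vector<Roi>& rois,
    const std::vector<std::vector<std::vector<float>>>& inputGradient, // [roi][outY][outX]
    int inputHeight,
    int inputWidth)
{
    std::vector<std::vector<float>> outputGradient(
        inputHeight, std::vector<float>(inputWidth, 0.0f));

    for (std::size_t r = 0; r < rois.size(); ++r)
    {
        const Roi& roi = rois[r];
        const int outHeight = static_cast<int>(inputGradient[r].size());
        const int outWidth = static_cast<int>(inputGradient[r][0].size());

        for (int oy = 0; oy < outHeight; ++oy)
        {
            for (int ox = 0; ox < outWidth; ++ox)
            {
                // Source pixel for this output element (assumed mapping:
                // integer floor of the output index scaled into the ROI).
                const int iy = roi.y1 + (oy * (roi.y2 - roi.y1)) / outHeight;
                const int ix = roi.x1 + (ox * (roi.x2 - roi.x1)) / outWidth;

                // Several output elements may map to the same source pixel,
                // so gradients accumulate.
                outputGradient[iy][ix] += inputGradient[r][oy][ox];
            }
        }
    }
    return outputGradient;
}
```

Calling this helper with the four ROIs and the InputGradientTensor values from the example, and with an input size of 4 x 4, yields the rows [3, 3, 9, 6], [0, 0, 0, 0], [15, 9, 21, 12], and [0, 0, 0, 0] shown for OutputGradientTensor.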

For OutputROIGradientTensor, the math is a little different, and can be summarized by the following pseudocode (assuming that MinimumSamplesPerOutput == 1 and MaximumSamplesPerOutput == 1):

for each region of interest (ROI):
    for each inputGradientCoordinate:
        for each inputCoordinate that contributed to this inputGradient element:
            topYIndex = floor(inputCoordinate.y)
            bottomYIndex = ceil(inputCoordinate.y)
            leftXIndex = floor(inputCoordinate.x)
            rightXIndex = ceil(inputCoordinate.x)

            yLerp = inputCoordinate.y - topYIndex
            xLerp = inputCoordinate.x - leftXIndex

            topLeft = InputTensor[topYIndex][leftXIndex]
            topRight = InputTensor[topYIndex][rightXIndex]
            bottomLeft = InputTensor[bottomYIndex][leftXIndex]
            bottomRight = InputTensor[bottomYIndex][rightXIndex]

            inputGradientWeight = InputGradientTensor[inputGradientCoordinate.y][inputGradientCoordinate.x]
            imageGradY = (1 - xLerp) * (bottomLeft - topLeft) + xLerp * (bottomRight - topRight)
            imageGradX = (1 - yLerp) * (topRight - topLeft) + yLerp * (bottomRight - bottomLeft)

            imageGradY *= inputGradientWeight
            imageGradX *= inputGradientWeight

            OutputROIGradientTensor[roiIndex][0] += imageGradX * (inputWidth - inputGradientCoordinate.x)
            OutputROIGradientTensor[roiIndex][1] += imageGradY * (inputHeight - inputGradientCoordinate.y)
            OutputROIGradientTensor[roiIndex][2] += imageGradX * inputGradientCoordinate.x
            OutputROIGradientTensor[roiIndex][3] += imageGradY * inputGradientCoordinate.y

OutputGradientTensor or OutputROIGradientTensor can be omitted if only one is needed, but at least one must be supplied.

Syntax

struct DML_ROI_ALIGN_GRAD_OPERATOR_DESC {
  const DML_TENSOR_DESC  *InputTensor;
  const DML_TENSOR_DESC  *InputGradientTensor;
  const DML_TENSOR_DESC  *ROITensor;
  const DML_TENSOR_DESC  *BatchIndicesTensor;
  const DML_TENSOR_DESC  *OutputGradientTensor;
  const DML_TENSOR_DESC  *OutputROIGradientTensor;
  DML_REDUCE_FUNCTION    ReductionFunction;
  DML_INTERPOLATION_MODE InterpolationMode;
  FLOAT                  SpatialScaleX;
  FLOAT                  SpatialScaleY;
  FLOAT                  InputPixelOffset;
  FLOAT                  OutputPixelOffset;
  UINT                   MinimumSamplesPerOutput;
  UINT                   MaximumSamplesPerOutput;
  BOOL                   AlignRegionsToCorners;
};

Members

InputTensor

Type: _Maybenull_ const DML_TENSOR_DESC*

A tensor containing the input data from the forward pass, with dimensions { BatchCount, ChannelCount, InputHeight, InputWidth }. This tensor must be supplied when OutputROIGradientTensor is supplied, or when ReductionFunction == DML_REDUCE_FUNCTION_MAX. This is the same tensor that would be supplied to InputTensor for DML_OPERATOR_ROI_ALIGN or DML_OPERATOR_ROI_ALIGN1.

InputGradientTensor

Type: const DML_TENSOR_DESC*

ROITensor

Type: const DML_TENSOR_DESC*

A tensor containing the regions of interest (ROI) data, that is, a series of bounding boxes in floating-point coordinates that point into the X and Y dimensions of the input tensor. The allowed dimensions of ROITensor are { NumROIs, 4 }, { 1, NumROIs, 4 }, or { 1, 1, NumROIs, 4 }. For each ROI, the values are the coordinates of its top-left and bottom-right corners in the order [x1, y1, x2, y2]. Regions can be empty, meaning that all output pixels come from a single input coordinate, and regions can be inverted (for example, x2 less than x1), meaning that the output receives a mirrored/flipped version of the input. These coordinates are first scaled by SpatialScaleX and SpatialScaleY; if both are 1.0, then the region rectangles correspond directly to the input tensor coordinates. This is the same tensor that would be supplied to ROITensor for DML_OPERATOR_ROI_ALIGN or DML_OPERATOR_ROI_ALIGN1.
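
The ROI conventions above can be illustrated with a small sketch. It assumes that scaling means multiplying each coordinate by the matching spatial scale, and that an empty region is one whose two corners coincide; both are readings of the description rather than a statement of the operator's internals. `RoiBox`, `ScaleRoi`, `IsEmptyRoi`, and `IsInvertedRoi` are hypothetical names.

```cpp
#include <cassert>

// Hypothetical stand-in for one row of ROITensor: [x1, y1, x2, y2].
struct RoiBox { float x1, y1, x2, y2; };

// Applies SpatialScaleX and SpatialScaleY to a ROI (assumed to be a plain
// per-axis multiplication).
RoiBox ScaleRoi(const RoiBox& roi, float spatialScaleX, float spatialScaleY)
{
    return { roi.x1 * spatialScaleX, roi.y1 * spatialScaleY,
             roi.x2 * spatialScaleX, roi.y2 * spatialScaleY };
}

// Assumed definition: both corners coincide, so every output pixel reads
// from the same input coordinate.
bool IsEmptyRoi(const RoiBox& roi)
{
    return roi.x1 == roi.x2 && roi.y1 == roi.y2;
}

// An inverted region mirrors the input along at least one axis.
bool IsInvertedRoi(const RoiBox& roi)
{
    return roi.x2 < roi.x1 || roi.y2 < roi.y1;
}
```

With both scales equal to 1.0, `ScaleRoi` returns the rectangle unchanged, which matches the statement that the rectangles then correspond directly to input tensor coordinates.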

BatchIndicesTensor

Type: const DML_TENSOR_DESC*

A tensor containing the batch indices to extract the ROIs from. The allowed dimensions of BatchIndicesTensor are { NumROIs }, { 1, NumROIs }, { 1, 1, NumROIs }, or { 1, 1, 1, NumROIs }. Each value is the index of a batch from InputTensor. The behavior is undefined if the values are not in the range [0, BatchCount). This is the same tensor that would be supplied to BatchIndicesTensor for DML_OPERATOR_ROI_ALIGN or DML_OPERATOR_ROI_ALIGN1.
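
Because out-of-range batch indices lead to undefined behavior, a caller may want to validate them on the CPU before uploading the tensor data. The following is a minimal sketch of such a check; `AreBatchIndicesValid` is a hypothetical helper, and the indices are assumed to be available as a host-side array of UINT32 values.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper, not a DirectML API. Returns true only if every batch
// index lies in the half-open range [0, batchCount).
bool AreBatchIndicesValid(const std::vector<std::uint32_t>& batchIndices,
                          std::uint32_t batchCount)
{
    for (std::uint32_t index : batchIndices)
    {
        if (index >= batchCount)
        {
            return false;
        }
    }
    return true;
}
```

For the example at the top of this topic, the indices [0, 0, 0, 0] with a BatchCount of 1 pass this check.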

OutputGradientTensor

Type: _Maybenull_ const DML_TENSOR_DESC*

An output tensor containing the backpropagated gradients with respect to InputTensor. Typically this tensor has the same sizes as the input of the corresponding DML_OPERATOR_ROI_ALIGN1 in the forward pass. If OutputROIGradientTensor is not supplied, then OutputGradientTensor must be supplied.

OutputROIGradientTensor

Type: _Maybenull_ const DML_TENSOR_DESC*

An output tensor containing the backpropagated gradients with respect to ROITensor. This tensor must have the same sizes as ROITensor. If OutputGradientTensor is not supplied, then OutputROIGradientTensor must be supplied.
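
The rules for the three nullable tensors (InputTensor, OutputGradientTensor, and OutputROIGradientTensor) can be summarized in a short pre-flight check. The sketch below deliberately uses stand-in types so that it compiles without the DirectML header; `TensorDescStandIn`, `ReduceFunctionStandIn`, `RoiAlignGradBindings`, and `HasValidOptionalTensors` are hypothetical names. Real code would use DML_TENSOR_DESC and DML_REDUCE_FUNCTION from directml.h instead.

```cpp
#include <cassert>

// Stand-in types for illustration only; they are not DirectML types.
struct TensorDescStandIn {};
enum class ReduceFunctionStandIn { Average, Max };

// Hypothetical subset of the operator description: only the members that
// take part in the nullability rules.
struct RoiAlignGradBindings
{
    const TensorDescStandIn* inputTensor;             // may be null
    const TensorDescStandIn* outputGradientTensor;    // may be null
    const TensorDescStandIn* outputRoiGradientTensor; // may be null
    ReduceFunctionStandIn reductionFunction;
};

// Returns true if the combination of optional tensors follows the rules
// stated in the member descriptions.
bool HasValidOptionalTensors(const RoiAlignGradBindings& b)
{
    // At least one of the two outputs must be supplied.
    if (b.outputGradientTensor == nullptr && b.outputRoiGradientTensor == nullptr)
    {
        return false;
    }

    // InputTensor is required for the ROI gradient and for max reduction.
    const bool needsInput = b.outputRoiGradientTensor != nullptr
        || b.reductionFunction == ReduceFunctionStandIn::Max;

    return !needsInput || b.inputTensor != nullptr;
}
```

A check of this kind only covers the nullability rules; it says nothing about tensor sizes or data types, which are constrained separately.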

ReductionFunction

Type: DML_REDUCE_FUNCTION

See DML_ROI_ALIGN1_OPERATOR_DESC::ReductionFunction.

InterpolationMode

Type: DML_INTERPOLATION_MODE

See DML_ROI_ALIGN1_OPERATOR_DESC::InterpolationMode.

SpatialScaleX

Type: FLOAT

See DML_ROI_ALIGN1_OPERATOR_DESC::SpatialScaleX.

SpatialScaleY

Type: FLOAT

See DML_ROI_ALIGN1_OPERATOR_DESC::SpatialScaleY.

InputPixelOffset

Type: FLOAT

See DML_ROI_ALIGN1_OPERATOR_DESC::InputPixelOffset.

OutputPixelOffset

Type: FLOAT

See DML_ROI_ALIGN1_OPERATOR_DESC::OutputPixelOffset.

MinimumSamplesPerOutput

Type: UINT

See DML_ROI_ALIGN1_OPERATOR_DESC::MinimumSamplesPerOutput.

MaximumSamplesPerOutput

Type: UINT

See DML_ROI_ALIGN1_OPERATOR_DESC::MaximumSamplesPerOutput.

AlignRegionsToCorners

Type: BOOL

See DML_ROI_ALIGN1_OPERATOR_DESC::AlignRegionsToCorners.

Remarks

Availability

This operator was introduced in DML_FEATURE_LEVEL_4_1.

Tensor constraints

InputGradientTensor, InputTensor, OutputGradientTensor, OutputROIGradientTensor, and ROITensor must have the same DataType.

Tensor support

DML_FEATURE_LEVEL_4_1 and above

Tensor	Kind	Supported dimension counts	Supported data types
InputTensor	Optional input	4	FLOAT32, FLOAT16
InputGradientTensor	Input	4	FLOAT32, FLOAT16
ROITensor	Input	2 to 4	FLOAT32, FLOAT16
BatchIndicesTensor	Input	1 to 4	UINT32
OutputGradientTensor	Optional output	4	FLOAT32, FLOAT16
OutputROIGradientTensor	Optional output	2 to 4	FLOAT32, FLOAT16

Requirements

Requirement	Value
Header	directml.h
