_mm256_maddsub_ps

[Note: This document describes a pre-release version of Visual Studio 2010 SP1 and may be revised in any later version.]

Visual Studio 2010 SP1 is required.

Microsoft Specific

Generates the FMA4 YMM instruction vfmaddsubps, which performs a fused floating-point multiply with alternating add and subtract on its sources, rounding each result only once.

__m256 _mm256_maddsub_ps (
   __m256 src1,
   __m256 src2,
   __m256 src3
);

Parameters

  • [in] src1
    A 256-bit parameter that contains eight 32-bit floating-point values.

  • [in] src2
    A 256-bit parameter that contains eight 32-bit floating-point values.

  • [in] src3
    A 256-bit parameter that contains eight 32-bit floating-point values.

Return value

A 256-bit result r that contains eight 32-bit floating-point values.

r[i] := src1[i] * src2[i] - src3[i]; // i even
r[i] := src1[i] * src2[i] + src3[i]; // i odd

Requirements

Intrinsic            Architecture
_mm256_maddsub_ps    FMA4

Header file <intrin.h>

Remarks

Each of the eight single-precision floating-point values in src1 is multiplied by the corresponding value in src2. For even-numbered elements (indexes 0, 2, 4, and 6), the corresponding value of src3 is subtracted from the product; for odd-numbered elements (indexes 1, 3, 5, and 7), it is added. Each result is stored in the corresponding element of the return value. Each multiply-add or multiply-subtract is rounded only once, at the end, as if the intermediate product were computed to infinite precision.
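The element pattern described above can be sketched in scalar C. This is only an illustrative model: the name maddsub_ps_ref is hypothetical and is not part of <intrin.h>. It keeps the product exact by computing it in double precision, but the final conversion to float rounds a second time, so in rare cases the last bit may differ from the single-rounded result of the instruction.

```c
/* Hypothetical scalar model of _mm256_maddsub_ps; not part of <intrin.h>. */
void maddsub_ps_ref(float r[8], const float src1[8],
                    const float src2[8], const float src3[8])
{
    int i;
    for (i = 0; i < 8; i++) {
        /* The product of two floats is exactly representable as a double. */
        double prod = (double)src1[i] * (double)src2[i];

        /* Even elements subtract src3; odd elements add it. */
        double sum = (i % 2 == 0) ? prod - (double)src3[i]
                                  : prod + (double)src3[i];

        /* Assumption: this narrowing is a second rounding step, so the
           result is a close model of the fused operation, not a bit-exact
           replacement for it. */
        r[i] = (float)sum;
    }
}
```

With src1 = {0, 1, ..., 7}, every element of src2 set to 2.0, and every element of src3 set to 3.0, this model produces -3, 5, 1, 9, 5, 13, 9, 17.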

The vfmaddsubps instruction is part of the FMA4 family of instructions. Before you use this intrinsic, you must ensure that the processor supports this instruction. To determine hardware support for this instruction, call the __cpuid intrinsic with InfoType = 0x80000001 and check bit 16 of CPUInfo[2] (ECX). This bit is 1 when the instruction is supported, and 0 otherwise.
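A minimal sketch of that check follows. The helper names fma4_bit_set and cpu_has_fma4 are hypothetical. The sketch assumes it is reasonable to first query function 0x80000000 to confirm that extended function 0x80000001 is available, and it tests only the FMA4 bit; it does not verify that the operating system has enabled YMM register state.

```c
#ifdef _MSC_VER
#include <intrin.h>   /* __cpuid */
#endif

/* Hypothetical helper: returns 1 if bit 16 of CPUInfo[2] (ECX) is set,
   given the four values returned for InfoType = 0x80000001. */
int fma4_bit_set(const int cpuInfo[4])
{
    return (int)(((unsigned)cpuInfo[2] >> 16) & 1u);
}

#ifdef _MSC_VER
/* Hypothetical wrapper: returns 1 if the processor reports FMA4 support. */
int cpu_has_fma4(void)
{
    int cpuInfo[4];

    /* Assumption: check the highest supported extended function first. */
    __cpuid(cpuInfo, 0x80000000);
    if ((unsigned)cpuInfo[0] < 0x80000001u)
        return 0;

    __cpuid(cpuInfo, 0x80000001);
    return fma4_bit_set(cpuInfo);
}
#endif
```

A program would call cpu_has_fma4() once at startup and use a non-FMA4 code path when it returns 0.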

Example

#include <stdio.h>
#include <intrin.h>

int main()
{
    __m256 a, b, c, d;
    int i;

    // Fill the sources: a = {0, 1, ..., 7}, b = all 2.0, c = all 3.0.
    for (i = 0; i < 8; i++) {
        a.m256_f32[i] = (float)i;
        b.m256_f32[i] = 2.0f;
        c.m256_f32[i] = 3.0f;
    }

    // Even elements compute a*b - c; odd elements compute a*b + c.
    d = _mm256_maddsub_ps(a, b, c);

    for (i = 0; i < 8; i++) printf_s(" %.3f", d.m256_f32[i]);
    printf_s("\n");
    return 0;
}

Output:

-3.000 5.000 1.000 9.000 5.000 13.000 9.000 17.000

See Also

Reference

_mm_maddsub_ps

_mm256_msubadd_ps

_mm256_maddsub_pd

__cpuid, __cpuidex

FMA4 Intrinsics Added for Visual Studio 2010 SP1

Change History

Date          History                Reason
March 2011    Added this content.    SP1 feature change.