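OpenMP Directives

Article
10/18/2023

Provides links to the directives used in the OpenMP API.

Visual C++ supports the following OpenMP directives.

For parallel work-sharing:

Directive	Description
parallel	Defines a parallel region, which is code that will be executed by multiple threads in parallel.
for	Causes the work done in a `for` loop inside a parallel region to be divided among threads.
sections	Identifies code sections to be divided among all threads.
single	Lets you specify that a section of code should be executed on a single thread, not necessarily the main thread.

For main thread and synchronization:

Directive	Description
master	Specifies that only the main thread should execute a section of the program.
critical	Specifies that code is executed on only one thread at a time.
barrier	Synchronizes all threads in a team; all threads pause at the barrier until all threads have reached it.
atomic	Specifies a memory location that will be updated atomically.
flush	Specifies that all threads have the same view of memory for all shared objects.
ordered	Specifies that code under a parallelized `for` loop should be executed like a sequential loop.

For data environment:

Directive	Description
threadprivate	Specifies that a variable is private to a thread.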

atomic

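Specifies a memory location that will be updated atomically.

#pragma omp atomic
   expression

Parameters

expression
The statement that has the lvalue whose memory location you want to protect against more than one write.

Remarks

The atomic directive supports no clauses.

For more information, see 2.6.4 atomic construct.

Example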

// omp_atomic.cpp
// compile with: /openmp
#include <stdio.h>
#include <omp.h>

#define MAX 10

int main() {
   int count = 0;
   #pragma omp parallel num_threads(MAX)
   {
      #pragma omp atomic
      count++;
   }
   printf_s("Number of threads: %d\n", count);
}

Number of threads: 10
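
The protected lvalue does not have to be a plain variable; it can also be an array element. The following is a minimal sketch, not part of the original sample: the helper name digit_histogram is hypothetical, and it assumes every input value is non-negative. Only the increment itself is atomic, so threads that update different buckets are not forced to wait on one another by a shared lock.

```cpp
#include <vector>

// Hypothetical helper: counts how often each last digit (0-9) occurs.
// Assumes every value is non-negative.
std::vector<int> digit_histogram(const std::vector<int>& values) {
    std::vector<int> hist(10, 0);
    int* counts = hist.data();                  // plain array view for the atomic update
    const int n = static_cast<int>(values.size());

    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        const int bucket = values[i] % 10;      // computed privately by each thread
        #pragma omp atomic
        counts[bucket]++;                       // only this update is atomic
    }
    return hist;
}
```

Calling digit_histogram with the values 1, 11, 21, 5, and 10 yields a count of 3 for digit 1 and a count of 1 for digits 0 and 5. The result is the same when the code is compiled without /openmp, because the pragmas are then ignored and the loop runs serially.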

barrier

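Synchronizes all threads in a team; all threads pause at the barrier until all threads have reached it.

#pragma omp barrier

Remarks

The barrier directive supports no clauses.

For more information, see 2.6.3 barrier directive.

Example

For a sample of how to use barrier, see master.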

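critical

Specifies that code is executed on only one thread at a time.

#pragma omp critical [(name)]
{
   code_block
}

Parameters

name
(Optional) A name to identify the critical code. The name must be enclosed in parentheses.

Remarks

The critical directive supports no clauses.

For more information, see 2.6.2 critical construct.

Example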

// omp_critical.cpp
// compile with: /openmp
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

#define SIZE 10

int main()
{
    int i;
    int max;
    int a[SIZE];

    for (i = 0; i < SIZE; i++)
    {
        a[i] = rand();
        printf_s("%d\n", a[i]);
    }

    max = a[0];
    #pragma omp parallel for num_threads(4)
        for (i = 1; i < SIZE; i++)
        {
            if (a[i] > max)
            {
                #pragma omp critical
                {
                    // compare a[i] and max again because max
                    // could have been changed by another thread after
                    // the comparison outside the critical section
                    if (a[i] > max)
                        max = a[i];
                }
            }
        }

    printf_s("max = %d\n", max);
}

41
18467
6334
26500
19169
15724
11478
29358
26962
24464
max = 29358
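
The optional name lets unrelated critical sections use separate locks. The sketch below is an illustration under stated assumptions, not part of the original sample: the Range type, the find_range function, and the section names merge_lo and merge_hi are all hypothetical, and the input is assumed to be non-empty. Each thread first computes a private minimum and maximum, then merges them into the shared result inside two differently named critical sections, so a thread merging its minimum does not block a thread merging its maximum.

```cpp
#include <vector>

// Hypothetical result type for this sketch.
struct Range {
    int lo;
    int hi;
};

// Assumes values is not empty.
Range find_range(const std::vector<int>& values) {
    Range r = { values[0], values[0] };
    const int n = static_cast<int>(values.size());

    #pragma omp parallel
    {
        // Each thread tracks its own extremes without any locking.
        int local_lo = values[0];
        int local_hi = values[0];

        #pragma omp for nowait
        for (int i = 1; i < n; i++) {
            if (values[i] < local_lo) local_lo = values[i];
            if (values[i] > local_hi) local_hi = values[i];
        }

        // Differently named critical sections do not exclude each other.
        #pragma omp critical (merge_lo)
        {
            if (local_lo < r.lo) r.lo = local_lo;
        }
        #pragma omp critical (merge_hi)
        {
            if (local_hi > r.hi) r.hi = local_hi;
        }
    }
    return r;
}
```

For the ten values printed by the sample above, find_range returns lo = 41 and hi = 29358.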

flush

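Specifies that all threads have the same view of memory for all shared objects.

#pragma omp flush [(var)]

Parameters

var
(Optional) A comma-separated list of variables that represent objects you want to synchronize. If var isn't specified, all memory is flushed.

Remarks

The flush directive supports no clauses.

For more information, see 2.6.5 flush directive.

Example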

// omp_flush.cpp
// compile with: /openmp
#include <stdio.h>
#include <omp.h>

void read(int *data) {
   printf_s("read data\n");
   *data = 1;
}

void process(int *data) {
   printf_s("process data\n");
   (*data)++;
}

int main() {
   int data;
   int flag;

   flag = 0;

   #pragma omp parallel sections num_threads(2)
   {
      #pragma omp section
      {
         printf_s("Thread %d: ", omp_get_thread_num( ));
         read(&data);
         #pragma omp flush(data)
         flag = 1;
         #pragma omp flush(flag)
         // Do more work.
      }

      #pragma omp section
      {
         while (!flag) {
            #pragma omp flush(flag)
         }
         #pragma omp flush(data)

         printf_s("Thread %d: ", omp_get_thread_num( ));
         process(&data);
         printf_s("data = %d\n", data);
      }
   }
}

Thread 0: read data
Thread 1: process data
data = 2

for

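Causes the work done in a for loop inside a parallel region to be divided among threads.

#pragma omp [parallel] for [clauses]
   for_statement

Parameters

clauses
(Optional) Zero or more clauses, see the Remarks section.

for_statement
A for loop. Undefined behavior results if user code in the for loop changes the index variable.

Remarks

The for directive supports the following clauses: private, firstprivate, lastprivate, reduction, ordered, schedule, and nowait.

If parallel is also specified, clauses can be any clause accepted by the parallel or for directives, except nowait.

For more information, see 2.4.1 for construct.

Example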

// omp_for.cpp
// compile with: /openmp
#include <stdio.h>
#include <math.h>
#include <omp.h>

#define NUM_THREADS 4
#define NUM_START 1
#define NUM_END 10

int main() {
   int i, nRet = 0, nSum = 0, nStart = NUM_START, nEnd = NUM_END;
   int nThreads = 0, nTmp = nStart + nEnd;
   unsigned uTmp = (unsigned((abs(nStart - nEnd) + 1)) *
                               unsigned(abs(nTmp))) / 2;
   int nSumCalc = uTmp;

   if (nTmp < 0)
      nSumCalc = -nSumCalc;

   omp_set_num_threads(NUM_THREADS);

   #pragma omp parallel default(none) private(i) shared(nSum, nThreads, nStart, nEnd)
   {
      #pragma omp master
      nThreads = omp_get_num_threads();

      #pragma omp for
      for (i=nStart; i<=nEnd; ++i) {
            #pragma omp atomic
            nSum += i;
      }
   }

   if  (nThreads == NUM_THREADS) {
      printf_s("%d OpenMP threads were used.\n", NUM_THREADS);
      nRet = 0;
   }
   else {
      printf_s("Expected %d OpenMP threads, but %d were used.\n",
               NUM_THREADS, nThreads);
      nRet = 1;
   }

   if (nSum != nSumCalc) {
      printf_s("The sum of %d through %d should be %d, "
               "but %d was reported!\n",
               NUM_START, NUM_END, nSumCalc, nSum);
      nRet = 1;
   }
   else
      printf_s("The sum of %d through %d is %d\n",
               NUM_START, NUM_END, nSum);

   return nRet;
}

4 OpenMP threads were used.
The sum of 1 through 10 is 55
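
The sample above protects the shared sum with an atomic update on every iteration. OpenMP also defines a reduction clause for the for directive, which gives each thread a private copy of the variable and combines the copies when the loop ends. The following is a minimal sketch of that alternative; the function name sum_range is hypothetical and does not appear in the original sample.

```cpp
// Hypothetical helper: adds the integers first..last (inclusive).
int sum_range(int first, int last) {
    int sum = 0;

    // Each thread accumulates into a private copy of sum;
    // the copies are added together at the end of the loop.
    #pragma omp parallel for reduction(+:sum) schedule(static)
    for (int i = first; i <= last; ++i) {
        sum += i;
    }
    return sum;
}
```

sum_range(1, 10) returns 55, matching the sample output. A reduction usually avoids the per-iteration synchronization cost of atomic, at the price of being limited to the operators the clause supports.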

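master

Specifies that only the main thread should execute a section of the program.

#pragma omp master
{
   code_block
}

Remarks

The master directive supports no clauses.

For more information, see 2.6.1 master construct.

To specify that a section of code should be executed on a single thread, not necessarily the main thread, use the single directive instead.

Example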

// compile with: /openmp
#include <omp.h>
#include <stdio.h>

int main( )
{
    int a[5], i;

    #pragma omp parallel
    {
        // Perform some computation.
        #pragma omp for
        for (i = 0; i < 5; i++)
            a[i] = i * i;

        // Print intermediate results.
        #pragma omp master
            for (i = 0; i < 5; i++)
                printf_s("a[%d] = %d\n", i, a[i]);

        // Wait.
        #pragma omp barrier

        // Continue with the computation.
        #pragma omp for
        for (i = 0; i < 5; i++)
            a[i] += i;
    }
}

a[0] = 0
a[1] = 1
a[2] = 4
a[3] = 9
a[4] = 16

ordered

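Specifies that code under a parallelized for loop should be executed like a sequential loop.

#pragma omp ordered
   structured-block

Remarks

The ordered directive must be within the dynamic extent of a for or parallel for construct with an ordered clause.

The ordered directive supports no clauses.

For more information, see 2.6.6 ordered construct.

Example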

// omp_ordered.cpp
// compile with: /openmp
#include <stdio.h>
#include <omp.h>

static float a[1000], b[1000], c[1000];

void test(int first, int last)
{
    #pragma omp for schedule(static) ordered
    for (int i = first; i <= last; ++i) {
        // Do something here.
        if (i % 2)
        {
            #pragma omp ordered
            printf_s("test() iteration %d\n", i);
        }
    }
}

void test2(int iter)
{
    #pragma omp ordered
    printf_s("test2() iteration %d\n", iter);
}

int main( )
{
    int i;
    #pragma omp parallel
    {
        test(1, 8);
        #pragma omp for ordered
        for (i = 0 ; i < 5 ; i++)
            test2(i);
    }
}

test() iteration 1
test() iteration 3
test() iteration 5
test() iteration 7
test2() iteration 0
test2() iteration 1
test2() iteration 2
test2() iteration 3
test2() iteration 4
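
A common use of ordered is to compute values in parallel but emit them in iteration order. The sketch below illustrates that pattern under stated assumptions: the function name squares_in_order is hypothetical, and the squaring step stands in for any independent per-iteration work. Only the statement under the ordered directive is serialized; the work before it can overlap across threads.

```cpp
#include <vector>

// Hypothetical helper: returns 0*0, 1*1, ..., (n-1)*(n-1) in that order.
std::vector<int> squares_in_order(int n) {
    std::vector<int> out;

    #pragma omp parallel for ordered schedule(dynamic)
    for (int i = 0; i < n; i++) {
        const int sq = i * i;     // independent work, may run concurrently

        #pragma omp ordered
        out.push_back(sq);        // executed one iteration at a time, in loop order
    }
    return out;
}
```

squares_in_order(5) returns the sequence 0, 1, 4, 9, 16 regardless of how iterations are distributed among threads.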

parallel

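Defines a parallel region, which is code that will be executed by multiple threads in parallel.

#pragma omp parallel [clauses]
{
   code_block
}

Parameters

clauses
(Optional) Zero or more clauses, see the Remarks section.

Remarks

The parallel directive supports the following clauses: if, private, firstprivate, default, shared, copyin, reduction, and num_threads.

parallel can also be used with the for and sections directives.

For more information, see 2.3 parallel construct.

Example

The following sample shows how to set the number of threads and define a parallel region. By default, the number of threads equals the number of logical processors on the machine. For example, a machine with one physical processor that has hyperthreading enabled has two logical processors and two threads. The order of output can vary on different machines.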

// omp_parallel.cpp
// compile with: /openmp
#include <stdio.h>
#include <omp.h>

int main() {
   #pragma omp parallel num_threads(4)
   {
      int i = omp_get_thread_num();
      printf_s("Hello from thread %d\n", i);
   }
}

Hello from thread 0
Hello from thread 1
Hello from thread 2
Hello from thread 3
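
OpenMP also defines an if clause for parallel, which makes the region run on a single thread when its condition is false. The following sketch shows the idea; the function name team_size is hypothetical, and the _OPENMP guards are an assumption added so that the code still builds and runs serially when OpenMP support is not enabled.

```cpp
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: reports how many threads executed the region.
int team_size(bool go_parallel) {
    int size = 1;

    // When go_parallel is false, the region is executed by one thread only.
    #pragma omp parallel if(go_parallel) num_threads(4)
    {
        #pragma omp master
        {
#ifdef _OPENMP
            size = omp_get_num_threads();   // written by the main thread only
#endif
        }
    }
    return size;
}
```

team_size(false) returns 1. team_size(true) returns at most 4; the exact value depends on the runtime and the machine.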

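sections

Identifies code sections to be divided among all threads.

#pragma omp [parallel] sections [clauses]
{
   #pragma omp section
   {
      code_block
   }
}

Parameters

clauses
(Optional) Zero or more clauses, see the Remarks section.

Remarks

The sections directive can contain zero or more section directives.

The sections directive supports the following clauses: private, firstprivate, lastprivate, reduction, and nowait.

If parallel is also specified, clauses can be any clause accepted by the parallel or sections directives, except nowait.

For more information, see 2.4.2 sections construct.

Example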

// omp_sections.cpp
// compile with: /openmp
#include <stdio.h>
#include <omp.h>

int main() {
    #pragma omp parallel sections num_threads(4)
    {
        printf_s("Hello from thread %d\n", omp_get_thread_num());
        #pragma omp section
        printf_s("Hello from thread %d\n", omp_get_thread_num());
    }
}

Hello from thread 0
Hello from thread 0
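
Sections suit a fixed number of independent tasks. The sketch below is an illustration only: the Totals type and the sum_and_product function are hypothetical names. Each section writes to a different variable, so no further synchronization is needed, and the runtime is free to run the two sections on different threads.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical result type for this sketch.
struct Totals {
    long sum;
    long product;
};

Totals sum_and_product(const std::vector<int>& values) {
    long sum = 0;
    long product = 1;

    #pragma omp parallel sections
    {
        #pragma omp section
        {
            // First task: touches only sum.
            for (std::size_t i = 0; i < values.size(); i++)
                sum += values[i];
        }
        #pragma omp section
        {
            // Second task: touches only product.
            for (std::size_t i = 0; i < values.size(); i++)
                product *= values[i];
        }
    }

    Totals t = { sum, product };
    return t;
}
```

For the values 1, 2, 3, and 4, sum_and_product returns a sum of 10 and a product of 24.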

single

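Lets you specify that a section of code should be executed on a single thread, not necessarily the main thread.

#pragma omp single [clauses]
{
   code_block
}

Parameters

clauses
(Optional) Zero or more clauses, see the Remarks section.

Remarks

The single directive supports the following clauses: private, firstprivate, copyprivate, and nowait.

For more information, see 2.4.3 single construct.

To specify that a section of code should be executed only on the main thread, use the master directive instead.

Example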

// omp_single.cpp
// compile with: /openmp
#include <stdio.h>
#include <omp.h>

int main() {
   #pragma omp parallel num_threads(2)
   {
      #pragma omp single
      // Only a single thread can read the input.
      printf_s("read input\n");

      // Multiple threads in the team compute the results.
      printf_s("compute results\n");

      #pragma omp single
      // Only a single thread can write the output.
      printf_s("write output\n");
   }
}

read input
compute results
compute results
write output
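
Unless nowait is specified, a single construct ends with an implied barrier, which makes it a convenient place for one-time setup that the rest of the team depends on. The following is a minimal sketch of that pattern; the function name scaled_copy is hypothetical. One thread sizes the output buffer, and the other threads start filling it only after the single block has finished.

```cpp
#include <vector>

// Hypothetical helper: returns a copy of input with every element multiplied by factor.
std::vector<int> scaled_copy(const std::vector<int>& input, int factor) {
    std::vector<int> out;
    const int n = static_cast<int>(input.size());

    #pragma omp parallel
    {
        // One thread allocates; the implied barrier at the end of single
        // keeps the other threads from writing before the buffer exists.
        #pragma omp single
        out.resize(input.size());

        // All threads share the work of filling distinct elements.
        #pragma omp for
        for (int i = 0; i < n; i++)
            out[i] = input[i] * factor;
    }
    return out;
}
```

For the input 1, 2, 3 and a factor of 2, scaled_copy returns 2, 4, 6.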

threadprivate

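Specifies that a variable is private to a thread.

#pragma omp threadprivate(var)

Parameters

var
A comma-separated list of variables that you want to make private to a thread. var must be either a global- or namespace-scoped variable or a local static variable.

Remarks

The threadprivate directive supports no clauses.

The threadprivate directive is based on the thread attribute using the __declspec keyword; limits on __declspec(thread) apply to threadprivate. For example, a threadprivate variable exists in any thread started in the process, not just those threads that are part of a thread team spawned by a parallel region. Be aware of this implementation detail; you may notice that constructors for a threadprivate user-defined type are called more often than expected.

You can use threadprivate in a DLL that is statically loaded at process startup. However, you can't use threadprivate in any DLL that is loaded via LoadLibrary, such as a DLL loaded with /DELAYLOAD (delay load import), which also uses LoadLibrary.

A threadprivate variable of a destructible type isn't guaranteed to have its destructor called. For example: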

struct MyType
{
    ~MyType();
};

MyType threaded_var;
#pragma omp threadprivate(threaded_var)
int main()
{
    #pragma omp parallel
    {}
}

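Users have no control over when the threads that make up the parallel region terminate. If those threads exist when the process exits, they aren't notified about the process exit, and the destructor isn't called for threaded_var on any thread except the one that exits (here, the main thread). So code shouldn't count on proper destruction of threadprivate variables.

For more information, see 2.7.1 threadprivate directive.

Example

For a sample of using threadprivate, see private.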
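
As a small local illustration, the sketch below keeps a per-thread call counter in a file-scope variable. It is not the sample from the private topic; the names calls_on_this_thread, record_call, and count_calls are hypothetical. Each thread increments its own copy without synchronization, and the copies are combined with an atomic update at the end of the region.

```cpp
// File-scope counter; every thread gets its own copy.
static int calls_on_this_thread = 0;
#pragma omp threadprivate(calls_on_this_thread)

// Touches only the calling thread's copy, so no locking is needed.
void record_call() {
    ++calls_on_this_thread;
}

// Hypothetical driver: runs record_call() the given number of times
// across the team and returns the combined count.
int count_calls(int iterations) {
    int total = 0;

    #pragma omp parallel
    {
        calls_on_this_thread = 0;           // reset this thread's copy

        #pragma omp for
        for (int i = 0; i < iterations; i++)
            record_call();

        #pragma omp atomic
        total += calls_on_this_thread;      // merge the per-thread counts
    }
    return total;
}
```

count_calls(100) returns 100 no matter how the iterations are split among threads.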
