Difference between #pragma omp parallel and #pragma omp parallel for


Question


I am new to OpenMP and I have been trying to run a program which adds two arrays using OpenMP. In the OpenMP tutorial, I have learned that we need to use #pragma omp parallel for while using OpenMP on the for loop. But I have also tried the same thing with #pragma omp parallel and it is also giving me the correct output. Below are the code snippets of what I am trying to convey.

#pragma omp parallel for
for(int i=0;i<n;i++)
{
    c[i]=a[i]+b[i];
}

and

#pragma omp parallel
{
    for(int i=0;i<n;i++)
    {
        c[i]=a[i]+b[i];
    }
}

What is the difference between these two?

Answer

The #pragma omp parallel directive creates a parallel region with a team of threads, where each thread executes the entire block of code that the parallel region encloses.

The OpenMP 5.1 specification gives a more formal description:

When a thread encounters a parallel construct, a team of threads is created to execute the parallel region (..). The thread that encountered the parallel construct becomes the primary thread of the new team, with a thread number of zero for the duration of the new parallel region. All threads in the new team, including the primary thread, execute the region. Once the team is created, the number of threads in the team remains constant for the duration of that parallel region.
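A minimal sketch of that behaviour, assuming a C compiler with OpenMP enabled (for example via -fopenmp). The helper name body_runs_parallel_only is hypothetical. It counts how many times the loop body runs when the loop sits inside a plain parallel region; because every thread executes the whole loop, the count should be n times the team size. The fallback stub only exists so the file still builds when OpenMP is disabled.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Fallback so the sketch builds without OpenMP (team of one thread). */
static int omp_get_num_threads(void) { return 1; }
#endif

/* Hypothetical helper: counts loop-body executions inside a plain
   parallel region and reports the team size through *team_size. */
long body_runs_parallel_only(int n, int *team_size)
{
    long runs = 0;
    int threads = 1;
    assert(n >= 0);

    /* Each thread gets a private copy of runs; copies are summed at the end. */
    #pragma omp parallel reduction(+:runs)
    {
        /* One thread records the team size; the others wait at the
           implicit barrier that ends the single construct. */
        #pragma omp single
        threads = omp_get_num_threads();

        /* No worksharing directive: every thread runs all n iterations. */
        for (int i = 0; i < n; i++)
            runs++;
    }

    *team_size = threads;
    return runs;
}
```

Calling body_runs_parallel_only(100, &team) with a team of 4 threads would be expected to return 400, not 100.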

The #pragma omp parallel for directive creates a parallel region (as described above) and assigns the iterations of the loop it encloses to the threads of that region, using the default chunk size and the default schedule, which is typically static. Bear in mind, however, that the default schedule may differ among implementations of the OpenMP standard.

The OpenMP 5.1 specification again gives a more formal description:

The worksharing-loop construct specifies that the iterations of one or more associated loops will be executed in parallel by threads in the team in the context of their implicit tasks. The iterations are distributed across threads that already exist in the team that is executing the parallel region to which the worksharing-loop region binds.

Moreover,

The parallel loop construct is a shortcut for specifying a parallel construct containing a loop construct with one or more associated loops and no other statements.

Or informally, #pragma omp parallel for is a combination of the #pragma omp parallel construct with #pragma omp for. In your case, this would mean that:

#pragma omp parallel for
for(int i=0;i<n;i++)
{
    c[i]=a[i]+b[i];
}

is semantically, and logically, the same as:

#pragma omp parallel
{
      #pragma omp for
      for(int i=0;i<n;i++)
       {  
            c[i]=a[i]+b[i];
       }
}
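The equivalence can be checked with a small sketch. The function names add_combined and add_split are hypothetical, and the code assumes a compiler with OpenMP enabled (without it the pragmas are ignored and both functions simply run serially).

```c
#include <assert.h>

/* Combined construct: creates the team and shares the loop in one directive. */
void add_combined(const double *a, const double *b, double *c, int n)
{
    assert(n >= 0);
    #pragma omp parallel for
    for (int i = 0; i < n; i++)
        c[i] = a[i] + b[i];
}

/* Split form: the parallel region creates the team, and the nested
   worksharing-loop directive divides the iterations among its threads. */
void add_split(const double *a, const double *b, double *c, int n)
{
    assert(n >= 0);
    #pragma omp parallel
    {
        #pragma omp for
        for (int i = 0; i < n; i++)
            c[i] = a[i] + b[i];
    }
}
```

Both functions should fill c with the same values for the same inputs. The split form becomes useful when the parallel region has to contain more than the single loop.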

TL;DR: In your example, with #pragma omp parallel for the loop will be parallelized among threads (i.e., the loop iterations will be divided among threads), whereas with #pragma omp parallel all threads will execute (in parallel) all the loop iterations.

To make it more illustrative, with 4 threads, #pragma omp parallel would result in something like:

[Figure not preserved]

whereas #pragma omp parallel for with chunk_size=1 and a static schedule would result in something like:

[Figure not preserved]

Code-wise the loop would be transformed to something logically similar to:

for(int i=omp_get_thread_num(); i < n; i+=omp_get_num_threads())
{  
    c[i]=a[i]+b[i];
}

where omp_get_thread_num() is described as:

"The omp_get_thread_num routine returns the thread number, within the current team, of the calling thread."

and omp_get_num_threads() as:

"Returns the number of threads in the current team. In a sequential section of the program omp_get_num_threads returns 1."

In other words, each thread runs for(int i = THREAD_ID; i < n; i += TOTAL_THREADS), with THREAD_ID ranging from 0 to TOTAL_THREADS - 1 and TOTAL_THREADS being the total number of threads in the team created for the parallel region.
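The cyclic distribution described above can be written by hand inside a plain parallel region. The sketch below is only an illustration of that logic, not what an OpenMP runtime literally generates. The helper add_cyclic and its owner array are hypothetical names, added so the assignment of indices to threads can be inspected afterwards.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Fallbacks so the sketch builds without OpenMP (team of one thread). */
static int omp_get_thread_num(void) { return 0; }
static int omp_get_num_threads(void) { return 1; }
#endif

/* Hypothetical helper: adds a and b into c using a hand-written cyclic
   distribution, stores in owner[i] the id of the thread that handled
   index i, and returns the team size. */
int add_cyclic(const int *a, const int *b, int *c, int *owner, int n)
{
    int team = 1;
    assert(n >= 0);

    #pragma omp parallel
    {
        int tid = omp_get_thread_num();       /* 0 .. nthreads - 1 */
        int nthreads = omp_get_num_threads(); /* size of this team */

        if (tid == 0)
            team = nthreads; /* only thread 0 writes the shared variable */

        /* Thread tid handles indices tid, tid + nthreads, tid + 2*nthreads, ... */
        for (int i = tid; i < n; i += nthreads) {
            c[i] = a[i] + b[i];
            owner[i] = tid;
        }
    }
    return team;
}
```

After the call, owner[i] should equal i % team for every index, which shows that each element was written by exactly one thread.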

"I have learned that we need to use #pragma omp parallel for while using OpenMP on the for loop. But I have also tried the same thing with #pragma omp parallel and it is also giving me the correct output."

It gives you the same output, because in your code:

 c[i]=a[i]+b[i];

Arrays a and b are only read, and c[i] is the only element being updated; its value does not depend on how many times iteration i is executed. Nevertheless, with #pragma omp parallel for each thread updates its own subset of the indices i, whereas with #pragma omp parallel all threads update the same indices, overwriting each other's values. Strictly speaking, several threads writing the same element without synchronization is a data race, even though here they all write the same value.

Now try to do the same with the following code:

#pragma omp parallel for
for(int i=0;i<n;i++)
{
    c[i]= c[i] + a[i] + b[i];
}

and

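#pragma omp parallel
{
    for(int i=0;i<n;i++)
    {
        c[i] = c[i] + a[i] + b[i];
    }
}

You will immediately notice the difference: in the second version every thread adds a[i] + b[i] to c[i], so the result depends on the number of threads, and the unsynchronized updates race with one another.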
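A hedged sketch of why an accumulating update exposes the difference. The function names accumulate_shared and accumulate_replicated are hypothetical. The replicated version adds an atomic update, which is not in the snippet above; it is there only to remove the data race so the outcome is deterministic and easy to check. Without it the result is unpredictable.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Fallback so the sketch builds without OpenMP (team of one thread). */
static int omp_get_num_threads(void) { return 1; }
#endif

/* Work-shared loop: each index is updated exactly once. */
void accumulate_shared(const int *a, const int *b, int *c, int n)
{
    assert(n >= 0);
    #pragma omp parallel for
    for (int i = 0; i < n; i++)
        c[i] = c[i] + a[i] + b[i];
}

/* Plain parallel region: every thread runs the full loop, so each index
   is updated once per thread. Returns the team size. */
int accumulate_replicated(const int *a, const int *b, int *c, int n)
{
    int team = 1;
    assert(n >= 0);

    #pragma omp parallel
    {
        #pragma omp single
        team = omp_get_num_threads();

        for (int i = 0; i < n; i++) {
            /* Assumption for this sketch: atomic keeps the updates race-free. */
            #pragma omp atomic
            c[i] += a[i] + b[i];
        }
    }
    return team;
}
```

Starting from c[i] = 0, accumulate_shared should leave a[i] + b[i] in c[i], while accumulate_replicated should leave team * (a[i] + b[i]), so the two agree only when the team has a single thread.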
