OpenMP threads "disobey" omp barrier

Views: 85 | Published: 2020/5/21 1:24:55 | Tag: openmp

Problem description

So this is the code:

#pragma omp parallel private (myId)
{
  set_affinity();

  myId = omp_get_thread_num(); 

  if (myId<myConstant)
  {
    #pragma omp for schedule(static,1)
    for(count = 0; count < AnotherConstant; count++)
      {
        //Do stuff, everything runs as it should
      }
  }

#pragma omp barrier //all threads wait as they should
#pragma omp single
 {
    //everything in here is executed by one thread as it should be
 }
   #pragma omp barrier //this is the barrier in which threads run ahead
   par_time(cc_time_tot, phi_time_tot, psi_time_tot);
   #pragma omp barrier
}
//do more stuff

Now to explain what's going on. At the start of my parallel region, myId is declared private so that every thread holds its own thread id. set_affinity() controls which thread runs on which core. The issue I have involves the #pragma omp for schedule(static,1).

The block:

  if (myId<myConstant)
  {
    #pragma omp for schedule(static,1)
    for(count = 0; count < AnotherConstant; count++)
      {
        //Do stuff, everything runs as it should
      }
  }

represents some work that I want to distribute over a certain number of threads, 0 through myConstant-1. On these threads I want to distribute the iterations of the loop evenly (in the manner that schedule(static,1) does). This is all performed correctly.

The code then enters a single region, and all commands in there are performed as they should be. But say I set myConstant to 2. If I then run with 3 or more threads, everything through the single region executes correctly, but threads with id 3 or greater do not wait for all the commands within the single to finish.

Within the single, some functions are called that create tasks, which are carried out by all threads. The threads with an id of 3 or more (in general, myConstant or more) continue on and execute par_time() while the other threads are still carrying out tasks created by the functions executed in the single. par_time() just prints some data for each thread.
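For reference, the pattern described above (one thread creates tasks inside a single, the whole team executes them, and the barrier waits for them) can be sketched roughly as follows. This is only an illustration under assumptions: fill_squares and its parameters are hypothetical names, not the asker's functions, and the task body is a placeholder. In a conforming program, all tasks generated in the region are complete once the team leaves the barrier implied at the end of the single.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the functions called inside the single:
 * one thread creates n tasks, any thread of the team may run them. */
void fill_squares(int n, int *out)
{
    assert(n >= 0 && out != NULL);

    #pragma omp parallel
    {
        #pragma omp single
        {
            for (int i = 0; i < n; i++) {
                /* Each task writes a distinct element, so there is no race. */
                #pragma omp task firstprivate(i) shared(out)
                out[i] = i * i;
            }
        } /* implied barrier: every task created above has finished here */

        /* Code placed here, such as a per-thread report, runs only
         * after all of the tasks are done. */
    }
}
```

Built without OpenMP support, the pragmas are ignored and the loop simply runs serially with the same result.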

If I comment out the #pragma omp for schedule(static,1) and have just a single thread execute the for loop (for instance, by changing the if statement to if(myId==0)), then everything works. So I'm not sure why the aforementioned threads are continuing onwards.

Let me know if anything is confusing; it's kind of a specific issue. I was looking to see if anyone saw a flaw in my flow control with OMP.

Recommended answer

If you look at the OpenMP V3.0 spec, section 2.5, Worksharing Constructs, it states:

The following restrictions apply to worksharing constructs:

Each worksharing region must be encountered by all threads in a team or by none at all.
The sequence of worksharing regions and barrier regions encountered must be the same for every thread in a team.

By having the worksharing for within the if, you have violated both of these restrictions, making your program non-conforming. According to the specification, a non-conforming OpenMP program has "unspecified" behavior.
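One way to stay within these restrictions, sketched here as an assumption and not as the only possible fix, is to drop the worksharing for and let the first few threads split the iterations by hand in the same cyclic pattern. The plain for loop is not a worksharing construct, so it may sit inside the if, and every thread then reaches the same barrier. The function name run_on_first_threads, its parameters, and the owner array are hypothetical and used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks so the sketch also builds without OpenMP enabled. */
static int omp_get_thread_num(void)  { return 0; }
static int omp_get_num_threads(void) { return 1; }
#endif

/* Hypothetical helper: records in owner[i] which thread handled
 * iteration i and returns how many threads shared the loop. */
int run_on_first_threads(int my_constant, int another_constant, int *owner)
{
    int workers = 1;
    assert(my_constant > 0 && another_constant >= 0 && owner != NULL);

    #pragma omp parallel shared(workers)
    {
        int my_id = omp_get_thread_num();
        int team  = omp_get_num_threads();
        /* Never use more workers than the team actually has. */
        int stride = (my_constant < team) ? my_constant : team;

        if (my_id < stride) {
            /* Plain loop, no worksharing construct: cyclic distribution. */
            for (int count = my_id; count < another_constant; count += stride)
                owner[count] = my_id;
        }

        /* Every thread in the team encounters this barrier. */
        #pragma omp barrier

        /* single is encountered by all threads, so it is conforming. */
        #pragma omp single
        workers = stride;
    }
    return workers;
}
```

Another option, if nested parallelism or a separate parallel region is acceptable, would be to run the loop in its own region with a num_threads clause, so that the whole team of that region encounters the omp for.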

As to which threads will be used to execute the for loop: with the schedule type "static,1", the first chunk of work (in this case count=0) is assigned to thread 0. The next chunk (count=1) is assigned to thread 1, and so on until all chunks are assigned. If there are more chunks than threads, assignment restarts at thread 0 in a round-robin fashion. You can read the exact wording in the OpenMP spec, section 2.5.1 Loop Construct, under the description of the schedule clause.
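The round-robin rule can be written down as a small helper. This is a sketch that assumes a loop counting from 0 in steps of 1; static_chunk_owner is a hypothetical name, not an OpenMP API call.

```c
#include <assert.h>

/* Hypothetical helper: the thread that owns iteration `iter` under
 * schedule(static, chunk) in a team of `num_threads` threads.
 * Chunks are handed out to threads 0, 1, 2, ... and then wrap around. */
int static_chunk_owner(int iter, int chunk, int num_threads)
{
    assert(iter >= 0 && chunk > 0 && num_threads > 0);
    return (iter / chunk) % num_threads;
}
```

With chunk set to 1 this reduces to iter % num_threads. Note that num_threads here is the size of the whole team in the parallel region, not myConstant, which is consistent with the answer's point that the loop is meant to be encountered by every thread in the team.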
