How does the omp ordered clause work?


Question

vector<int> v;

#pragma omp parallel for ordered schedule(dynamic, anyChunkSizeGreaterThan1)
    for (int i = 0; i < n; ++i){
            ...
            ...
            ...
#pragma omp ordered
            v.push_back(i);
    }

This fills v with an ordered list of n elements.

When a thread reaches the omp ordered block, all threads must wait for the thread holding the lowest pending iteration to finish, but what if no thread was assigned that specific iteration? Or does the OpenMP runtime library always make sure that the lowest iteration is handled by some thread?

Also, why is it suggested that the ordered clause be used together with a dynamic schedule? Would a static schedule hurt performance?

Answer

The ordered clause works like this: different threads execute concurrently until they encounter the ordered region, which is then executed sequentially in the same order as it would get executed in a serial loop. This still allows for some degree of concurrency, especially if the code section outside the ordered region has substantial run time.
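
The pattern from the question can be sketched as a complete function (the chunk size of 2 and the squaring stand in for the elided work and are my own choices): the ordered region serialises only the push_back, so v always comes out as 0, 1, ..., n-1 no matter how iterations are distributed among threads.

```cpp
#include <cassert>
#include <vector>

// Sketch of the question's loop. Only the ordered region runs sequentially,
// in iteration order; the work above it runs concurrently across threads.
std::vector<int> fill_ordered(int n) {
    std::vector<int> v;
    #pragma omp parallel for ordered schedule(dynamic, 2)
    for (int i = 0; i < n; ++i) {
        int work = i * i;   // stand-in for the code outside the ordered region
        (void)work;
        #pragma omp ordered
        v.push_back(i);     // executed in order 0, 1, ..., n-1
    }
    return v;
}
```

Compiled without OpenMP support the pragmas are ignored and the loop simply runs serially, which produces the same ordered result.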

There is no particular reason to prefer dynamic scheduling over static scheduling with a small chunk size. It all depends on the structure of the code. Since ordered introduces a dependency between the threads, with schedule(static) and the default chunk size the second thread would have to wait for the first one to finish all its iterations, then the third thread would have to wait for the second one to finish its iterations (and hence for the first one too), and so on. One can easily visualise this with 3 threads and 9 iterations (3 per thread):

tid  List of     Timeline
     iterations
0    0,1,2       ==o==o==o
1    3,4,5       ==.......o==o==o
2    6,7,8       ==..............o==o==o

= shows that the thread is executing code in parallel. o is when the thread is executing the ordered region. . is the thread being idle, waiting for its turn to execute the ordered region. With schedule(static,1) the following would happen:

tid  List of     Timeline
     iterations
0    0,3,6       ==o==o==o
1    1,4,7       ==.o==o==o
2    2,5,8       ==..o==o==o

I believe the difference between the two cases is more than obvious. With schedule(dynamic) the pictures above would become more or less random, since the list of iterations assigned to each thread is non-deterministic. It would also add extra overhead. Dynamic scheduling is only useful when the amount of computation differs between iterations and the computation takes far more time than the overhead that dynamic scheduling adds.
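
The "List of iterations" column in the diagrams above can be inspected at runtime. A minimal sketch (the helper name and the empty ordered region are my own; the serial fallback covers builds without OpenMP): record omp_get_thread_num() for each iteration under schedule(static,1), where with t threads iteration i lands on thread i % t.

```cpp
#include <cassert>
#include <vector>

#ifdef _OPENMP
#include <omp.h>
#else
static int omp_get_thread_num() { return 0; }  // serial fallback
#endif

// Record which thread executes each iteration under schedule(static,1),
// reproducing the iteration-to-thread mapping shown in the diagrams.
std::vector<int> iteration_owners(int n) {
    std::vector<int> owner(n, -1);
    #pragma omp parallel for ordered schedule(static, 1)
    for (int i = 0; i < n; ++i) {
        owner[i] = omp_get_thread_num();  // with t threads: owner[i] == i % t
        #pragma omp ordered
        { }  // empty ordered region, kept only to mirror the pattern discussed
    }
    return owner;
}
```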

Don't worry about the lowest numbered iteration. It is usually handed to the first thread in the team that becomes ready to execute code.
