Is there a difference between nested parallelism and collapsed for loops?


Question

I know that enabling nested parallelism allows a nested omp parallel for loop to be parallelized as well. However, I use collapse(2) on my nested for loops (a for inside a for) instead.

Is there a difference? Why or why not? Assume the best-case scenario: no dependence between the loop indices, and all other things being equal.

Recommended answer

Yes, there is a huge difference. Use collapse (the clause is spelled collapse, not collapsed). Do not use nested parallelism.

Nested parallelism means that independent teams of threads work on the different levels of worksharing. You can run into all sorts of trouble, either by oversubscribing the CPU cores with too many threads, or by leaving CPU cores idle because some threads belong to a team that has no work at the moment. It is rather hard to get decent performance out of nested parallelism. This is why you usually need to enable it explicitly.
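To make the nested case concrete, here is a minimal sketch in C. It is not from the original answer: the function name count_nested and the counting workload are hypothetical, chosen only so the result is easy to check. Each thread of the outer team that runs an iteration of i opens its own inner team for the j loop, which is how the thread count can multiply and oversubscribe the cores. The sketch assumes an OpenMP 3.0 or later runtime for omp_set_max_active_levels; without OpenMP the pragmas are ignored and the code runs serially.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical example: counts the iterations of a rows x cols loop nest
   using two nested parallel regions. */
long count_nested(int rows, int cols)
{
    long total = 0;

#ifdef _OPENMP
    /* Nested parallelism is typically off by default; allow two active levels. */
    omp_set_max_active_levels(2);
#endif

    /* Outer team: shares the iterations of i. */
    #pragma omp parallel for reduction(+:total)
    for (int i = 0; i < rows; i++) {
        long row_total = 0;

        /* Inner team: a separate team is created by whichever outer
           thread executes this iteration of i. */
        #pragma omp parallel for reduction(+:row_total)
        for (int j = 0; j < cols; j++) {
            row_total += 1;
        }
        total += row_total;
    }
    return total;
}
```

With GCC or Clang this would be compiled with -fopenmp. The function returns rows * cols regardless of how many threads run; what changes is how many teams and threads the runtime has to create and schedule to get there.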

Collapsing loops, on the other hand, means that the loops are joined at the worksharing level, so their iterations form one combined iteration space. This allows a single team of threads (usually with as many threads as there are available CPU cores) to work efficiently on the iterations of all the loops.
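A minimal sketch of the collapsed variant in C follows. It is an illustration rather than code from the original answer: the function name fill_collapsed and the matrix-filling workload are hypothetical. It assumes the two loops are perfectly nested with a rectangular iteration space, which is what the collapse clause expects.

```c
#include <stddef.h>

/* Hypothetical example: fills a rows x cols row-major matrix so that
   out[i * cols + j] holds its own flat index. */
void fill_collapsed(int rows, int cols, double *out)
{
    /* collapse(2) merges the i and j loops into a single iteration space
       of rows * cols iterations, shared by one team of threads. */
    #pragma omp parallel for collapse(2)
    for (int i = 0; i < rows; i++) {
        for (int j = 0; j < cols; j++) {
            out[(size_t)i * cols + j] = (double)i * cols + j;
        }
    }
}
```

Only one parallel region is created here, so no nesting needs to be enabled and the thread count stays at the size of that single team. Collapsing helps most when the outer loop alone has too few iterations to keep every thread busy.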
