OpenMP and nested parallelism
Question
I would like to "nest" parallel for loops using OpenMP. Here is a toy example:
#include <iostream>
#include <cmath>

void subproblem(int m) {
    #pragma omp parallel for
    for (int j{0}; j < m; ++j) {
        double sum{0.0};
        for (int k{0}; k < 10000000; ++k) {
            sum += std::cos(static_cast<double>(k));
        }
        #pragma omp critical
        { std::cout << "Sum: " << sum << std::endl; }
    }
}

int main(int argc, const char *argv[]) {
    int n{2};
    int m{8};
    #pragma omp parallel for
    for (int i{0}; i < n; ++i) {
        subproblem(m);
    }
    return 0;
}
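Here is what I want:
- If n >= (number of cores on my machine), I want only the first loop to be parallelized.
- If n < (number of cores on my machine), I want OpenMP to launch threads in the inner loop as well, but the total number of threads must not exceed the number of cores on my machine.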
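To make the two conditions concrete, the thread budget being asked for can be written as plain arithmetic. This is only a sketch of one possible reading of the requirement: `ThreadPlan` and `plan_threads` are hypothetical names, not part of OpenMP, and the core count is passed in as a parameter rather than queried.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper type: how many threads each loop level would get.
struct ThreadPlan {
    int outer;  // threads for the loop over i
    int inner;  // threads per subproblem (1 means no nested parallelism)
};

// Splits a budget of `cores` threads between the outer and inner loops
// so that outer * inner never exceeds `cores`.
ThreadPlan plan_threads(int n, int cores) {
    assert(n > 0 && cores > 0);
    // Never use more outer threads than there are outer iterations or cores.
    const int outer = std::min(n, cores);
    // Give each outer thread an equal share of what is left, at least 1.
    const int inner = std::max(1, cores / outer);
    return ThreadPlan{outer, inner};
}
```

With n = 2 on 8 cores this gives 2 outer threads with 4 inner threads each. With n >= 8 it gives 8 outer threads and an inner count of 1, which corresponds to the first case. When the core count is not a multiple of n (for example n = 3 on 8 cores) some cores stay idle.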
So far, I have only found solutions that either disable nested parallelism or always allow it, but I am looking for a way to enable it only when the number of threads launched is below the number of cores.
Is there an OpenMP solution for that using tasks?
Recommended answer
Rather than using a pair of nested parallel regions, you can tell OpenMP to "collapse" the nested loops into a single parallel loop over the n*m iteration space:
#pragma omp parallel for collapse(2)
for (int i{0}; i < n; ++i) {
    for (int j{0}; j < m; ++j) {
        // ...
    }
}
This lets OpenMP divide the work appropriately regardless of the relative values of n and m.
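As a minimal sketch of how the toy code might look after this change: `collapse(2)` applies to loops that are directly nested, so the body of `subproblem` has to be inlined into the inner loop instead of being called from the outer one. The names `partial_sum` and `run_collapsed` and the `iterations` parameter are assumptions introduced for illustration. Results are stored in a vector rather than printed, so no critical section is needed.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// The per-(i, j) work item from the toy code, with no shared state.
double partial_sum(int iterations) {
    double sum{0.0};
    for (int k{0}; k < iterations; ++k) {
        sum += std::cos(static_cast<double>(k));
    }
    return sum;
}

// Runs all n*m work items in a single parallel loop and returns the sums.
// Compiled without OpenMP support, the pragma is ignored and the loops
// simply run serially.
std::vector<double> run_collapsed(int n, int m, int iterations) {
    assert(n >= 0 && m >= 0);
    std::vector<double> results(static_cast<std::size_t>(n) * m);
    #pragma omp parallel for collapse(2)
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < m; ++j) {
            // Each (i, j) pair writes to its own slot, so there is no race.
            results[static_cast<std::size_t>(i) * m + j] = partial_sum(iterations);
        }
    }
    return results;
}
```

Calling `run_collapsed(2, 8, 10000000)` would reproduce the 16 sums of the original program. Because the runtime sees one loop of n*m iterations, it uses its default thread count for a single parallel region, and no nested region is created.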