How to correctly parallelize nested for loops

Question

I'm working with OpenMP to parallelize a scalar nested for loop:

double P[N][N];
double x=0.0,y=0.0;

for (int i=0; i<N; i++)
{
    for (int j=0; j<N; j++)
    {
        P[i][j]=someLongFunction(x,y);
        y+=1;
    }
    x+=1;
}

The important constraint is that the matrix P must be identical in the scalar and parallel versions.

None of my attempts so far have succeeded.

Recommended answer

The problem here is that you have added iteration-to-iteration dependencies with:

x+=1;
y+=1;

As the code stands, it is therefore not parallelizable: each iteration needs the values of x and y left behind by the previous one. Attempting to parallelize it anyway will produce incorrect results, as you are probably seeing.

Fortunately, in your case, you can compute x and y directly from the loop indices, which removes the dependency:

for (int i=0; i<N; i++)
{
    for (int j=0; j<N; j++)
    {
        P[i][j]=someLongFunction((double)i, (double)N*i + j);
    }
}
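
As a sanity check, the rewritten loop can be compared against the original accumulating loop. The sketch below assumes a small fixed N and uses a hypothetical stand-in for someLongFunction, since the real one is not shown in the question; count_mismatches is likewise an illustrative helper name.

```c
#include <stddef.h>

#define N 64  /* assumed size, for illustration only */

/* Hypothetical stand-in: the question does not show someLongFunction. */
static double someLongFunction(double x, double y)
{
    return 0.5 * x + 0.25 * y;
}

static double P_serial[N][N];
static double P_direct[N][N];

/* Fill P_serial with the original loop (running x and y), fill P_direct
   using arguments computed from the indices, and return how many
   entries differ between the two. */
size_t count_mismatches(void)
{
    double x = 0.0, y = 0.0;
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            /* y is never reset between rows, so here y == N*i + j */
            P_serial[i][j] = someLongFunction(x, y);
            y += 1;
        }
        x += 1;
    }

    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            P_direct[i][j] = someLongFunction((double)i, (double)N * i + j);

    size_t mismatches = 0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            if (P_serial[i][j] != P_direct[i][j])
                mismatches++;
    return mismatches;
}
```

With these assumptions, count_mismatches() should return 0: small integer values are represented exactly in a double, so repeated += 1 and a direct cast yield the same arguments. This would not necessarily hold for a non-integer step such as += 0.1, where accumulated rounding can differ from a single multiplication.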

Now you can try throwing an OpenMP pragma over this and see if it works:

#pragma omp parallel for
for (int i=0; i<N; i++)
{
    for (int j=0; j<N; j++)
    {
        P[i][j]=someLongFunction((double)i, (double)N*i + j);
    }
}
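
If N is small relative to the number of threads, parallelizing only the outer loop may leave some threads idle. One option, sketched here under the assumption of a compiler supporting OpenMP 3.0 or later, is the collapse(2) clause, which merges both loops into a single iteration space. N, the stand-in someLongFunction, and the helper names fill_parallel and matches_serial are illustrative and not from the question.

```c
#define N 64  /* assumed size, for illustration only */

/* Hypothetical stand-in for the function in the question. */
static double someLongFunction(double x, double y)
{
    return 0.5 * x + 0.25 * y;
}

static double P[N][N];

/* Each (i, j) entry depends only on its own indices, so iterations can
   run in any order. If OpenMP is not enabled at compile time, the
   pragma is ignored and the loop simply runs serially. */
void fill_parallel(void)
{
    #pragma omp parallel for collapse(2)
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            P[i][j] = someLongFunction((double)i, (double)N * i + j);
        }
    }
}

/* Returns 1 if P matches what the original serial loop would produce,
   0 otherwise. */
int matches_serial(void)
{
    double x = 0.0, y = 0.0;
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            if (P[i][j] != someLongFunction(x, y))
                return 0;
            y += 1;
        }
        x += 1;
    }
    return 1;
}
```

Build with the compiler's OpenMP flag (for example -fopenmp with GCC), call fill_parallel(), then call matches_serial() to confirm that the parallel result equals the serial one.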
