Confused about firstprivate and threadprivate in OpenMP context

Question

Say I have packed some resources into an object and then perform some computation based on those resources. What I normally do is initialise the object outside the parallel region and then use the firstprivate clause:

int main()
{
    // initialize Widget objs
    Widget Widobj{params1, params2, params3...};

    #pragma omp parallel for firstprivate(Widobj)
    for (int i = 0; i < N; ++i)
    {
        // computation based on resources in Widobj
    }
}

I think that in this case each thread deals with the resources in Widobj independently, and I suppose each thread has its own copy of Widobj (probably a deep copy, am I right?). Now I am confused by the other keyword, threadprivate. How does threadprivate work in this context? The two seem very similar to me.

Recommended answer

When an object is declared firstprivate, the copy constructor is called, whereas when private is used the default constructor is called. We'll address threadprivate below. Proof (Intel C++ 15.0):

#include <iostream>
#include <omp.h>

class myclass {
    int _n;
public:
    myclass(int n) : _n(n) { std::cout << "int c'tor\n"; }

    myclass() : _n(0) { std::cout << "def c'tor\n"; }

    myclass(const myclass & other) : _n(other._n)
    { std::cout << "copy c'tor\n"; }

    ~myclass() { std::cout << "bye bye\n"; }

    void print() { std::cout << _n << "\n"; }

    void add(int t) { _n += t; }
};

myclass globalClass;

#pragma omp threadprivate (globalClass)

int main(int argc, char* argv[])
{
    std::cout << "\nBegninning main()\n";

    myclass inst(17);

    std::cout << "\nEntering parallel region #0 (using firstprivate)\n";
#pragma omp parallel firstprivate(inst)
    {
        std::cout << "Hi\n";
    }

    std::cout << "\nEntering parallel region #1 (using private)\n";
#pragma omp parallel private(inst)
    {
        std::cout << "Hi\n";
    }

    std::cout << "\nEntering parallel region #2 (printing the value of "
                    "the global instance(s) and adding the thread number)\n";
#pragma omp parallel
    {
        globalClass.print();
        globalClass.add(omp_get_thread_num());
    }

    std::cout << "\nEntering parallel region #3 (printing the global instance(s))\n";
#pragma omp parallel
    {
        globalClass.print();
    }

    std::cout << "\nAbout to leave main()\n";
    return 0;
}

gives

def c'tor

Begninning main()
int c'tor

Entering parallel region #0 (using firstprivate)
copy c'tor
Hi
bye bye
copy c'tor
Hi
bye bye
copy c'tor
Hi
bye bye
copy c'tor
Hi
bye bye

Entering parallel region #1 (using private)
def c'tor
Hi
bye bye
def c'tor
Hi
bye bye
def c'tor
Hi
bye bye
def c'tor
Hi
bye bye

Entering parallel region #2 (printing the value of the global instance(s) and adding the thread number)
def c'tor
0
def c'tor
0
def c'tor
0
0

Entering parallel region #3 (printing the global instance(s))
0
1
2
3

About to leave main()
bye bye
bye bye

If the copy constructor does a deep copy (which it should if you have to write your own, and which the default one effectively does if the class has no dynamically allocated data), then you get a deep copy of your object. This is as opposed to private, which doesn't initialize the private copy from an existing object.

threadprivate works totally differently. To start with, it applies only to global or static variables. Even more importantly, it is a directive in its own right and takes no clauses. You write the threadprivate pragma line after the variable's declaration, and later the usual #pragma omp parallel before the parallel block. There are other differences (where in memory the object is stored, etc.), but that's a good start.

Let's analyze the above output. First, note that on entering region #2 the default constructor is called, creating a new copy of the global variable private to each of the other threads. This is because those threads' copies of the global variable do not exist until then; only the master thread's copy was constructed before main() (the first def c'tor line).

Next, what NoseKnowsAll considers the most crucial difference: thread-private global variables persist across different parallel regions. In region #3 there is no construction, and we see that the OMP thread number added in region #2 is retained. Also note that no destructor is called in regions #2 and #3, but rather after leaving main() (and, for some reason, only for one (master) copy; the other "bye bye" is inst. This may be a bug...).

This brings us to why I used the Intel compiler. Visual Studio 2013 as well as g++ (4.6.2 on my computer, Coliru (g++ v5.2), codingground (g++ v4.9.2)) allow only POD types for threadprivate variables (source). This has been listed as a bug for almost a decade and still has not been fully addressed. The error given by Visual Studio is

error C3057: 'globalClass' : dynamic initialization of 'threadprivate' symbols is not currently supported

and the error given by g++ is

error: 'globalClass' declared 'threadprivate' after first use

The Intel compiler works with classes.

One more note. If you want to copy the value of the master thread variable you can use #pragma omp parallel copyin(globalVarName). Note that this does not work with classes as in our example above (hence I left it out).

Sources: OMP tutorial: private, firstprivate, threadprivate
