Confused about firstprivate and threadprivate in OpenMP context
Question
Say I have packed some resources in an object and then perform some computation based on those resources. What I normally do is initialise the object outside the parallel region and then use the firstprivate clause:
int main()
{
    // initialize Widget objs
    Widget Widobj{params1,params2,params3...};
    #pragma omp parallel for firstprivate(Widobj)
    for (int i=0; i< N; ++i)
    {
        // computation based on resources in Widobj
    }
}
I think that in this case each thread deals with the resources in Widobj independently, and I suppose each thread has its own copy of Widobj (probably a deep copy, am I right?). Now I am confused by the other keyword, threadprivate. How does threadprivate work in this context? The two seem very similar to me.
Recommended answer
When an object is declared firstprivate, the copy constructor is called, whereas when private is used the default constructor is called. We'll address threadprivate below. Proof (Intel C++ 15.0):
#include <iostream>
#include <omp.h>

class myclass {
    int _n;
public:
    myclass(int n) : _n(n) { std::cout << "int c'tor\n"; }
    myclass() : _n(0) { std::cout << "def c'tor\n"; }
    myclass(const myclass & other) : _n(other._n)
        { std::cout << "copy c'tor\n"; }
    ~myclass() { std::cout << "bye bye\n"; }
    void print() { std::cout << _n << "\n"; }
    void add(int t) { _n += t; }
};

myclass globalClass;
#pragma omp threadprivate (globalClass)

int main(int argc, char* argv[])
{
    std::cout << "\nBegninning main()\n";
    myclass inst(17);

    std::cout << "\nEntering parallel region #0 (using firstprivate)\n";
    #pragma omp parallel firstprivate(inst)
    {
        std::cout << "Hi\n";
    }

    std::cout << "\nEntering parallel region #1 (using private)\n";
    #pragma omp parallel private(inst)
    {
        std::cout << "Hi\n";
    }

    std::cout << "\nEntering parallel region #2 (printing the value of "
                 "the global instance(s) and adding the thread number)\n";
    #pragma omp parallel
    {
        globalClass.print();
        globalClass.add(omp_get_thread_num());
    }

    std::cout << "\nEntering parallel region #3 (printing the global instance(s))\n";
    #pragma omp parallel
    {
        globalClass.print();
    }

    std::cout << "\nAbout to leave main()\n";
    return 0;
}
gives
def c'tor
Begninning main()
int c'tor
Entering parallel region #0 (using firstprivate)
copy c'tor
Hi
bye bye
copy c'tor
Hi
bye bye
copy c'tor
Hi
bye bye
copy c'tor
Hi
bye bye
Entering parallel region #1 (using private)
def c'tor
Hi
bye bye
def c'tor
Hi
bye bye
def c'tor
Hi
bye bye
def c'tor
Hi
bye bye
Entering parallel region #2 (printing the value of the global instance(s) and adding the thread number)
def c'tor
0
def c'tor
0
def c'tor
0
0
Entering parallel region #3 (printing the global instance(s))
0
1
2
3
About to leave main()
bye bye
bye bye
If the copy constructor does a deep copy (which it should if you have to write your own, and which the implicitly generated one does as long as the class holds no dynamically allocated data through raw pointers), then you get a deep copy of your object. This is as opposed to private, which doesn't initialize the private copy from an existing object.
threadprivate works totally differently. To start with, it's only for global or static variables. Even more critical, it's a directive in and of itself and takes no clauses. You write the threadprivate pragma line somewhere after the variable's declaration, and later the #pragma omp parallel before the parallel block. There are other differences (where in memory the object is stored, etc.), but that's a good start.
Let's analyze the above output. First, note that on entering region #2 the default constructor is called, creating a new copy of the global variable private to each thread. This is because the thread-private copies don't yet exist when the variable is first used in a parallel region. Only three constructions appear for the four values printed because the master thread reuses the instance constructed before main().
Next, as NoseKnowsAll considers the most crucial difference, the thread-private global variables persist across different parallel regions. In region #3 there is no construction, and we see that the OMP thread number added in region #2 is retained. Also note that no destructor is called in regions #2 and #3, but rather after leaving main() (and for some reason only for one (master) copy; the other "bye bye" is inst. This may be a bug...).
This brings us to why I used the Intel compiler. Visual Studio 2013 as well as g++ (4.6.2 on my computer, Coliru (g++ v5.2), codingground (g++ v4.9.2)) allow only POD types (source). This has been listed as a bug for almost a decade and still hasn't been fully addressed. The Visual Studio error given is
error C3057: 'globalClass' : dynamic initialization of 'threadprivate' symbols is not currently supported
and the error given by g++ is
error: 'globalClass' declared 'threadprivate' after first use
The Intel compiler works with classes.
One more note. If you want to copy the value of the master thread's variable into the other threads' copies, you can use #pragma omp parallel copyin(globalVarName). Note that this does not work with classes as in our example above (hence I left it out).
Sources: OMP tutorial: private, firstprivate, threadprivate