OpenMP shared vs. firstprivate performance-wise
Question
I have a #pragma omp parallel for loop inside a class method. Each thread has read-only access to a few method-local variables, a few private data members of the class, and one of the method's parameters. All of them are declared in a shared clause.
My questions:
- Performance-wise, it should not make any difference whether these variables are declared shared or firstprivate. Right?
- Is the same true if I'm not careful about keeping the variables from sharing the same cache line?
- If one of the shared variables is a pointer and inside the thread I read a value through it, is there an aliasing problem like in ordinary loops?
Tomorrow I will try to profile my code. In the meantime, thanks for your advice!
Recommended answer
Well, they're not the same thing. With shared, the variables are shared between all the threads. With firstprivate, each thread gets its own copy. If you're only reading the variable, then it's better to leave it as shared so as to avoid copying it. (In C++, firstprivate will implicitly invoke the copy constructor.)
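To make the copying visible, here is a minimal sketch. The Params type, its copy counter, and both function names are hypothetical and invented for illustration; they do not come from the question. The object is read-only inside the loop in both versions, and the only difference is the data-sharing clause.

```cpp
#include <atomic>
#include <cassert>
#include <vector>

// Hypothetical read-only parameter object that counts its copy constructions.
struct Params {
    static std::atomic<int> copies;
    double scale = 2.0;
    Params() = default;
    Params(const Params& other) : scale(other.scale) { ++copies; }
};
std::atomic<int> Params::copies{0};

struct Result {
    double sum;  // scaled sum of the input
    int copies;  // copy constructions observed during the loop
};

// p is shared: every thread reads the single original object.
Result scaled_sum_shared(const std::vector<double>& data) {
    Params p;
    Params::copies = 0;
    double sum = 0.0;
    const long n = static_cast<long>(data.size());
    #pragma omp parallel for shared(p) reduction(+:sum)
    for (long i = 0; i < n; ++i) {
        sum += p.scale * data[i];
    }
    assert(Params::copies == 0);  // sharing never invokes the copy constructor
    return {sum, Params::copies.load()};
}

// p is firstprivate: each thread copy-constructs its own Params from the original.
Result scaled_sum_firstprivate(const std::vector<double>& data) {
    Params p;
    Params::copies = 0;
    double sum = 0.0;
    const long n = static_cast<long>(data.size());
    #pragma omp parallel for firstprivate(p) reduction(+:sum)
    for (long i = 0; i < n; ++i) {
        sum += p.scale * data[i];
    }
    return {sum, Params::copies.load()};
}
```

Both functions should return the same sum. When built with OpenMP enabled (for example -fopenmp on GCC or Clang), the firstprivate version is expected to report about one copy per thread, while the shared version reports none. For a small trivially copyable value the cost is negligible; it starts to matter when the copy constructor is expensive. Without the OpenMP flag the pragmas are ignored and no copies are made in either version.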
Correct. Multiple threads reading and writing values that sit on the same cache line is called false sharing. The cache line bounces back and forth between the cores that access it, which can cause a significant slowdown if it happens often enough.
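A common way to avoid false sharing when threads do write is to give each thread's data its own cache line. The sketch below assumes a 64-byte cache line and at most 8 threads; both constants, the PaddedCounter type, and the function names are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

constexpr std::size_t kCacheLine = 64;  // assumed cache-line size in bytes
constexpr int kSlots = 8;               // assumed upper bound on the thread count

// alignas pads each counter so that no two counters share a cache line.
struct alignas(kCacheLine) PaddedCounter {
    long value = 0;
};
static_assert(sizeof(PaddedCounter) == kCacheLine, "one counter per cache line");

// Returns the calling thread's index, or 0 when built without OpenMP.
inline int thread_id() {
#ifdef _OPENMP
    return omp_get_thread_num();
#else
    return 0;
#endif
}

// Counts even values; each thread increments only its own padded slot.
long count_even(const std::vector<int>& data) {
    PaddedCounter counters[kSlots];
    const long n = static_cast<long>(data.size());
    #pragma omp parallel for num_threads(kSlots)
    for (long i = 0; i < n; ++i) {
        const int t = thread_id();
        assert(t >= 0 && t < kSlots);
        if (data[i] % 2 == 0) {
            ++counters[t].value;
        }
    }
    long total = 0;
    for (const PaddedCounter& c : counters) {
        total += c.value;
    }
    return total;
}
```

With a plain long counters[kSlots] array, up to eight counters would sit on one 64-byte line and every increment would invalidate that line in the other cores' caches. In real code a reduction(+:total) clause is usually the simpler choice; the explicit slots are shown here only to make the memory layout visible.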
If you're just reading data through the shared pointer, then there shouldn't be a problem. But if you're also writing to it, then you need to make sure you don't have a race condition.
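The two cases can be sketched as follows. The function names and signatures are hypothetical, and the first function assumes that in and out do not overlap.

```cpp
#include <cassert>

// Reads through shared pointers are safe. The writes are also race-free here
// because each iteration stores to a distinct element of out.
// Assumption: in and out point to non-overlapping arrays of length n.
void scale_into(const double* in, double* out, long n, const double* factor) {
    assert(in != nullptr && out != nullptr && factor != nullptr);
    #pragma omp parallel for shared(in, out, factor)
    for (long i = 0; i < n; ++i) {
        out[i] = (*factor) * in[i];
    }
}

// Every iteration updates the same variable, so an unprotected
// total += values[i] would be a race. The reduction clause gives each
// thread a private partial sum and combines them at the end.
long sum_values(const int* values, long n) {
    long total = 0;
    #pragma omp parallel for shared(values) reduction(+:total)
    for (long i = 0; i < n; ++i) {
        total += values[i];
    }
    return total;
}
```

In both functions the pointers themselves are only read, so leaving them shared is fine. What needs care is the memory they point to whenever more than one thread can write to the same location.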