Preventing out-of-thin-air values with a memory barrier in C++
Question
Consider the following two-thread concurrent program in C++, where x and y are globals, r1 and r2 are thread-local, and a store or load of an int is atomic (memory model = C++11):
int x = 0, y = 0
r1 = x | r2 = y
y = r1 | x = r2
A compiler is allowed to compile it as:
int x = 0, y = 0
r1 = x | r2 = 42
y = r1 | x = r2
| if(y != 42)
| x = r2 = y
And while this is intra-thread consistent, it can produce wild results, because an execution of that program can end with (x, y) = (42, 42).
This is called the out-of-thin-air values problem. It exists and we have to live with it.
My question is: does a memory barrier prevent a compiler from doing wild optimizations that result in out-of-thin-air values?
For example:
[fence] = atomic_thread_fence(memory_order_seq_cst);
int x = 0, y = 0
r1 = x | r2 = y
[fence] | [fence]
y = r1 | x = r2
Recommended answer
Related: my answer on "What formally guarantees that non-atomic variables can't see out-of-thin-air values and create a data race like atomic relaxed theoretically can?" explains in more detail that the formal rules of the C++ relaxed atomic memory model don't exclude "out of thin air" values, although a note in the standard does exclude them. This is a problem only for formal verification of programs using mo_relaxed, not for real implementations. Even non-atomic variables are safe from this, if you avoid undefined behaviour (which you didn't in the code in this question).
You have data-race undefined behaviour on x and y because they're non-atomic variables, so the C++11 standard has absolutely nothing to say about what's allowed to happen.
It would be relevant to look at this for older language standards without a formal memory model, where people did threading anyway using volatile or plain int plus compiler and asm barriers, and where behaviour could depend on compilers working the way you expect in a case like this. But fortunately the bad old days of "happens to work on current implementations" threading are behind us.
Barriers are not helpful here, because there is nothing to create synchronization; as @davmac explains, nothing requires the barriers to "line up" in the global order of operations. Think of a barrier as an operation that makes the current thread wait for some or all of its previous operations to become globally visible; barriers don't directly interact with other threads.
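To illustrate the point about barriers needing something to synchronize through, here is a minimal sketch of the standard fence pairing, where the fences attach to an atomic flag. The names payload, ready, producer, consumer and run_message_passing are hypothetical and not from the question; the question's seq_cst fence would work in the same positions.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int payload = 0;                 // plain, non-atomic data (hypothetical name)
std::atomic<bool> ready{false};  // atomic flag the fences synchronize through

void producer() {
    payload = 42;                                         // plain write
    std::atomic_thread_fence(std::memory_order_release);  // fence before the flag store
    ready.store(true, std::memory_order_relaxed);
}

int consumer() {
    while (!ready.load(std::memory_order_relaxed)) {      // spin until the flag store is seen
        std::this_thread::yield();
    }
    std::atomic_thread_fence(std::memory_order_acquire);  // fence after the flag load
    int seen = payload;          // race-free: the write happens-before this read
    assert(seen == 42);
    return seen;
}

// Hypothetical helper: runs both threads once and returns what the consumer read.
int run_message_passing() {
    payload = 0;
    ready.store(false, std::memory_order_relaxed);
    int seen = -1;
    std::thread t1(producer);
    std::thread t2([&seen] { seen = consumer(); });
    t1.join();
    t2.join();
    return seen;
}
```

Calling run_message_passing() from main should return 42. Without the atomic flag there would be no load that observes a store, so the two fences would have nothing to order against, which is the situation in the question.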
Out-of-thin-air values are one thing that can happen as a result of that undefined behaviour; the compiler is allowed to do software value-prediction on non-atomic variables, and invent writes to objects that will definitely be written anyway. If there was a release-store, or a relaxed store + a barrier, the compiler might not be allowed to invent writes before it, because that could create
In general, from a C++11 language-lawyer perspective, there's nothing you can do to make your program safe (other than a mutex or hand-rolled locking with atomics to prevent one thread from reading x while the other is writing it).
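As a rough sketch of the mutex option, the question's program could keep plain int variables if every access is made under one lock. The mutex m and the helper run_locked are hypothetical names added for illustration.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <utility>

int x = 0, y = 0;  // plain ints, as in the question
std::mutex m;      // hypothetical lock guarding both x and y

void thread1() {
    int r1;
    { std::lock_guard<std::mutex> lock(m); r1 = x; }  // r1 = x
    { std::lock_guard<std::mutex> lock(m); y = r1; }  // y = r1
}

void thread2() {
    int r2;
    { std::lock_guard<std::mutex> lock(m); r2 = y; }  // r2 = y
    { std::lock_guard<std::mutex> lock(m); x = r2; }  // x = r2
}

// Hypothetical helper: runs the two threads once and returns the final (x, y).
std::pair<int, int> run_locked() {
    x = 0;
    y = 0;
    std::thread a(thread1);
    std::thread b(thread2);
    a.join();
    b.join();
    assert(x == 0 && y == 0);  // no thread ever stores a non-zero value
    return {x, y};
}
```

With the lock in place there is no data race, so the accesses occur in some interleaved order and the only possible result is (0, 0).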
Except maybe defeating auto-vectorization and stuff, if you were counting on other uses of this variable being aggressively optimized.
atomic_int x = 0, y = 0
r1 = x.load(mo_relaxed) | r2 = y.load(mo_relaxed)
y.store(r1, mo_relaxed) | x.store(r2, mo_relaxed)
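The table above can be written as a compilable sketch, assuming mo_relaxed is shorthand for std::memory_order_relaxed. The helper run_relaxed is a hypothetical name. The assertion reflects what the standard's note asks of implementations and what real compilers and hardware do, not something the formal rules prove.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <utility>

std::atomic<int> x{0}, y{0};

void thread1() {
    int r1 = x.load(std::memory_order_relaxed);
    y.store(r1, std::memory_order_relaxed);
}

void thread2() {
    int r2 = y.load(std::memory_order_relaxed);
    x.store(r2, std::memory_order_relaxed);
}

// Hypothetical helper: runs the two threads once and returns the final (x, y).
std::pair<int, int> run_relaxed() {
    x.store(0, std::memory_order_relaxed);
    y.store(0, std::memory_order_relaxed);
    std::thread a(thread1);
    std::thread b(thread2);
    a.join();
    b.join();
    int fx = x.load(std::memory_order_relaxed);
    int fy = y.load(std::memory_order_relaxed);
    assert(fx == 0 && fy == 0);  // expected on real implementations; see the hedge above
    return {fx, fy};
}
```

Because x and y are atomic, there is no data race and no undefined behaviour here, unlike the plain int version in the question.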
Value-prediction could speculatively get a future value for r2 into the pipeline before thread 2 sees that value from y, but it can't actually become visible to other threads until the software or hardware knows for sure that the prediction was correct. (That would be inventing a write.)
For example, thread 2 is allowed to compile as:
r2 = y.load(mo_relaxed);
if (r2 == 42) {   // control dependency, not a data dependency
    x.store(42, mo_relaxed);
} else {
    x.store(r2, mo_relaxed);
}
But as I said, x = 42; can't become visible to other threads until it's non-speculative (hardware or software speculation), so value prediction can't invent values that other threads can see. The C++11 standard guarantees that atomics
I don't know / can't think of any mechanism by which a store of 42 could actually be visible to other threads before the y.load saw an actual 42 (i.e. LoadStore reordering of a load with a later dependent store). I don't think the C++ standard formally guarantees that, though. Maybe really aggressive inter-thread optimization, if the compiler can prove that r2 will always be 42 in some cases and remove even the control dependency?
An acquire-load or release-store would definitely be sufficient to block causality violations. This isn't quite mo_consume, because r2 is used as a value, not a pointer.
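A small stress harness can sketch the acquire/release variant. The function count_nonzero_outcomes and its trials parameter are hypothetical names; the loads use std::memory_order_acquire and the stores std::memory_order_release. If either thread read 42 from the other's store, that read would synchronize with the store and place the other thread's load before this thread's store, so the cycle cannot close.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stress harness: runs the two-thread program `trials` times and
// counts the runs that end with anything other than (x, y) = (0, 0).
int count_nonzero_outcomes(int trials) {
    assert(trials > 0);
    int bad = 0;
    for (int i = 0; i < trials; ++i) {
        std::atomic<int> x{0}, y{0};
        std::thread t1([&] {
            int r1 = x.load(std::memory_order_acquire);
            y.store(r1, std::memory_order_release);
        });
        std::thread t2([&] {
            int r2 = y.load(std::memory_order_acquire);
            x.store(r2, std::memory_order_release);
        });
        t1.join();
        t2.join();
        if (x.load() != 0 || y.load() != 0) {
            ++bad;
        }
    }
    return bad;
}
```

A call such as count_nonzero_outcomes(100) should return 0. A stress run like this cannot prove the absence of a behaviour; here the guarantee comes from the happens-before argument, and the loop only demonstrates it.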