Can atomic loads be merged in the C++ memory model?


Question

Consider the C++11 snippet below. With GCC and clang it compiles to two (sequentially consistent) loads of foo. (Editor's note: compilers do not optimize atomics; see this Q&A for more details, especially the http://wg21.link/n4455 standards discussion of the problems such optimization could create, which the standard gives programmers no tools to work around. This language-lawyer Q&A is about the current standard, not about what compilers do.)

Does the C++ memory model allow the compiler to merge these two loads into a single load and use the same value for x and y?

(Editor's note: this is something the standards group is working on: http://wg21.link/n4455 and http://wg21.link/p0062. The current standard, as written, allows behaviours that are undesirable.)

I think it cannot merge these loads, because that would mean polling an atomic no longer works, but I cannot find the relevant part of the memory model documentation.

#include <atomic>
#include <cstdio>

std::atomic<int> foo;

int main(int argc, char **argv)
{
    int x = foo;
    int y = foo;

    printf("%d %d\n", x, y);
    return 0;
}

Recommended answer

Yes, because we cannot observe the difference!

An implementation is allowed to turn your snippet into the following (pseudo-implementation).

int __loaded_foo = foo;

int x = __loaded_foo;
int y = __loaded_foo;

The reason is that, given the guarantees of sequential consistency, you have no way to observe the difference between the above and two separate loads of foo.
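To make the comparison concrete, here is a minimal sketch that writes both forms out by hand. The function names two_loads and merged_load are made up for illustration, and the explicit std::memory_order_seq_cst argument only spells out what `int x = foo;` already means. With no concurrent writer the two functions necessarily return the same pair; the point of the answer is that with a concurrent writer the merged result is still one of the results the two-load version could have produced.

```cpp
#include <atomic>
#include <cassert>
#include <utility>

std::atomic<int> foo{0};

// What the question's snippet asks for: two sequentially consistent loads.
// `int x = foo;` is shorthand for foo.load(std::memory_order_seq_cst).
std::pair<int, int> two_loads() {
    int x = foo.load(std::memory_order_seq_cst);
    int y = foo.load(std::memory_order_seq_cst);
    return {x, y};
}

// Hand-written equivalent of the "merged" pseudo-implementation:
// one load, whose result is used for both x and y.
std::pair<int, int> merged_load() {
    int loaded = foo.load(std::memory_order_seq_cst);
    return {loaded, loaded};
}
```

Any pair returned by merged_load() has x == y, which is also a legal result of two_loads() whenever no write lands between its two loads.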

Note: It is not just the compiler that can make such an optimization. The processor can likewise reason that you cannot observe the difference and load the value of foo once, even though the compiler might have asked it to do so twice.



Given a thread that keeps updating foo incrementally, what you are guaranteed is that y will hold either the same value as x or a later-written one.

// thread 1 - The Writer
while (true) {
  foo += 1;
}

// thread 2 - The Reader
while (true) {
  int x = foo;
  int y = foo;

  // Never fires until foo wraps around. Atomic arithmetic wraps on overflow
  // (it is not UB), so y < x only becomes possible after a wrap.
  assert (y >= x);
}

Imagine that the writing thread for some reason pauses its execution on every iteration (because of a context switch or some other implementation-defined reason). You cannot prove whether that pause is what causes x and y to have the same value, or whether it is the result of a "merge optimization".


In other words, the code in this section has two potential outcomes:

  1. No new value is written to foo between the two reads (x == y).
  2. A new value is written to foo between the two reads (x < y).

Since either of the two can happen, an implementation is free to narrow the possibilities down and always execute just one of them; we can in no way observe the difference.
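The two outcomes above can be counted with a small runnable sketch. The Outcomes struct and the observe_pairs helper are hypothetical names introduced here, and the counter is a 64-bit unsigned integer rather than the int used above, purely so that wraparound cannot occur during the run. How the iterations split between "equal" and "advanced" depends on scheduling and is not predictable; the only thing the sketch relies on is that y never lags behind x.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Tally of what the reader saw across all iterations.
struct Outcomes {
    long equal = 0;      // x == y: no write landed between the two loads
    long advanced = 0;   // x <  y: at least one write landed in between
    long regressed = 0;  // y <  x: must stay zero
};

Outcomes observe_pairs(long iterations) {
    std::atomic<std::uint64_t> foo{0};  // assumption: 64-bit to avoid wrap
    std::atomic<bool> stop{false};

    // The writer keeps incrementing foo until the reader is done.
    std::thread writer([&] {
        while (!stop.load()) {
            foo.fetch_add(1);
        }
    });

    // The reader performs two seq_cst loads per iteration and classifies them.
    Outcomes out;
    for (long i = 0; i < iterations; ++i) {
        std::uint64_t x = foo.load();
        std::uint64_t y = foo.load();
        if (y == x) {
            ++out.equal;
        } else if (y > x) {
            ++out.advanced;
        } else {
            ++out.regressed;
        }
    }

    stop.store(true);
    writer.join();
    return out;
}
```

A run in which every iteration lands in `equal` is indistinguishable, from inside the program, from one in which the implementation merged the two loads.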



An implementation can make whatever changes it wants, as long as we cannot observe any difference between the behavior we expressed and the behavior during execution.

This is covered in [intro.execution]p1:

The semantic descriptions in this International Standard define a parameterized nondeterministic abstract machine. This International Standard places no requirement on the structure of conforming implementations. In particular, they need not copy or emulate the structure of the abstract machine. Rather, conforming implementations are required to emulate (only) the observable behavior of the abstract machine as explained below.

Another section makes it even clearer, [intro.execution]p5:

A conforming implementation executing a well-formed program shall produce the same observable behavior as one of the possible executions of the corresponding instance of the abstract machine with the same program and the same input.




// initial state
std::atomic<int> foo{0};  // brace-init: `= 0` is ill-formed before C++17

// thread 1
while (true) {
  if (foo)
    break;
}

// thread 2
foo = 1;

Question: Given the reasoning in the previous sections, could an implementation simply read foo once in thread 1, and then never break out of the loop even if thread 2 writes to foo?

Answer: No.

In a sequentially consistent environment we are guaranteed that a write to foo in thread 2 will become visible in thread 1; this means that once that write has happened, thread 1 must observe the change of state.

Note: An implementation can turn two reads into a single one because we cannot observe the difference (one fence is just as effective as two), but it cannot completely disregard a read that exists by itself.

Note: The contents of this section are guaranteed by [atomics.order]p3-4.
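The polling pattern from this section can be written as a small runnable sketch. The helper name poll_until_set is invented for the example, and std::this_thread::yield() is added only to be polite to the scheduler; it is not needed for correctness. The loop performs a fresh load on every iteration, so once the store of 1 is visible the poller leaves the loop.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Starts a polling thread, then stores 1 from the calling thread.
// Returns the value the poller saw when it left its loop.
int poll_until_set() {
    std::atomic<int> foo{0};
    int seen = 0;  // written by the poller, read only after join()

    // "thread 1": poll foo until it becomes non-zero.
    std::thread poller([&] {
        int v;
        while ((v = foo.load()) == 0) {
            std::this_thread::yield();
        }
        seen = v;
    });

    // "thread 2": the single write the poller is waiting for.
    foo.store(1);

    poller.join();  // join() orders the poller's write to `seen` before our read
    return seen;
}
```

If an implementation hoisted the load out of the loop and reused a stale 0 forever, poll_until_set() would never return, which is the behavior the answer says is not allowed.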



What if I really want to prevent this form of "optimization"?

You may want to force the implementation to actually read the value of a variable at every point where you have written a read. In practice, however, compilers don't optimize atomics, and the standards group has recommended against using volatile atomic for this kind of reason until the dust settles on this issue. See

  • http://wg21.link/n4455
  • http://wg21.link/p0062
  • Why don't compilers merge redundant std::atomic writes?
  • and a duplicate of this question, Can and does the compiler optimize out two atomic loads?
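For reference only, this is what the volatile atomic spelling mentioned above looks like in code. The names g_foo and load_twice are hypothetical. The sketch shows that std::atomic<int> provides volatile-qualified load and store members, so the declaration compiles and behaves like an ordinary atomic; it does not establish that any particular compiler treats it differently from a plain std::atomic, and the note above advises against relying on it.

```cpp
#include <atomic>
#include <cassert>
#include <utility>

// Hypothetical global declared as a volatile atomic.
volatile std::atomic<int> g_foo{0};

// Two loads written out in source; std::atomic<int>::load has a
// volatile-qualified overload, so this is well-formed.
std::pair<int, int> load_twice() {
    int x = g_foo.load();
    int y = g_foo.load();
    return {x, y};
}
```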
