C++0x memory model and speculative loads/stores


Question

So I was reading about the memory model that is part of the upcoming C++0x standard. However, I'm a bit confused about some of the restrictions for what the compiler is allowed to do, specifically about speculative loads and stores.

To start with, some of the relevant stuff:

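Hans Boehm's page about threads and the memory model in C++0x

Boehm, "Threads Cannot be Implemented as a Library"

Boehm and Adve, "Foundations of the C++ Concurrency Memory Model"

Sutter, "Prism: A Principle-Based Sequential Memory Model for Microsoft Native Code Platforms", N2197

Boehm, "Concurrency memory model compiler consequences", N2338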

Now, the basic idea is essentially "Sequential Consistency for Data-Race-Free Programs", which seems to be a decent compromise between ease of programming and allowing the compiler and hardware opportunities to optimize. A data race is defined to occur if two accesses to the same memory location by different threads are not ordered, at least one of them stores to the memory location, and at least one of them is not a synchronization action. It implies that all read/write access to shared data must be via some synchronization mechanism, such as mutexes or operations on atomic variables (well, it is possible to operate on the atomic variables with relaxed memory ordering for experts only, but the default provides for sequential consistency).
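
As a rough illustration of what "data-race-free" means in practice, here is a minimal sketch in which two threads update shared state only through a mutex or an atomic variable. The names `Totals`, `count_without_races` and `check_counts` are made up for this example and do not come from any of the papers above.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical result type: the two totals produced by the demo below.
struct Totals { long guarded; long atomic_total; };

// Two threads each perform `iterations` increments on two shared counters.
// Every access goes through a synchronization mechanism, so there is no data race.
Totals count_without_races(int iterations) {
    long guarded = 0;                   // ordinary variable, only touched while holding m
    std::mutex m;
    std::atomic<long> atomic_total{0};  // operations default to sequential consistency

    auto work = [&] {
        for (int i = 0; i < iterations; ++i) {
            {
                std::lock_guard<std::mutex> lock(m);
                ++guarded;
            }
            atomic_total.fetch_add(1);
        }
    };
    std::thread a(work), b(work);
    a.join();
    b.join();  // join() orders the threads' writes before the reads below
    return {guarded, atomic_total.load()};
}

// Usage: both counters must end at exactly 2 * iterations.
void check_counts() {
    Totals t = count_without_races(10000);
    assert(t.guarded == 20000 && t.atomic_total == 20000);
}
```

Dropping the `lock_guard` around `++guarded` would turn this into a program with a data race, and the model would then promise nothing about the result.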

In light of this, I'm confused about the restrictions on spurious or speculative loads/stores on ordinary shared variables. For instance, in N2338 we have the example

switch (y) {
    case 0: x = 17; w = 1; break;
    case 1: x = 17; w = 3; break;
    case 2: w = 9; break;
    case 3: x = 17; w = 1; break;
    case 4: x = 17; w = 3; break;
    case 5: x = 17; w = 9; break;
    default: x = 17; w = 42; break;
}

which the compiler is not allowed to transform into

tmp = x; x = 17;
switch (y) {
    case 0: w = 1; break;
    case 1: w = 3; break;
    case 2: x = tmp; w = 9; break;
    case 3: w = 1; break;
    case 4: w = 3; break;
    case 5: w = 9; break;
    default: w = 42; break;
}

since if y == 2 there is a spurious write to x which could be a problem if another thread were concurrently updating x. But why is this a problem? This is a data race, which is prohibited anyway; in this case, the compiler just makes it worse by writing to x twice, but even a single write would be enough for a data race, no? I.e. a proper C++0x program would need to synchronize access to x, in which case there would no longer be a data race, and the spurious store wouldn't be a problem either?

I'm similarly confused about Example 3.1.3 in N2197 and some of the other examples as well, but maybe an explanation for the above issue would explain that too.

Edit: Answer:

The reason why speculative stores are a problem is that in the switch statement example above, the programmer might have elected to conditionally acquire the lock protecting x only if y != 2. Hence the speculative store might introduce a data race that was not there in the original code, and the transformation is thus forbidden. The same argument applies to Example 3.1.3 in N2197 as well.
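
To make that concrete, here is a sketch of what such a program might look like. The lock `x_mutex` and the function names `update` and `check_update` are assumptions made for illustration; N2338 does not spell out this code. The variable w is made a return value here so that x is the only shared variable in play.

```cpp
#include <cassert>
#include <mutex>

int x = 0;           // ordinary shared variable from the example
std::mutex x_mutex;  // hypothetical lock that protects x

// Takes the lock only on the paths that actually write x.
// When y == 2 the function neither locks nor touches x.
int update(int y) {
    std::unique_lock<std::mutex> lock(x_mutex, std::defer_lock);
    if (y != 2) lock.lock();
    int w = 0;
    switch (y) {
        case 0: x = 17; w = 1; break;
        case 1: x = 17; w = 3; break;
        case 2: w = 9; break;            // no access to x, so no lock is needed
        case 3: x = 17; w = 1; break;
        case 4: x = 17; w = 3; break;
        case 5: x = 17; w = 9; break;
        default: x = 17; w = 42; break;
    }
    return w;  // lock (if held) is released when `lock` goes out of scope
}

// Usage: the y == 2 path leaves x exactly as it was.
void check_update() {
    x = 5;
    assert(update(2) == 9 && x == 5);
    assert(update(0) == 1 && x == 17);
}
```

As written, this code is race-free as long as every other thread also holds `x_mutex` while accessing x. If the compiler hoisted `x = 17` above the switch and restored x in case 2, the y == 2 path would write x without the lock, creating a race that the source program does not contain.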

Recommended answer

I'm not familiar with all the stuff you refer to, but notice that in the y==2 case, in the first bit of code, x is not written to at all (or read, for that matter). In the second bit of code, it is written twice. This is more of a difference than just writing once vs. writing twice (at least, it is in existing threading models such as pthreads). Also, storing a value which would not otherwise be stored at all is more of a difference than just storing once vs. storing twice. For both these reasons, you don't want compilers just replacing a no-op with tmp = x; x = 17; x = tmp;.

Suppose thread A wants to assume that no other thread modifies x. It's reasonable to want it to be allowed to expect that if y is 2, and it writes a value to x, then reads it back, it will get back the value it has written. But if thread B is concurrently executing your second bit of code, then thread A could write to x and later read it, and get back the original value, because thread B saved "before" the write and restored "after" it. Or it could get back 17, because thread B stored 17 "after" the write, and stored tmp back again "after" thread A reads. Thread A can do whatever synchronisation it likes, and it won't help, because thread B isn't synchronised. The reason it's not synchronised (in the y==2 case) is that it's not using x. So the concept of whether a particular bit of code "uses x" is important to the threading model, which means compilers can't be allowed to change code to use x when it "shouldn't".
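
The two outcomes described above can be replayed deterministically. The following sketch does not use real threads; it simply executes, on one thread, the statements of A and B in two orders that the transformed code would permit when y == 2. The function names are invented for this illustration.

```cpp
#include <cassert>

// Interleaving 1: B saves x before A's write and restores it afterwards,
// so A reads back the original value instead of what it wrote.
int replay_lost_update(int initial_x, int value_written_by_a) {
    int x = initial_x;
    int tmp = x;             // B: tmp = x
    x = 17;                  // B: x = 17
    x = value_written_by_a;  // A: writes x
    x = tmp;                 // B: case 2 restores the stale value
    return x;                // A: reads x back
}

// Interleaving 2: B stores 17 after A's write and restores only after A's read,
// so A reads back 17.
int replay_sees_17(int initial_x, int value_written_by_a) {
    int x = initial_x;
    int tmp = x;             // B: tmp = x
    x = value_written_by_a;  // A: writes x
    x = 17;                  // B: x = 17
    int seen = x;            // A: reads x back
    x = tmp;                 // B: case 2 restores the saved value
    return seen;
}

// Usage: A wrote 99 in both cases but never sees 99.
void check_replays() {
    assert(replay_lost_update(0, 99) == 0);
    assert(replay_sees_17(0, 99) == 17);
}
```

In the original, untransformed code, thread B executes none of the statements marked "B" when y == 2, so A would always read back 99.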

In short, if the transformation you propose were allowed, introducing a spurious write, then it would never be possible to analyse a bit of code and conclude that it does not modify x (or any other memory location). There are a number of convenient idioms which would therefore be impossible, such as sharing immutable data between threads without synchronisation.
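
As a small example of the immutable-data idiom, the sketch below lets two threads read the same table concurrently with no lock at all. This is only safe because nothing writes to the table while the threads run, which is exactly the guarantee a spurious store would break. The names `sum_from_two_threads` and `check_sum` are hypothetical.

```cpp
#include <cassert>
#include <numeric>
#include <thread>
#include <vector>

// Both threads only read `table`; each writes its result to its own variable.
// Concurrent reads of an unmodified object do not constitute a data race.
long sum_from_two_threads(const std::vector<int>& table) {
    long first = 0;
    long second = 0;
    std::thread a([&] { first = std::accumulate(table.begin(), table.end(), 0L); });
    std::thread b([&] { second = std::accumulate(table.begin(), table.end(), 0L); });
    a.join();
    b.join();  // the results are read only after both threads have finished
    return first + second;
}

// Usage: each thread sums 1 + 2 + 3, so the combined total is 12.
void check_sum() {
    assert(sum_from_two_threads({1, 2, 3}) == 12);
}
```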

So, although I'm not familiar with C++0x's definition of "data race", I assume that it includes some conditions where programmers are allowed to assume that an object is not written to, and that this transformation would violate those conditions. I speculate that if y==2, then your original code, together with concurrent code: x = 42; x = 1; z = x in another thread, is not defined to be a data race. Or at least if it is a data race, it's not one which permits z to end up with the value 17 or 42.

Consider that in this program, the value 2 in y might be used to indicate, "there are other threads running: don't modify x, because we aren't synchronised here, so that would introduce a data race". Perhaps the reason there's no synchronisation at all, is that in all other cases of y, there are no other threads running with access to x. It seems reasonable to me that C++0x would want to support code like this:

if (single_threaded) {
    x = 17;
} else {
    sendMessageThatSafelySetsXTo(17);
}

Clearly then, you don't want that transformed to:

tmp = x;
x = 17;
if (!single_threaded) {
    x = tmp;
    sendMessageThatSafelySetsXTo(17);
}

Which is basically the same transformation as in your example, but with only 2 cases, instead of there being enough to make it look like a good code-size optimisation.
