Concurrency and memory models


Question


I'm watching this video by Herb Sutter on GPGPU and the new C++ AMP library. He talks about memory models and mentions weak memory models and then strong memory models. I think he's referring to read/write ordering and so on, but I'm not sure.

Google turns up some interesting results on memory models (mostly scientific papers), but can someone explain what a weak memory model is, what a strong memory model is, and how they relate to concurrency?

Answer

In terms of concurrency, a memory model specifies the constraints on data accesses, and the conditions under which data written by one thread/core/processor becomes visible to another.

The terms weak and strong are somewhat ambiguous, but the basic premise is that a strong memory model places a lot of constraints on the hardware to ensure that writes by one thread/core/processor are visible to other threads/cores/processors in clearly-defined orders, whilst allowing the programmer maximum freedom of data access.

On the other hand, a weak model places very few constraints on the hardware, and instead places the responsibility for ensuring visibility in the hands of the programmer.

The strongest memory model is Sequential Consistency: all operations to all data by all processors form a single total order agreed on by all processors, which is consistent with the order of operations on each processor individually. This is essentially an interleaving of the operations of each processor.
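The "single total order" guarantee can be illustrated with the classic store-buffering test. The following is a minimal sketch rather than anything from the original answer; the function name `count_both_zero` and the iteration count are assumptions made for illustration. Each thread writes its own flag and then reads the other's. Under sequential consistency (the default ordering for `std::atomic` operations in C++11), at least one of the two stores must come first in the total order, so both threads can never read 0.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper: runs the store-buffering litmus test `iterations`
// times and returns how often BOTH threads read 0. With sequentially
// consistent atomics that outcome is forbidden, so the count stays 0.
int count_both_zero(int iterations) {
    int both_zero = 0;
    for (int i = 0; i < iterations; ++i) {
        std::atomic<int> x{0}, y{0};
        int r1 = -1, r2 = -1;

        std::thread t1([&] {
            x.store(1);     // memory_order_seq_cst by default
            r1 = y.load();  // also seq_cst
        });
        std::thread t2([&] {
            y.store(1);
            r2 = x.load();
        });
        t1.join();          // join makes r1 and r2 safe to read below
        t2.join();

        if (r1 == 0 && r2 == 0) ++both_zero;
    }
    return both_zero;
}
```

If the same operations used `std::memory_order_relaxed`, the "both zero" outcome would be permitted, and it can be observed in practice even on x86, because each core may hold its store in a store buffer while the following load executes.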

The weakest memory model imposes no restrictions on the order in which processors see each other's writes. Different processors in the same system may see writes in different orders, and a processor may keep using "stale" data from its own cache long after another processor has written to the same memory address. Sometimes whole cache lines are treated as a single unit. In that case, when a processor writes to one variable on a cache line, writes by other processors to other variables on that line that are not yet visible to it can be effectively discarded: when the first processor eventually writes the cache line back to memory, its stale values overwrite them. Under such a scheme, extreme care must be taken to ensure that data is transferred to other processors in the correct order, using explicit synchronization instructions.

For example, the Intel x86 memory model is generally considered to be on the stronger end, as there are strict rules about the order in which writes become visible to other processors, whereas the DEC Alpha and ARM processors are generally considered to have weak memory models, as writes from one processor are only required to be visible to other processors in a particular order if you explicitly put ordering instructions (memory fences or barriers) in your code.
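As a rough illustration of explicit ordering instructions, here is a sketch of the message-passing pattern using C++11 fences. The names `payload`, `ready`, `producer`, `consumer` and `run_message_passing` are invented for this example. The release fence stops the write to `payload` from being reordered after the flag store, and the acquire fence stops the read of `payload` from being reordered before the flag load. On a weakly ordered machine the compiler would typically have to emit real barrier instructions for these fences, whereas on x86 they usually constrain only the compiler.

```cpp
#include <atomic>
#include <thread>

int payload = 0;                 // plain, non-atomic data
std::atomic<bool> ready{false};  // flag that publishes the payload

void producer() {
    payload = 42;
    // Release fence: the payload write cannot move below the flag store.
    std::atomic_thread_fence(std::memory_order_release);
    ready.store(true, std::memory_order_relaxed);
}

int consumer() {
    while (!ready.load(std::memory_order_relaxed)) {
        std::this_thread::yield();  // spin until the flag is set
    }
    // Acquire fence: the payload read cannot move above the flag load.
    std::atomic_thread_fence(std::memory_order_acquire);
    return payload;
}

// Hypothetical driver: resets the state, runs both threads, and returns
// the value the consumer observed.
int run_message_passing() {
    payload = 0;
    ready.store(false);
    int seen = 0;
    std::thread c([&] { seen = consumer(); });
    std::thread p(producer);
    p.join();
    c.join();
    return seen;
}
```

Without the two fences, the relaxed flag operations alone would give no guarantee that the consumer sees the payload write, and reading `payload` would be a data race.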

Some systems have memory that is only accessible by particular processors. Transferring data between these processors therefore requires explicit data transfer instructions. This is the case with the Cell processor, and is often the case with GPUs as well. This can be viewed as an extreme form of a weak memory model: data is only visible to other processors if you explicitly invoke the data transfer.

Programming languages usually impose their own memory models on top of whatever the underlying processors provide. For example, C++11 (still called C++0x when this answer was written) specifies a complete set of ordering constraints ranging from completely relaxed to full sequential consistency, so you can specify in code what you require. Java, on the other hand, has a very specific set of ordering constraints that must be adhered to and cannot be varied. In both cases the compiler must translate the desired constraints into the relevant instructions for the underlying processor, which may be quite involved if you request sequential consistency on a weakly ordered machine.
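To make "specify in code what you require" concrete, the sketch below uses two of the C++11 ordering options. The function names are hypothetical. The first uses `memory_order_relaxed`, which is enough for a plain counter because only atomicity matters there. The second uses a release store paired with an acquire load to publish ordinary data from one thread to another.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Relaxed ordering: each increment is atomic, but it imposes no ordering
// on surrounding memory operations. The final total is still exact.
long count_with_relaxed(int num_threads, int increments_per_thread) {
    std::atomic<long> counter{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < increments_per_thread; ++i) {
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter.load();
}

// Release/acquire ordering: the write to `data` happens-before any read
// that follows an acquire load which observed the release store.
int publish_with_release_acquire() {
    int data = 0;
    std::atomic<bool> flag{false};
    std::thread writer([&] {
        data = 7;
        flag.store(true, std::memory_order_release);
    });
    while (!flag.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    int seen = data;  // guaranteed to observe the write of 7
    writer.join();
    return seen;
}
```

Leaving out the ordering argument altogether selects `memory_order_seq_cst`, the strongest option and generally the most expensive one on weakly ordered hardware.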
