Interlocked and Memory Barriers


Question

I have a question about the following code sample (m_value isn't volatile, and every thread runs on a separate processor):

void Foo() // executed by thread #1, BEFORE Bar() is executed
{
   Interlocked.Exchange(ref m_value, 1);
}

bool Bar() // executed by thread #2, AFTER Foo() is executed
{
   return m_value == 1;
}

Does using Interlocked.Exchange in Foo() guarantee that when Bar() is executed, I'll see the value "1" (even if the value already exists in a register or cache line)? Or do I need to place a memory barrier before reading the value of m_value?

Also (unrelated to the original question), is it legal to declare a volatile member and pass it by reference to InterlockedXX methods? The compiler warns about passing volatiles by reference, so should I ignore the warning in that case?

Please note: I'm not looking for "better ways to do things", so please don't post answers that suggest completely different approaches ("use a lock instead", etc.). This question comes out of pure interest.

Recommended answer

The usual pattern for memory barrier usage matches what you would put in the implementation of a critical section, but split into a pair: one half for the producer and one for the consumer. As an example, a critical section implementation would typically be of the form:


while (!pShared->lock.testAndSet_Acquire()) ;
// (this loop should include all the normal critical section stuff like
// spin, waste, 
// pause() instructions, and last-resort-give-up-and-blocking on a resource 
// until the lock is made available.)

// Access to shared memory.

pShared->foo = 1;
v = pShared->goo;

pShared->lock.clear_Release();

The acquire memory barrier above makes sure that any loads (pShared->goo) that may have been started before the successful lock modification are tossed, to be restarted if necessary.

The release memory barrier ensures that the load from goo into the (say, local) variable v is complete before the lock word protecting the shared memory is cleared.
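The critical-section pattern above can be sketched in standard C++ roughly as follows. This is a minimal illustration under stated assumptions, not a production lock: `SpinLock` and `run_spinlock_demo` are hypothetical names, `std::atomic_flag` stands in for the answer's `lock` word, and `yield()` is a crude stand-in for the pause/backoff/blocking logic a real critical section needs. Note that `test_and_set` returns the previous value, so the loop condition is inverted relative to the pseudocode's `testAndSet_Acquire()`.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical spin lock: acquire on the way in, release on the way out.
class SpinLock {
public:
    void lock() {
        // test_and_set returns the PREVIOUS value: true means someone else
        // holds the lock. Acquire ordering keeps the critical section's
        // loads and stores from being moved before the successful set.
        while (flag_.test_and_set(std::memory_order_acquire)) {
            std::this_thread::yield();  // stand-in for pause/backoff/blocking
        }
    }

    void unlock() {
        // Release ordering makes the critical section's accesses complete
        // before the lock word is seen as cleared.
        flag_.clear(std::memory_order_release);
    }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Each worker increments a plain (non-atomic) counter under the lock.
// Returns the final count.
int run_spinlock_demo(int threads, int iterations) {
    SpinLock lock;
    int counter = 0;  // protected only by the lock
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return counter;
}
```

With correct acquire/release pairing, no increment is lost, so `run_spinlock_demo(4, 10000)` should return 40000.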

You have a similar pattern in the typical producer/consumer atomic flag scenario (it is difficult to tell from your sample whether that is what you are doing, but it should illustrate the idea).

Suppose your producer uses an atomic variable to indicate that some other state is ready to use. You'll want something like this:


pShared->goo = 14;

pShared->atomic.setBit_Release();

Without a "write" barrier here in the producer, you have no guarantee that the hardware won't perform the atomic store before the goo store has made it through the CPU store queues and up through the memory hierarchy to where it is visible (even if you have a mechanism that ensures the compiler orders things the way you want).

In the consumer:


if ( pShared->atomic.compareAndSwap_Acquire(1,1) )
{
   v = pShared->goo;
}

Without a "read" barrier here, you won't know that the hardware hasn't gone and fetched goo for you before the atomic access is complete. The atomic (i.e., memory manipulated with the Interlocked functions, which do things like lock cmpxchg) is only "atomic" with respect to itself, not other memory.
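The producer and consumer halves can be put together in standard C++ along these lines. This is a sketch under assumptions: `Shared`, `produce`, `try_consume`, and `run_flag_demo` are hypothetical names, and the consumer uses a plain acquire load of the flag rather than the `compareAndSwap_Acquire(1,1)` call from the pseudocode, since a load is enough to observe the flag.

```cpp
#include <atomic>
#include <thread>

struct Shared {
    int goo = 0;                     // plain payload, not atomic
    std::atomic<bool> ready{false};  // flag that publishes the payload
};

// Producer: write the payload, then set the flag with release ordering so
// the store to goo cannot become visible after the flag does.
void produce(Shared& s) {
    s.goo = 14;
    s.ready.store(true, std::memory_order_release);
}

// Consumer: read the flag with acquire ordering so the load of goo cannot
// be satisfied before the flag has been observed as set.
bool try_consume(Shared& s, int& out) {
    if (s.ready.load(std::memory_order_acquire)) {
        out = s.goo;
        return true;
    }
    return false;
}

// Runs one producer and one polling consumer; returns what the consumer read.
int run_flag_demo() {
    Shared s;
    int v = -1;
    std::thread consumer([&] {
        while (!try_consume(s, v)) {
            std::this_thread::yield();
        }
    });
    std::thread producer([&] { produce(s); });
    producer.join();
    consumer.join();
    return v;
}
```

Because the release store synchronizes with the acquire load that sees `true`, the consumer is guaranteed to read 14 from `goo` once it observes the flag.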

Now, the remaining thing that has to be mentioned is that the barrier constructs are highly unportable. Your compiler probably provides _acquire and _release variations for most of the atomic manipulation methods, and these are the sorts of ways you would use them. Depending on the platform you are using (e.g., ia32), these may very well be exactly what you would get without the _acquire() or _release() suffixes. Platforms where this matters are ia64 (effectively dead except on HP, where it's still twitching slightly) and powerpc. ia64 had .acq and .rel instruction modifiers on most load and store instructions (including the atomic ones like cmpxchg). powerpc has separate instructions for this (isync and lwsync give you the read and write barriers respectively).
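One way to sidestep the per-platform instruction details, assuming a C++11 or later toolchain, is to state the barriers through the language's memory model and let the compiler pick whatever the target needs. The sketch below uses standalone fences around relaxed atomic accesses to make the "write" and "read" barriers explicit; `Channel`, `publish`, `wait_and_read`, and `run_fence_demo` are hypothetical names introduced for illustration.

```cpp
#include <atomic>
#include <thread>

struct Channel {
    int payload = 0;           // plain data being handed over
    std::atomic<int> flag{0};  // set to 1 once payload is ready
};

void publish(Channel& c, int value) {
    c.payload = value;
    // "Write" barrier: earlier stores are ordered before the flag store.
    std::atomic_thread_fence(std::memory_order_release);
    c.flag.store(1, std::memory_order_relaxed);
}

int wait_and_read(Channel& c) {
    while (c.flag.load(std::memory_order_relaxed) == 0) {
        std::this_thread::yield();
    }
    // "Read" barrier: later loads are ordered after the flag load.
    std::atomic_thread_fence(std::memory_order_acquire);
    return c.payload;
}

// Runs one reader and one writer; returns the value the reader observed.
int run_fence_demo() {
    Channel c;
    int seen = -1;
    std::thread reader([&] { seen = wait_and_read(c); });
    std::thread writer([&] { publish(c, 42); });
    writer.join();
    reader.join();
    return seen;
}
```

The source code is the same on every target; which hardware instructions (if any) the fences turn into is left to the compiler.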

Now, having said all this: do you really have a good reason for going down this path? Doing all this correctly can be very difficult. Be prepared for a lot of self-doubt and insecurity in code reviews, and make sure you have a lot of high-concurrency testing with all sorts of random timing scenarios. Use a critical section unless you have a very, very good reason to avoid it, and don't write that critical section yourself.
