Should Interlocked.CompareExchange also use a volatile variable?


Question

The following example comes from the MSDN.

using System.Threading;

public class ThreadSafe
{
    // Field totalValue contains a running total that can be updated
    // by multiple threads. It must be protected from unsynchronized 
    // access.
    private float totalValue = 0.0F;

    // The Total property returns the running total.
    public float Total { get { return totalValue; }}

    // AddToTotal safely adds a value to the running total.
    public float AddToTotal(float addend)
    {
        float initialValue, computedValue;
        do
        {
            // Save the current running total in a local variable.
            initialValue = totalValue;

            // Add the new value to the running total.
            computedValue = initialValue + addend;

            // CompareExchange compares totalValue to initialValue. If
            // they are not equal, then another thread has updated the
            // running total since this loop started. CompareExchange
            // does not update totalValue. CompareExchange returns the
            // contents of totalValue, which do not equal initialValue,
            // so the loop executes again.
        }
        while (initialValue != Interlocked.CompareExchange(ref totalValue,
            computedValue, initialValue));
        // If no other thread updated the running total, then 
        // totalValue and initialValue are equal when CompareExchange
        // compares them, and computedValue is stored in totalValue.
        // CompareExchange returns the value that was in totalValue
        // before the update, which is equal to initialValue, so the 
        // loop ends.

        // The function returns computedValue, not totalValue, because
        // totalValue could be changed by another thread between
        // the time the loop ends and the function returns.
        return computedValue;
    }
}

Shouldn't totalValue be declared as volatile to get the freshest value possible? I imagine that if you get a dirty value from a CPU cache, the call to Interlocked.CompareExchange should take care of getting the freshest value and cause the loop to try again. Would the volatile keyword potentially save one unnecessary loop?

I guess it isn't 100% necessary to have the volatile keyword, because the method has overloads that take data types such as long, which don't support the volatile keyword.

Solution

No, volatile wouldn't be helpful at all, and certainly not for this reason. It would just give that first read "acquire" semantics instead of effectively relaxed, but either way will compile to similar asm that runs a load instruction.

"if you get a dirty value from a CPU cache"

CPU caches are coherent, so anything you read from CPU cache is the current globally agreed-on value for this line. "Dirty" just means it doesn't match DRAM contents, and will have to get written back if / when it's evicted. A load value can also be forwarded from the store buffer, for a value this thread stored recently that isn't yet globally visible. That's fine: Interlocked methods are full barriers, which also wait for the store buffer to drain.

If you mean stale, then no, that's impossible; cache coherency protocols like MESI prevent that. This is why Interlocked operations like CAS aren't horribly slow if the cache line is already owned by this core (MESI Modified or Exclusive state). See Myths Programmers Believe about CPU Caches, which talks some about Java volatiles, which I think are similar to C# volatile.

This C++11 answer also explains some about cache coherency and asm. (Note that C++11 volatile is significantly different from C#, and doesn't imply any thread-safety or ordering, but does still imply the asm must do a load or a store, not optimize into a register.)


On non-x86, running extra barrier instructions after the initial read (to give that read acquire semantics) before you even try a CAS just slows things down. (On x86, including x86-64, a volatile read compiles to the same asm as a plain read, except that it prevents compile-time reordering.)
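To make the comparison concrete, here is a minimal sketch of the two variants being discussed: the MSDN-style loop with a plain initial read, and the same loop with an acquire read via Volatile.Read. The class and method names (TotalSketch, AddPlain, AddVolatileRead) are made up for illustration and are not part of the MSDN sample. Both versions should produce the same results; the point above is only that the second may cost extra barrier instructions on some non-x86 targets without buying anything.

```csharp
using System.Threading;

// Hypothetical names, for illustration only.
public static class TotalSketch
{
    private static float total = 0.0F;

    // Plain initial read, as in the MSDN example.
    public static float AddPlain(float addend)
    {
        float initial, computed;
        do
        {
            initial = total;                      // ordinary load
            computed = initial + addend;
        }
        while (initial != Interlocked.CompareExchange(ref total, computed, initial));
        return computed;
    }

    // Same loop, but the initial read has acquire semantics.
    // Correctness is unchanged: the CAS still decides whether the update lands.
    public static float AddVolatileRead(float addend)
    {
        float initial, computed;
        do
        {
            initial = Volatile.Read(ref total);   // acquire load
            computed = initial + addend;
        }
        while (initial != Interlocked.CompareExchange(ref total, computed, initial));
        return computed;
    }
}
```

In both methods a stale-looking initial value can only cause the CompareExchange to fail and the loop to retry, so the choice between them is about cost, not correctness.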

A volatile read can't be optimized into just using a value in a register if the current thread just wrote something via a non-interlocked = assignment. That's not helpful either; if we just stored something and remember in a register what we stored, a load that does store-forwarding from the store buffer is morally equivalent to just using the register value.

Most of the good use-cases for lock-free atomics are when contention is lowish, so usually things can succeed without hardware having to wait a long time for access / ownership of the cache line. So you want to make the uncontended case as fast as possible. Avoid volatile even if there was anything to gain from it in highly-contended cases, which I don't think there is anyway.
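As a side note on the "save one unnecessary loop" idea from the question, a common variation (not from the MSDN sample; the names RetrySketch and Add are hypothetical) is to reuse the value that CompareExchange returns when it fails, instead of re-reading the field. A failed CAS already hands back the value it found, so the retry starts from that value without any additional read, volatile or otherwise. A rough sketch, assuming the total is never NaN (as the original loop also implicitly does):

```csharp
using System.Threading;

// Hypothetical names, for illustration only.
public static class RetrySketch
{
    private static float total = 0.0F;

    public static float Add(float addend)
    {
        float observed = total;                   // one plain read up front
        while (true)
        {
            float computed = observed + addend;

            // Returns the value that was in 'total' before the attempt.
            float previous = Interlocked.CompareExchange(ref total, computed, observed);

            if (previous == observed)
            {
                return computed;                  // the CAS succeeded
            }

            // The CAS failed: retry from the value it actually saw.
            observed = previous;
        }
    }
}
```

In the uncontended case this performs one load and one CAS, the same as the original loop, so the fast path is not made any slower.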

If you ever did any plain stores (assignments with =, not interlocked RMW), volatile would have an effect on those, too. If C# volatile gives semantics like C++ memory_order_seq_cst, that might mean waiting for the store buffer to drain before later memory ops in this thread can run. In that case, you'd be slowing down the code involving the stores a lot if you didn't need ordering wrt. other loads/stores. If you did such a store before this CAS code, yeah, you'd be waiting until the store (and all previous stores) were globally visible before reloading it. The reload + CAS that the CPU is waiting to do right after would then be very likely to succeed without spinning, because the CPU will already have ownership of that line, but I think you'd effectively get similar behaviour from the full barrier that's part of an Interlocked CAS.
