Is a function call a memory barrier?

Question

Consider this C code:

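#include <stddef.h>   /* size_t */
#include <string.h>   /* memcpy */

extern volatile int hardware_reg;

void f(const void *src, size_t len)
{
    void *dst = <something>;   /* placeholder for the destination buffer */

    hardware_reg = 1;
    memcpy(dst, src, len);
    hardware_reg = 0;
}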

The memcpy() call must occur between the two assignments. In general, since the compiler probably doesn't know what the called function will do, it can't move the call before or after the assignments. In this case, however, the compiler knows what the function does (it could even substitute an inline built-in), and it can deduce that memcpy() could never access hardware_reg. So it appears to me that the compiler would see no obstacle to moving the memcpy() call if it wanted to.

So, the question: is a function call alone enough to act as a memory barrier that prevents reordering, or is an explicit memory barrier needed here before and after the call to memcpy()?

Please correct me if I am misunderstanding things.

Solution

The compiler cannot move the memcpy() operation before hardware_reg = 1 or after hardware_reg = 0; that is what volatile ensures, at least as far as the instruction stream the compiler emits is concerned. A function call is not necessarily a 'memory barrier', but it is a sequence point.

The C99 standard says this about volatile (5.1.2.3/5 "Program execution"):

At sequence points, volatile objects are stable in the sense that previous accesses are complete and subsequent accesses have not yet occurred.

So at the sequence point represented by the memcpy() call, the volatile access that writes 1 has to have occurred, and the volatile access that writes 0 cannot have occurred yet.
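The guarantee quoted above is stated in terms of volatile accesses. One way to bring the copy itself under it is to perform the copy through volatile-qualified pointers, so that every load and store of the copy is also a volatile access. The following is only a sketch under assumptions: the names volatile_copy and f_volatile are hypothetical, hardware_reg is defined locally as a stand-in for the real register, the destination is passed in as a parameter instead of the question's <something>, and it relies on compilers treating accesses through volatile lvalues as volatile accesses, which they do in practice. A byte-wise loop like this is typically slower than memcpy().

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the device register from the question (hypothetical
   definition; the question only declares it extern). */
volatile int hardware_reg;

/* Hypothetical helper: byte-wise copy through volatile-qualified pointers.
   Every load and store in the loop is a volatile access. */
static void volatile_copy(volatile void *dst, const volatile void *src,
                          size_t len)
{
    volatile unsigned char *d = dst;
    const volatile unsigned char *s = src;

    while (len--) {
        *d++ = *s++;
    }
}

/* Variant of the question's f() with the destination passed in. */
void f_volatile(void *dst, const void *src, size_t len)
{
    assert(dst != NULL && src != NULL);

    hardware_reg = 1;              /* volatile write */
    volatile_copy(dst, src, len);  /* volatile reads and writes */
    hardware_reg = 0;              /* volatile write */
}
```

Because the copy now consists of volatile accesses, the compiler also should not drop it when the destination buffer is otherwise unused.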

However, there are 2 things I'd like to point out:

  1. Depending on what <something> is, if nothing else is done with the destination buffer, the compiler might be able to remove the memcpy() operation completely. This is the reason Microsoft came up with the SecureZeroMemory() function. SecureZeroMemory() operates on volatile-qualified pointers to prevent the writes from being optimized away.

  2. volatile doesn't necessarily imply a memory barrier (which is a hardware matter, not just a code-ordering matter), so if you're running on a multiprocessor machine or on certain types of hardware you may need to invoke a memory barrier explicitly (perhaps wmb() on Linux).

    Starting with MSVC 8 (VS 2005), Microsoft documents that the volatile keyword implies the appropriate memory barrier, so a separate specific memory barrier call may not be necessary:

    Also, when optimizing, the compiler must maintain ordering among references to volatile objects as well as references to other global objects. In particular,

    • A write to a volatile object (volatile write) has Release semantics; a reference to a global or static object that occurs before a write to a volatile object in the instruction sequence will occur before that volatile write in the compiled binary.

    • A read of a volatile object (volatile read) has Acquire semantics; a reference to a global or static object that occurs after a read of volatile memory in the instruction sequence will occur after that volatile read in the compiled binary.
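Where no such compiler-specific guarantee is available, the explicit barrier mentioned in point 2 could look roughly like the sketch below. It is an assumption-laden illustration, not a definitive implementation: it uses the C11 atomic_thread_fence() from <stdatomic.h> as a portable stand-in for a platform primitive such as wmb(), the name f_fenced is hypothetical, hardware_reg is defined locally so the example links on its own, and the destination is passed in as a parameter. GCC's documentation, for example, notes that accesses to non-volatile objects are not ordered with respect to volatile accesses, so a conservative version makes the ordering explicit. Whether a C11 fence is sufficient for a real device register depends on the platform; driver code normally uses the barrier macros its environment provides.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Stand-in definition so the sketch links on its own; the question only
   declares the register as extern. */
volatile int hardware_reg;

/* Hypothetical name: the question's f(), with dst passed in and explicit
   fences around the copy. */
void f_fenced(void *dst, const void *src, size_t len)
{
    assert(dst != NULL && src != NULL);

    hardware_reg = 1;
    /* Full fence: the copy below should not be moved above this point. */
    atomic_thread_fence(memory_order_seq_cst);

    memcpy(dst, src, len);

    /* Full fence: the copy above should not be moved below this point. */
    atomic_thread_fence(memory_order_seq_cst);
    hardware_reg = 0;
}
```

With MSVC's documented volatile semantics quoted above, the fences may be redundant; they are included here for compilers that make no such promise.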
