Is a function call a memory barrier?
Problem description
Consider this C code:

    extern volatile int hardware_reg;

    void f(const void *src, size_t len)
    {
        void *dst = <something>;

        hardware_reg = 1;
        memcpy(dst, src, len);
        hardware_reg = 0;
    }
The memcpy() call must occur between the two assignments. In general, since the compiler probably doesn't know what the called function will do, it can't move the call to before or after the assignments. In this case, however, the compiler knows what the function does (it could even substitute an inline built-in), so it can deduce that memcpy() never accesses hardware_reg. It seems to me that the compiler would see no problem in moving the memcpy() call if it wanted to.

So, the question: is a function call alone enough to act as a memory barrier that prevents reordering, or is an explicit memory barrier needed before and after the call to memcpy() in this case?

Please correct me if I am misunderstanding things.
Solution

The compiler cannot reorder the memcpy() operation to before hardware_reg = 1 or to after hardware_reg = 0. That is what volatile ensures, at least as far as the instruction stream the compiler emits is concerned. A function call is not necessarily a "memory barrier", but it is a sequence point.

The C99 standard says this about volatile (5.1.2.3/5 "Program execution"):

"At sequence points, volatile objects are stable in the sense that previous accesses are complete and subsequent accesses have not yet occurred."

So at the sequence point represented by the memcpy() call, the volatile access writing 1 has to have occurred, and the volatile access writing 0 cannot have occurred.

However, there are 2 things I'd like to point out:
First, depending on what <something> is, if nothing else is done with the destination buffer, the compiler might be able to remove the memcpy() operation completely. This is the reason Microsoft came up with the SecureZeroMemory() function. SecureZeroMemory() operates on volatile-qualified pointers to prevent the writes from being optimized away.
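The volatile-qualified-pointer idea can be sketched in portable C. The helper name zero_volatile below is hypothetical and chosen only for illustration; it is not Microsoft's implementation of SecureZeroMemory(). It shows only the general technique: every store goes through a volatile-qualified lvalue, so the compiler is expected to treat each one as an access it must perform.

```c
#include <stddef.h>

/* Hypothetical helper (not the real SecureZeroMemory): clears len bytes
   starting at ptr. Each store is made through a volatile-qualified
   lvalue, so the compiler should not drop the writes even if the
   buffer is never read again. */
void *zero_volatile(void *ptr, size_t len)
{
    volatile unsigned char *p = ptr;

    while (len--) {
        *p++ = 0;
    }
    return ptr;
}
```

A caller would typically use it on a buffer that is about to go out of scope, for example zero_volatile(key, sizeof key), where a plain memset() might be removed as a dead store.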
Second, volatile doesn't necessarily imply a memory barrier (which is a hardware thing, not just a code-ordering thing), so if you're running on a multiprocessor machine or on certain types of hardware you may need to invoke a memory barrier explicitly (perhaps wmb() on Linux).

Starting with MSVC 8 (VS 2005), Microsoft documents that the volatile keyword implies the appropriate memory barrier, so a separate memory barrier call may not be necessary:

"Also, when optimizing, the compiler must maintain ordering among references to volatile objects as well as references to other global objects. In particular,

- A write to a volatile object (volatile write) has Release semantics; a reference to a global or static object that occurs before a write to a volatile object in the instruction sequence will occur before that volatile write in the compiled binary.
- A read of a volatile object (volatile read) has Acquire semantics; a reference to a global or static object that occurs after a read of volatile memory in the instruction sequence will occur after that volatile read in the compiled binary."
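For code that has to build with compilers other than MSVC, one option is to state the ordering explicitly instead of relying on volatile alone. This matters because not every compiler promises the ordering described above: GCC's documentation, for instance, says that accesses to non-volatile objects are not ordered with respect to volatile accesses. The following is a minimal sketch, assuming a C11 compiler that provides <stdatomic.h>. The buffer dst_buf is a hypothetical stand-in for the elided <something>, and hardware_reg is defined locally only so that the example links. Strictly speaking, the standard defines fences in terms of atomic operations, so whether this is sufficient for a real device register depends on the compiler and the platform.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Stand-in definition so the sketch links on its own; in the question
   this is an extern that refers to a real hardware register. */
volatile int hardware_reg;

/* Hypothetical destination replacing the elided <something>. */
static unsigned char dst_buf[64];

void f(const void *src, size_t len)
{
    assert(len <= sizeof dst_buf);

    hardware_reg = 1;
    /* Ask that the copy is not moved above the first register write. */
    atomic_thread_fence(memory_order_seq_cst);

    memcpy(dst_buf, src, len);

    /* Ask that the copy is finished before the second register write. */
    atomic_thread_fence(memory_order_seq_cst);
    hardware_reg = 0;
}
```

If only compiler reordering is a concern and no hardware fence instruction is wanted, atomic_signal_fence() from the same header is the lighter alternative; kernel code would more likely use the platform's own primitives, such as the wmb() mentioned above.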