How do you use the pause assembly instruction in 64-bit C++ code?
Question
Since VC++ 2010 does not support inline assembly in 64-bit code, how do I get the x86-64 pause instruction into my code? There does not appear to be an intrinsic for it, as there is for many other common assembly instructions (e.g., __rdtsc(), __cpuid(), etc.).
As for why: I want the instruction to help with a busy-wait use case, so that the (hyperthreaded) CPU remains available to other threads running on the same CPU (see: Performance Insights at intel.com). The pause instruction is very helpful for this use case, as well as for spin-lock implementations, and I can't understand why MS did not include it as an intrinsic.
Thanks.
Recommended answer
Wow, this was a very hard problem to track down, but in case anybody else needs the x86-64 pause instruction:
The YieldProcessor() macro from windows.h expands to the undocumented _mm_pause intrinsic, which ultimately compiles to the pause instruction in both 32-bit and 64-bit code.
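For readers who want to see the busy-wait use case from the question, here is a minimal sketch of a spin lock that pauses between polls. It calls _mm_pause directly, which is declared in <emmintrin.h> on MSVC, GCC, and Clang for x86 and x86-64. The names cpu_relax and SpinLock are hypothetical, and the sketch assumes a C++11 compiler with <atomic>, which VC++ 2010 itself does not ship. On Windows, ::YieldProcessor() could stand in for cpu_relax().

```cpp
#include <atomic>
#include <emmintrin.h>  // declares _mm_pause (SSE2) for x86 and x86-64

// Hypothetical helper: emits a single pause instruction.
inline void cpu_relax() { _mm_pause(); }

// Illustrative test-and-test-and-set spin lock; not tuned for production use.
// Assumes a C++11 standard library (std::atomic).
class SpinLock {
public:
    void lock() {
        for (;;) {
            // exchange() returns the previous value; false means we took the lock.
            if (!locked_.exchange(true, std::memory_order_acquire)) {
                return;
            }
            // Poll with plain loads and pause between polls, so a sibling
            // hyperthread can use the core's execution resources meanwhile.
            while (locked_.load(std::memory_order_relaxed)) {
                cpu_relax();
            }
        }
    }

    // Single attempt; returns true if the lock was acquired.
    bool try_lock() {
        return !locked_.exchange(true, std::memory_order_acquire);
    }

    void unlock() { locked_.store(false, std::memory_order_release); }

private:
    std::atomic<bool> locked_{false};
};
```

A caller would wrap a short critical section in lock() and unlock(). For long waits, a blocking primitive is usually a better fit than spinning.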
This is completely undocumented, by the way; MSDN carries only partial documentation for YieldProcessor(), and it is incorrect for VC++ 2010.
Here is an example of what a block of YieldProcessor() macros compiles into:
19: ::YieldProcessor();
000000013FDB18A0 F3 90 pause
20: ::YieldProcessor();
000000013FDB18A2 F3 90 pause
21: ::YieldProcessor();
000000013FDB18A4 F3 90 pause
22: ::YieldProcessor();
000000013FDB18A6 F3 90 pause
23: ::YieldProcessor();
000000013FDB18A8 F3 90 pause
By the way, each pause instruction seems to produce a delay of about 9 cycles on average on the Nehalem architecture (i.e., roughly 3 ns on a 3.3 GHz CPU).
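A figure like this could be checked with a rough measurement along the following lines. This is only a sketch: the helper names average_pause_ticks and cycles_to_ns are hypothetical, the result includes loop overhead, and on many CPUs the time-stamp counter ticks at a fixed reference rate rather than at the current core clock, so ticks only approximate core cycles. The iteration count is assumed to be nonzero.

```cpp
#include <cstdint>
#include <emmintrin.h>   // _mm_pause
#ifdef _MSC_VER
#include <intrin.h>      // __rdtsc on MSVC
#else
#include <x86intrin.h>   // __rdtsc on GCC and Clang
#endif

// Hypothetical helper: average time-stamp-counter ticks per pause,
// measured over `iterations` back-to-back pause instructions.
// Includes loop overhead; `iterations` must be nonzero.
inline double average_pause_ticks(std::uint64_t iterations) {
    const std::uint64_t start = __rdtsc();
    for (std::uint64_t i = 0; i < iterations; ++i) {
        _mm_pause();
    }
    const std::uint64_t stop = __rdtsc();
    return static_cast<double>(stop - start) / static_cast<double>(iterations);
}

// Convert a cycle count to nanoseconds at a clock frequency given in GHz:
// one cycle at f GHz lasts 1/f ns.
inline double cycles_to_ns(double cycles, double ghz) {
    return cycles / ghz;
}
```

With the numbers from the answer, cycles_to_ns(9.0, 3.3) gives about 2.7 ns, which rounds to the quoted 3 ns. The per-pause latency differs between microarchitectures, so the measured value should not be assumed to match Nehalem's.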