How do you use the pause assembly instruction in 64-bit C++ code?

Question

Since VC++ 2010 does not support inline assembly in 64-bit code, how do I get the x86-64 pause instruction into my code? There does not appear to be an intrinsic for it, as there is for many other common assembly instructions (e.g., __rdtsc(), __cpuid()).

As for why: I want the instruction for a busy-wait use case, so that a (hyperthreaded) CPU remains available to the other threads running on it (see: Performance Insights at intel.com). The pause instruction is very helpful for this use case, as well as for spin-lock implementations, and I can't understand why MS did not include it as an intrinsic.

Thanks.

Recommended answer

Wow, this was a very hard problem to track down, but in case anybody else needs the x86-64 pause instruction:

The YieldProcessor() macro from windows.h expands to the undocumented _mm_pause intrinsic, which ultimately compiles to the pause instruction in both 32-bit and 64-bit code.
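For readers who want to see how the intrinsic fits into the spin-lock use case from the question, here is a minimal sketch. It is not taken from the original answer and rests on several assumptions: a C++11 compiler (newer than VC++ 2010, which lacks std::atomic), _mm_pause being reachable through <immintrin.h> on x86 targets, and std::this_thread::yield() as a stand-in on other architectures. The names cpu_relax and SpinLock are hypothetical.

```cpp
#include <atomic>
#include <thread>

// Assumption: on x86/x86-64 targets, <immintrin.h> declares _mm_pause
// (true for current MSVC, GCC, and Clang). Elsewhere, fall back to yield().
#if defined(_M_X64) || defined(_M_IX86) || defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>
inline void cpu_relax() { _mm_pause(); }  // emits the pause instruction
#else
inline void cpu_relax() { std::this_thread::yield(); }
#endif

// Hypothetical test-and-test-and-set spin lock, for illustration only.
class SpinLock {
public:
    void lock() {
        for (;;) {
            // Try to take the lock; exchange returns the previous value.
            if (!locked_.exchange(true, std::memory_order_acquire)) {
                return;
            }
            // Spin on a plain load and pause between polls, so a sibling
            // hyperthread gets more of the core while we wait.
            while (locked_.load(std::memory_order_relaxed)) {
                cpu_relax();
            }
        }
    }

    // Returns true if the lock was free and is now held by the caller.
    bool try_lock() {
        return !locked_.exchange(true, std::memory_order_acquire);
    }

    void unlock() { locked_.store(false, std::memory_order_release); }

private:
    std::atomic<bool> locked_{false};
};
```

On Windows, calling YieldProcessor() inside the inner loop should have the same effect as cpu_relax() here, since the macro expands to the same intrinsic.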

By the way, this is completely undocumented. MSDN has only partial documentation for YieldProcessor(), and it is incorrect for VC++ 2010.

Here is an example of what a block of YieldProcessor() macros compiles into:

    ::YieldProcessor();
000000013FDB18A0 F3 90                pause
    ::YieldProcessor();
000000013FDB18A2 F3 90                pause
    ::YieldProcessor();
000000013FDB18A4 F3 90                pause
    ::YieldProcessor();
000000013FDB18A6 F3 90                pause
    ::YieldProcessor();
000000013FDB18A8 F3 90                pause

By the way, each pause instruction seems to produce a delay of about 9 cycles on average on the Nehalem architecture (i.e., roughly 3 ns on a 3.3 GHz CPU).
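The cycle-to-time conversion is just cycles divided by the clock rate in GHz: 9 / 3.3 ≈ 2.7 ns, which rounds to the 3 ns quoted above. If you want to check the figure on your own hardware, the sketch below is one rough way to do it. It is not the method used in the answer (which is not stated). It assumes a C++11 compiler and _mm_pause from <immintrin.h> on x86 targets, and it measures wall-clock time with std::chrono, so loop overhead and frequency scaling are included in the result. The function names are hypothetical.

```cpp
#include <chrono>
#include <cstdint>
#include <thread>

// Assumption: <immintrin.h> declares _mm_pause on x86/x86-64 targets;
// other architectures fall back to yield(), which measures something else.
#if defined(_M_X64) || defined(_M_IX86) || defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>
inline void pause_once() { _mm_pause(); }
#else
inline void pause_once() { std::this_thread::yield(); }
#endif

// Convert a cycle count to nanoseconds for a clock rate given in GHz.
inline double cycles_to_ns(double cycles, double clock_ghz) {
    return cycles / clock_ghz;
}

// Average wall-clock cost of one pause, in nanoseconds.
// iterations must be greater than zero.
inline double average_pause_ns(std::uint64_t iterations) {
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (std::uint64_t i = 0; i < iterations; ++i) {
        pause_once();
    }
    const auto stop = clock::now();
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(iterations);
}
```

Expect the measured value to differ between CPU generations, so treat the Nehalem number as specific to that architecture.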
