Are Intel TSX prefixes executed (safely) on AMD as NOP?

Question

I have MASM synchronization code for an application that runs on both Intel and AMD x86 machines.

I'd like to enhance it using the Intel TSX prefixes, specifically XACQUIRE and XRELEASE.

If I modify my code correctly for Intel, what will happen when I attempt to run it on AMD machines? Intel says that these prefixes were designed to be backwards compatible, presumably meaning they do nothing on Intel CPUs without TSX.

I know that AMD has not implemented TSX. But are these prefixes safe to run on AMD CPUs? Is this behavior documented somewhere in the AMD manuals, or is it playing with fire to assume this is safe and will always be safe?

Recommended answer

xacquire/xrelease are just the F2/F3 REP prefixes (F2 = REPNE for xacquire, F3 = REP for xrelease) and are safely ignored by all CPUs that don't support the feature, including non-Intel ones. That's why Intel chose that encoding for the prefixes. It's even better than a NOP, which would have to decode as a separate instruction.
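To make the encoding point concrete, here is a minimal sketch of a test-and-set lock in C with GCC/Clang-style inline assembly. It is not the asker's MASM code, and the names hle_trylock and hle_unlock are made up for illustration. The prefixes are emitted as raw bytes (0xF2 before xchg, 0xF3 before the releasing mov store), so the same machine code should run whether or not the CPU implements HLE. On non-x86 targets the sketch assumes the compiler's __atomic builtins as a fallback.

```c
/* Hypothetical helper names; a sketch, not a tuned lock implementation. */
#if defined(__x86_64__) || defined(__i386__)

/* Try to take the lock once. Returns 1 on success, 0 if it was already held. */
static int hle_trylock(volatile int *lock)
{
    int value = 1;
    /* 0xF2 is the XACQUIRE (REPNE) prefix byte. xchg with a memory operand
       is implicitly locked. A CPU without HLE runs this as a plain xchg. */
    __asm__ __volatile__(".byte 0xf2\n\t"
                         "xchgl %0, %1"
                         : "+r"(value), "+m"(*lock)
                         :
                         : "memory");
    return value == 0;   /* old value 0 means the lock was free */
}

/* Release the lock by storing 0. */
static void hle_unlock(volatile int *lock)
{
    /* 0xF3 is the XRELEASE (REP) prefix byte, here on an ordinary mov store. */
    __asm__ __volatile__(".byte 0xf3\n\t"
                         "movl $0, %0"
                         : "=m"(*lock)
                         :
                         : "memory");
}

#else  /* non-x86 fallback: same semantics, no prefixes */

static int hle_trylock(volatile int *lock)
{
    return __atomic_exchange_n(lock, 1, __ATOMIC_ACQUIRE) == 0;
}

static void hle_unlock(volatile int *lock)
{
    __atomic_store_n(lock, 0, __ATOMIC_RELEASE);
}

#endif
```

A caller would spin on hle_trylock (ideally with a pause in the retry loop) and call hle_unlock when done. With HLE absent or disabled, the prefixed instructions behave exactly like the unprefixed ones.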

In general (across vendors), CPUs ignore REP prefixes they don't understand. So a new extension can use REP as part of its encoding when it's useful for the instruction to decode as something else on old CPUs, instead of raising #UD.

I don't think it's plausible for AMD to introduce an incompatible meaning for rep prefixes on locked instructions or mov stores: that would break real-world binaries that already use these prefixes. For example, I'm pretty sure some builds of libpthread in mainstream GNU/Linux distros have used this to enable hardware lock elision, and don't use dynamic CPU dispatching to run different code based on CPUID for this.

Using REP as a mandatory prefix for a backwards-compatible new instruction has been done before, e.g. with rep nop = pause or rep bsf = tzcnt. (The latter is useful for compilers because tzcnt is faster on some CPUs and gives the same result if the input is known to be non-zero.) And rep ret, as a workaround for AMD pre-Bulldozer branch predictors, is widely used by GCC; see "What does `rep ret` mean?". That meaningless REP definitely works (it is silently ignored) in practice on AMD.
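The claim that rep bsf (tzcnt) and plain bsf agree for non-zero inputs can be checked with a small software model in C. These functions are illustrative models written for this note (the names bsf_model and tzcnt_model are made up), not the instructions themselves. The model reports bsf of zero as -1 only to flag that the real instruction leaves its destination undefined in that case.

```c
#include <stdint.h>

/* Model of BSF: index of the lowest set bit.
   The real instruction leaves its destination undefined for x == 0,
   so this model signals that case with -1. */
static int bsf_model(uint32_t x)
{
    if (x == 0)
        return -1;
    int i = 0;
    while (((x >> i) & 1u) == 0)
        i++;
    return i;
}

/* Model of TZCNT: number of trailing zero bits, defined as 32 for x == 0. */
static int tzcnt_model(uint32_t x)
{
    int n = 0;
    while (n < 32 && ((x >> n) & 1u) == 0)
        n++;
    return n;
}
```

For any non-zero x the two models return the same value, which is why a compiler can emit the rep bsf encoding and get correct results on CPUs both with and without tzcnt support, as long as it knows the input is non-zero.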

(The reverse is not true. You can't write software that counts on a meaningless REP prefix being ignored by future CPUs. Some later extension might give it a meaning, as happened with rep bsr, which runs as lzcnt on newer CPUs and gives a different result. This is why Intel documents the effect of meaningless prefixes as "undefined".)
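A short software model in C shows why rep bsr is the cautionary case. As above, these are illustrative models with made-up names (bsr_model, lzcnt_model), not the instructions themselves: bsr returns the index of the highest set bit, while lzcnt returns the number of leading zero bits, so for a non-zero 32-bit input the two differ unless the sum 31 happens to split evenly, which it never does.

```c
#include <stdint.h>

/* Model of BSR: index of the highest set bit.
   The real instruction leaves its destination undefined for x == 0,
   so this model signals that case with -1. */
static int bsr_model(uint32_t x)
{
    if (x == 0)
        return -1;
    int i = 31;
    while (((x >> i) & 1u) == 0)
        i--;
    return i;
}

/* Model of LZCNT: number of leading zero bits, defined as 32 for x == 0. */
static int lzcnt_model(uint32_t x)
{
    int n = 0;
    while (n < 32 && ((x >> (31 - n)) & 1u) == 0)
        n++;
    return n;
}
```

For non-zero x, lzcnt_model(x) equals 31 - bsr_model(x). Code that relied on the F3 prefix being ignored in front of bsr would therefore get different answers once it ran on a CPU that decodes the same bytes as lzcnt.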

You wrote: "I'd like to enhance it using the Intel TSX prefixes, specifically XACQUIRE and XRELEASE."

Unfortunately, microcode updates have apparently disabled the HLE (Hardware Lock Elision) part of TSX on all Intel CPUs (perhaps to mitigate TAA side-channel attacks). This was the same update that made a jcc at the end of a 32-byte block uncacheable in the uop cache, so it's hard to tell from benchmarking existing code what performance impact the no-HLE part has.

https://news.ycombinator.com/item?id=21533791 / Has Hardware Lock Elision gone forever due to Spectre Mitigation? (Yes, it is gone, but the reason probably isn't Spectre specifically. I don't know if it will be back.)

If you want to use hardware transactional memory on x86, I think your only option is RTM (xbegin/xend), the other half of TSX. OSes can disable it, too, after the most recent microcode update. I'm not sure what the default is for typical systems, and this may change in the future, so check before putting development time into anything.

As far as I know, there isn't a way to use RTM and transparently fall back to locking: xbegin / xend are illegal instructions that fault with #UD if the CPUID feature bit isn't present.
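Because xbegin / xend fault without the feature bit, RTM code has to be guarded by an explicit runtime check. A minimal sketch of that check in C follows, assuming the GCC/Clang <cpuid.h> helper __get_cpuid_count. The function and macro names (cpu_has_rtm, cpu_has_hle, ebx_has_bit, TSX_HLE_BIT, TSX_RTM_BIT) are made up for illustration. The bit positions are the documented ones: CPUID leaf 7, subleaf 0, EBX bit 4 for HLE and bit 11 for RTM.

```c
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang helper header, x86 only */
#endif

#define TSX_HLE_BIT 4    /* CPUID.(EAX=7,ECX=0):EBX bit 4  */
#define TSX_RTM_BIT 11   /* CPUID.(EAX=7,ECX=0):EBX bit 11 */

/* Extract one feature bit from a CPUID register value. */
static int ebx_has_bit(unsigned ebx, int bit)
{
    return (int)((ebx >> bit) & 1u);
}

/* Read EBX of CPUID leaf 7, subleaf 0; returns 0 if the leaf is unavailable. */
static unsigned cpuid_leaf7_ebx(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned eax, ebx, ecx, edx;
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        return 0;
    return ebx;
#else
    return 0;   /* not an x86 CPU: no TSX */
#endif
}

static int cpu_has_rtm(void) { return ebx_has_bit(cpuid_leaf7_ebx(), TSX_RTM_BIT); }
static int cpu_has_hle(void) { return ebx_has_bit(cpuid_leaf7_ebx(), TSX_HLE_BIT); }
```

A program would call cpu_has_rtm() once at startup and dispatch to either an xbegin/xend path or an ordinary locking path. This is the dynamic dispatch that HLE prefixes were meant to make unnecessary.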

If you wanted transparent backwards compatibility, you were supposed to use HLE, so it's a real shame that it (and TSX in general) has had such a rough time, repeatedly getting disabled by microcode updates. (Previously this happened in Haswell and Broadwell because of possible correctness bugs. It's turning into a Charlie Brown situation.)
