What is the point of SSE2 instructions such as orpd?

Views: 324 | Published: 2020/9/12 22:46:30 | Tags: assembly x86 sse instruction-set sse2


Question


The orpd instruction is a "bitwise logical OR of packed double precision floating point values". Doesn't this do exactly the same thing as por ("bitwise logical OR")? If so, what's the point of having it?

Recommended answer


Remember that SSE1 orps came first. (Well actually MMX por mm, mm/mem came even before SSE1.)


Making the SSE2 orpd instruction the same opcode with a new prefix makes sense for hardware decoder logic, I guess, just like movapd vs. movaps. Several instructions like this are redundant between the ps and pd versions, but some aren't: addps vs. addpd do different arithmetic, and unpcklps vs. unpcklpd are different shuffles.


The reason for SSE2 also introducing 66 0F EB /r por xmm,xmm/mem is at least partly consistency with MMX 0F EB /r por mm, mm/mem: again the same opcode with a new mandatory prefix, just like paddb mm, mm vs. paddb xmm, xmm.
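To make the "same opcode, new prefix" pattern concrete, here is a small sketch that writes out the register-to-register encodings as byte arrays and checks that each SSE2 form is its older counterpart with a 66 prefix prepended. The struct, array, and function names are hypothetical, chosen only for this illustration. The operands (register 0 as destination, register 1 as source) are likewise just an example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record for one machine-code encoding (illustration only). */
typedef struct {
    const char *mnemonic;
    uint8_t bytes[4];
    size_t len;
} Encoding;

/* ModRM byte 0xC1: mod=11 (register direct), reg=0 (destination), rm=1 (source). */
static const Encoding kEncodings[] = {
    { "orps xmm0, xmm1", { 0x0F, 0x56, 0xC1 },       3 },  /* SSE1 */
    { "orpd xmm0, xmm1", { 0x66, 0x0F, 0x56, 0xC1 }, 4 },  /* SSE2 */
    { "por mm0, mm1",    { 0x0F, 0xEB, 0xC1 },       3 },  /* MMX  */
    { "por xmm0, xmm1",  { 0x66, 0x0F, 0xEB, 0xC1 }, 4 },  /* SSE2 */
};

/* Returns 1 if b is exactly a with a single 0x66 prefix byte prepended. */
static int differs_only_by_66_prefix(const Encoding *a, const Encoding *b) {
    return b->len == a->len + 1
        && b->bytes[0] == 0x66
        && memcmp(a->bytes, b->bytes + 1, a->len) == 0;
}
```

Calling differs_only_by_66_prefix on the orps/orpd pair or on the por mm/por xmm pair should return 1, while pairing orps with por xmm returns 0 because the opcode byte differs (56 vs. EB).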


It was also partly to allow for different bypass-forwarding domains for vec-integer vs. FP. Different microarchitectures have had different behaviours in how they actually decode and run those different instructions. Some ran all the XMM OR instructions the same way, creating extra latency when forwarding between the FP and SIMD-integer domains.


No CPUs have ever actually had different forwarding domains for FP-float vs. FP-double, so yes, movapd and orpd are in practice useless wastes of space that you should never use. Use the smaller orps encoding instead.


(Or with VEX encoding it doesn't matter; vorps and vorpd are the same size: 2 byte prefix + opcode + modrm ...)
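As a quick sanity check on the claim that these instructions are functionally interchangeable, the following sketch ORs the same 16 bytes through the three corresponding intrinsics and compares the results. The helper name or_variants_agree is hypothetical. Note that a compiler is free to emit a different OR instruction than the intrinsic suggests, so this only demonstrates that the results are bit-identical, not which instruction ends up in the binary.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; also provides the SSE1 ones */
#include <string.h>

/* Hypothetical helper: ORs two 16-byte buffers via the por, orps and orpd
   intrinsics, stores the por result in out, and returns 1 if all three
   results are bit-identical. */
static int or_variants_agree(const unsigned char a[16],
                             const unsigned char b[16],
                             unsigned char out[16]) {
    __m128i ia = _mm_loadu_si128((const __m128i *)a);
    __m128i ib = _mm_loadu_si128((const __m128i *)b);

    /* The cast intrinsics only reinterpret the bits; they convert nothing. */
    __m128i via_por  = _mm_or_si128(ia, ib);
    __m128i via_orps = _mm_castps_si128(
        _mm_or_ps(_mm_castsi128_ps(ia), _mm_castsi128_ps(ib)));
    __m128i via_orpd = _mm_castpd_si128(
        _mm_or_pd(_mm_castsi128_pd(ia), _mm_castsi128_pd(ib)));

    unsigned char r_ps[16], r_pd[16];
    _mm_storeu_si128((__m128i *)out,  via_por);
    _mm_storeu_si128((__m128i *)r_ps, via_orps);
    _mm_storeu_si128((__m128i *)r_pd, via_orpd);

    return memcmp(out, r_ps, 16) == 0 && memcmp(out, r_pd, 16) == 0;
}
```

The function takes raw bytes rather than floats on purpose: a bitwise OR does not care whether the bit patterns happen to be valid floating-point values.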


For more about bypass delay when using por between FP math instructions like addps, or orps between SIMD-integer insns like paddb, see


Do I get a performance penalty when mixing SSE integer/float SIMD instructions
What's the difference between logical SSE intrinsics?
Difference between the AVX instructions vxorpd and vpxor
Does using mix of pxor and xorps affect performance?
Is there any situation where using MOVDQU and MOVUPD is better than MOVUPS?
Choosing SSE instruction execution domains in mixed contexts - pre-Skylake, integer versions have better throughput.


And in case anyone was wondering, the answer to the other interpretation of the title: bitwise booleans on FP values are mostly used to set, clear, or toggle the sign bit. Or to do stuff with cmpps/pd masks like blending.
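A minimal sketch of those uses with SSE1 intrinsics, assuming packed single-precision data; the function names are hypothetical and exist only for this illustration. The constant -0.0f has only the sign bit set, which is what makes it a convenient mask.

```c
#include <xmmintrin.h>  /* SSE1 intrinsics */

/* Clear the sign bit of each element: (~sign_mask) & x, i.e. andnps. */
static __m128 abs_ps(__m128 x) {
    const __m128 sign_mask = _mm_set1_ps(-0.0f);
    return _mm_andnot_ps(sign_mask, x);
}

/* Toggle the sign bit of each element: xorps with the sign mask. */
static __m128 negate_ps(__m128 x) {
    return _mm_xor_ps(x, _mm_set1_ps(-0.0f));
}

/* Set the sign bit of each element: orps with the sign mask. */
static __m128 neg_abs_ps(__m128 x) {
    return _mm_or_ps(x, _mm_set1_ps(-0.0f));
}

/* Blend with a compare mask: cmpps yields all-ones or all-zeros per element,
   so (mask & a) | (~mask & b) picks a where a > b and b elsewhere. */
static __m128 select_greater_ps(__m128 a, __m128 b) {
    __m128 mask = _mm_cmpgt_ps(a, b);
    return _mm_or_ps(_mm_and_ps(mask, a), _mm_andnot_ps(mask, b));
}
```

For ordinary (non-NaN) inputs select_greater_ps behaves like an element-wise maximum; it is written with explicit and/andnot/or here only to show the mask-based blend, not as a replacement for maxps.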
