Are different mmx, sse and avx versions complementary or supersets of each other?

Question

I'm thinking I should familiarize myself with x86 SIMD extensions. But before I even began I ran into trouble. I can't find a good overview on which of them are still relevant.

The x86 architecture has accumulated a lot of math/multimedia extensions over decades:

  • MMX
  • 3DNow!
  • SSE
  • SSE2
  • SSE3
  • SSSE3
  • SSE4
  • AVX
  • AVX2
  • AVX512
  • Did I forget something?

Are the newer ones supersets of the older ones and vice versa? Or are they complementary?

Are some of them deprecated? Which of these are still relevant? I've heard references to "legacy SSE".

Are some of them mutually exclusive? I.e. do they share the same hardware parts?

Which should I use together to maximize hardware utilization on modern Intel / AMD CPUs? For sake of argument, let's assume I can find appropriate uses for the instructions... heating my house with the CPU if nothing else.

Recommended answer

I recently updated the tag wikis for SSE, AVX, and x86 (and SSE2, AVX2). They cover a lot of this. tl;dr summary: AVX rolls up all the previous SSE versions, and provides 3-operand versions of those instructions. It also adds 256b versions of most FP (AVX) and int (AVX2) insns.

For summaries of the various SSE versions, see Wikipedia, or knm241's more-detailed answer.

We don't really think of that as making SSE obsolete. More like, think of AVX as a new and better version of the same old SSE instructions. They're still in the ref manual under their non-AVX names (PSHUFB, not VPSHUFB, for example). You can mix AVX and SSE code, as long as you use VZEROUPPER when needed to avoid the performance problem from mixing VEX with non-VEX insns (on Intel). So there is some annoyance in dealing with cases where you have to call into libraries that might run non-VEX SSE instructions, or where your code uses SSE FP math but also has some AVX code to be run only if the CPU supports it.
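As a rough illustration of that last case, here is a minimal sketch of baseline code that runs an AVX path only when the CPU supports it. It assumes GCC or Clang on x86-64 and a file built without `-mavx`; the function names `scale`, `scale_avx`, and `scale_baseline` are made up for the example, while `__builtin_cpu_supports`, the `target` attribute, and the `_mm256_*` intrinsics are the compilers' own. `_mm256_zeroupper()` emits VZEROUPPER before control returns to code that may use non-VEX SSE.

```c
#include <immintrin.h>
#include <stddef.h>

/* Baseline path: built without AVX (assuming no -mavx on the command line),
   so the compiler uses legacy non-VEX encodings for any SSE it emits here. */
static void scale_baseline(float *data, size_t n, float factor)
{
    for (size_t i = 0; i < n; i++)
        data[i] *= factor;
}

/* AVX path: AVX is enabled for this one function only (GCC/Clang extension). */
__attribute__((target("avx")))
static void scale_avx(float *data, size_t n, float factor)
{
    const __m256 f = _mm256_set1_ps(factor);
    size_t i = 0;

    /* 8 floats per iteration in a 256-bit register. */
    for (; i + 8 <= n; i += 8) {
        __m256 v = _mm256_loadu_ps(data + i);
        _mm256_storeu_ps(data + i, _mm256_mul_ps(v, f));
    }

    /* Scalar tail for the remaining 0..7 elements. */
    for (; i < n; i++)
        data[i] *= factor;

    /* Clear the upper halves of the YMM registers before returning to
       callers that may execute non-VEX SSE instructions. */
    _mm256_zeroupper();
}

/* Hypothetical entry point: pick the AVX path at run time if available. */
void scale(float *data, size_t n, float factor)
{
    if (__builtin_cpu_supports("avx"))
        scale_avx(data, n, factor);
    else
        scale_baseline(data, n, factor);
}
```

Compilers usually insert VZEROUPPER on their own at the end of functions that use 256-bit registers, so the explicit call mostly matters for hand-written asm or when that behavior is disabled.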

If CPU-compatibility were a non-issue, the legacy-SSE versions of vector instructions would be truly obsolete, like MMX is now. AVX/AVX2 is at least slightly better in every way, if you count the VEX-encoded 128b version of an insn as AVX, not SSE. Sometimes you'd still use 128b registers because your data only comes in chunks that big, but more often you'd work with 256b registers to do the same op on twice as much data at once.
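To make the 128b vs. 256b point concrete, here is a small sketch of the same element-wise add written at both widths. It assumes GCC or Clang on x86-64; `add_128` and `add_256` are hypothetical names, and `add_256` must only be called on a CPU that supports AVX. How `add_128` gets encoded depends on build flags: without `-mavx` the compiler would normally emit legacy-SSE ADDPS, with `-mavx` the VEX-encoded 128b VADDPS.

```c
#include <immintrin.h>
#include <stddef.h>

/* 128-bit registers: 4 floats per vector instruction.
   SSE/SSE2 are part of the x86-64 baseline, so no target attribute is needed. */
void add_128(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
    for (; i < n; i++)          /* scalar tail */
        out[i] = a[i] + b[i];
}

/* 256-bit registers: the same op on 8 floats per vector instruction.
   AVX is enabled for this function only (GCC/Clang extension). */
__attribute__((target("avx")))
void add_256(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        __m256 va = _mm256_loadu_ps(a + i);
        __m256 vb = _mm256_loadu_ps(b + i);
        _mm256_storeu_ps(out + i, _mm256_add_ps(va, vb));
    }
    for (; i < n; i++)          /* scalar tail */
        out[i] = a[i] + b[i];
    _mm256_zeroupper();         /* avoid penalties in later non-VEX SSE code */
}
```

Both functions compute the same result; the 256b loop simply needs half as many iterations over the vectorized part of the array.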

SSE/AVX/x87-FP/integer instructions all use the same execution ports. You can't get more done in parallel by mixing them (except on Haswell, where one of the 4 ALU ports can only handle non-vector insns, like GP reg ops and branches).
