Assembly syntax for masked vector Intel AVX-512 instructions
Question
For testing purposes, I am writing short assembly snippets for Intel's Xeon Phi with the icc inline assembler. I now want to use masked vector instructions, but I cannot get the inline assembler to accept them.
For code like this:
vmovapd -64(%%r14, %%r10), %%zmm0{%%k1}
I get the error message:
/tmp/icpc5115IWas_.s: Assembler messages:
/tmp/icpc5115IWas_.s:563: Error: junk `%k1' after register
I tried many different combinations, but nothing worked. The compiler version is intel64/13.1up03 under Linux, using GAS syntax.
The code above actually works with basic (non-extended) inline assembly. So this:
__asm__("vmovapd -64(%r14, %r10), %zmm0{%k1} ")
works, while the following does not:
__asm__("vmovapd -64(%[src], %%r10), %%zmm0{%%k1} "
:
: [src]"r"(src)
:)
I guess it has something to do with the need to double the % before register names in extended mode. But a single % before the k does not work either.
Recommended answer
I asked the same question in the Intel Developer Zone (http://software.intel.com/en-us/forums/topic/499145#comment-1776563). The answer is that, to use the mask registers on the Xeon Phi in extended inline assembler, you have to put double curly braces around the mask register modifier:
vmovapd %%zmm30, (%%r15, %%r10){{%%k1}}
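To show where the doubled braces sit in a complete extended-asm statement, here is a minimal sketch. It rests on several assumptions that are not part of the answer: the helper name masked_store_pd and the macro MASK_K1 are hypothetical, the icc branch simply reuses the double-brace form reported above and is untested, and the other branch uses the %{ and %} escapes that GCC documents for emitting literal braces. The AVX-512 path is compiled only when __AVX512F__ is defined; otherwise a scalar loop stands in so the example still runs.

```c
#include <assert.h>
#include <stdint.h>

/* How a literal "{%k1}" is spelled inside an extended-asm template.
 * icc: doubled braces, as reported in the answer above (untested here).
 * GCC: %{ and %} emit literal braces; other compilers are assumed compatible. */
#if defined(__INTEL_COMPILER)
#  define MASK_K1 "{{%%k1}}"
#else
#  define MASK_K1 "%{%%k1%}"
#endif

/* Hypothetical helper: copy the lanes of src[0..7] whose bit is set in mask
 * into dst[0..7], leaving the other lanes of dst untouched.
 * Both pointers must be 64-byte aligned because vmovapd is an aligned move. */
static void masked_store_pd(double *dst, const double *src, unsigned int mask)
{
    assert((uintptr_t)dst % 64 == 0 && (uintptr_t)src % 64 == 0);
#if defined(__AVX512F__)
    __asm__ volatile(
        "kmovw %[m], %%k1\n\t"                   /* load the write mask */
        "vmovapd (%[s]), %%zmm0\n\t"             /* unmasked load of 8 doubles */
        "vmovapd %%zmm0, (%[d])" MASK_K1 "\n\t"  /* masked store */
        : /* no outputs */
        : [m] "r"(mask), [s] "r"(src), [d] "r"(dst)
        : "k1", "zmm0", "memory");
#else
    /* Scalar stand-in so the sketch still runs without AVX-512. */
    for (int i = 0; i < 8; ++i)
        if ((mask >> i) & 1u)
            dst[i] = src[i];
#endif
}
```

Called as masked_store_pd(dst, src, 0x05), this should copy only lanes 0 and 2. The single % in front of the named operands and the doubled %% in front of hard register names follow the usual extended-asm rules; only the braces need the extra escaping.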