What is the effect of the second argument in __builtin_prefetch()?
The GCC documentation describes how to use __builtin_prefetch.
The third argument works as expected:
If it is 0, the compiler generates a prefetchnta (%rax) instruction.
If it is 1, the compiler generates a prefetcht2 (%rax) instruction.
If it is 2, the compiler generates a prefetcht1 (%rax) instruction.
If it is 3 (the default), the compiler generates a prefetcht0 (%rax) instruction.
When the third argument varies, the opcode changes accordingly.
But the second argument does not seem to have any effect.
__builtin_prefetch(&x,1,2);
__builtin_prefetch(&x,0,2);
__builtin_prefetch(&x,0,1);
__builtin_prefetch(&x,0,0);
The above is the sample code, which generated the following assembly:
27: 0f 18 10 prefetcht1 (%rax)
2a: 48 8d 45 fc lea -0x4(%rbp),%rax
2e: 0f 18 10 prefetcht1 (%rax)
31: 48 8d 45 fc lea -0x4(%rbp),%rax
35: 0f 18 18 prefetcht2 (%rax)
38: 48 8d 45 fc lea -0x4(%rbp),%rax
3c: 0f 18 00 prefetchnta (%rax)
The opcode changes with the third argument. But even when I change the second argument (which specifies read or write), the generated instruction stays the same: compare the two prefetcht1 instructions at offsets 27 and 2e. So the second argument gives no information to the machine. What, then, is its purpose?
From the same link you posted:
There are two optional arguments, rw and locality. The value of rw is a compile-time constant one or zero; one means that the prefetch is preparing for a write to the memory address and zero, the default, means that the prefetch is preparing for a read.
The x86 architecture has no distinction between a read and a write prefetch.
That doesn't mean you should ignore the second argument: one reason for writing code in C is portability.
Even if your machine does not use the second argument, it may be used when compiling for a different architecture.
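To illustrate the portability point, here is a minimal sketch of how the second argument might be used in practice. The function names and the PREFETCH_DISTANCE constant are hypothetical and chosen only for this example; a good distance depends on the hardware and the access pattern. The read loop passes rw = 0 and the write loop passes rw = 1, so a target that distinguishes the two can pick a different instruction without any change to the source.

```c
#include <stddef.h>

/* Hypothetical tuning constant: how many elements ahead to prefetch. */
#define PREFETCH_DISTANCE 16

/* Sum an array, hinting that upcoming elements will be read (rw = 0). */
long sum_with_read_prefetch(const long *src, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        /* Only form addresses that lie inside the array. */
        if (i + PREFETCH_DISTANCE < n)
            __builtin_prefetch(&src[i + PREFETCH_DISTANCE], 0, 3);
        total += src[i];
    }
    return total;
}

/* Fill an array, hinting that upcoming elements will be written (rw = 1). */
void fill_with_write_prefetch(long *dst, size_t n, long value)
{
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n)
            __builtin_prefetch(&dst[i + PREFETCH_DISTANCE], 1, 3);
        dst[i] = value;
    }
}
```

The prefetch is only a hint, so both functions compute the same result whether or not the target acts on it.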
EDIT
As @PeterCordes pointed out in his comment, x86 actually does have a prefetch instruction that anticipates a write (prefetchw).
It differs from the other prefetch instructions in that it invalidates other cached instances of the fetched line (and sets the line to the exclusive state).
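One way to check this on your own toolchain is to compile a tiny function with rw = 1 and inspect the generated assembly. The sketch below uses a hypothetical helper, increment_shared. The assumption here is that GCC chooses a write-prefetch instruction only when the target is declared to support it (for example via the -mprfchw option or a suitable -march setting); the exact output depends on the GCC version and flags, so treat this as an experiment rather than a guaranteed result.

```c
/* Hypothetical helper: bump a counter after hinting an upcoming write. */
long increment_shared(long *counter)
{
    __builtin_prefetch(counter, 1, 3);  /* rw = 1: preparing for a write */
    *counter += 1;
    return *counter;
}
```

Compiling this file with gcc -O2 -S, once without and once with -mprfchw, and comparing the two outputs should show whether the second argument changes the emitted instruction on your setup.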