What is the effect of the second argument in __builtin_prefetch()?


Question


The GCC doc here specifies the usage of __builtin_prefetch.

The third argument works as expected.
If it is 0, the compiler generates a prefetchnta (%rax) instruction.
If it is 1, the compiler generates a prefetcht2 (%rax) instruction.
If it is 2, the compiler generates a prefetcht1 (%rax) instruction.
If it is 3 (the default), the compiler generates a prefetcht0 (%rax) instruction.

If we vary the third argument, the opcode changes accordingly.

But the second argument does not seem to have any effect.

__builtin_prefetch(&x,1,2);
__builtin_prefetch(&x,0,2);
__builtin_prefetch(&x,0,1);
__builtin_prefetch(&x,0,0);

The above is the sample code. It generated the following assembly:

  27:   0f 18 10                prefetcht1 (%rax)
  2a:   48 8d 45 fc             lea    -0x4(%rbp),%rax
  2e:   0f 18 10                prefetcht1 (%rax)
  31:   48 8d 45 fc             lea    -0x4(%rbp),%rax
  35:   0f 18 18                prefetcht2 (%rax)
  38:   48 8d 45 fc             lea    -0x4(%rbp),%rax
  3c:   0f 18 00                prefetchnta (%rax)

One can observe the opcode changing with the third argument. But even when I change the second argument (which specifies read or write), the assembly stays the same: compare <27,2a> and <2e,31>. So it does not give any information to the machine. What, then, is the purpose of the second argument?

Answer

From the same link you posted:

There are two optional arguments, rw and locality. The value of rw is a compile-time constant one or zero; one means that the prefetch is preparing for a write to the memory address and zero, the default, means that the prefetch is preparing for a read.

The x86 architecture has no distinction between a read and a write prefetch.
This doesn't mean that you should ignore the second argument, since one reason to write code in C is portability. Even if the second argument is not used on your machine, it can be used when compiling for other architectures.
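
As a minimal sketch of the read case (rw = 0), the loop below prefetches elements it will soon read. The function name and the PREFETCH_DISTANCE value are hypothetical choices for illustration, not something from the GCC documentation, and a useful distance depends on the target. The prefetch is only a hint, so the result is the same with or without it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tuning value: how many elements ahead to prefetch. */
#define PREFETCH_DISTANCE 16

/* Sum an array while hinting that upcoming elements will be read
   (rw = 0) and should stay in cache (locality = 3). */
long sum_with_read_prefetch(const int *data, size_t n)
{
    assert(data != NULL || n == 0);
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n) {
            /* Second argument 0: the address is about to be read. */
            __builtin_prefetch(&data[i + PREFETCH_DISTANCE], 0, 3);
        }
        total += data[i];
    }
    return total;
}
```

Because rw is stated explicitly, the same source expresses the intended access on any target, whether or not that target has separate read and write prefetch instructions.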

EDIT: As @PeterCordes pointed out in his comment, x86 actually has a prefetch instruction for use in anticipation of a write (PREFETCHW).
It differs from the other prefetch instructions in that it invalidates other cached instances of the fetched line (and sets the line to the exclusive state).
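
For the write case (rw = 1), a sketch along the same lines might look like this. The function name and distance are again hypothetical. Whether the hint becomes a write-prefetch instruction depends on the compiler and target options. With GCC on x86 this is assumed to require enabling the relevant instruction-set extension (for example via -mprfchw, where available); otherwise the compiler may emit the same instruction as for a read, as in the question.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tuning value: how many elements ahead to prefetch. */
#define WRITE_PREFETCH_DISTANCE 16

/* Increment every element in place while hinting that upcoming
   elements will be written (rw = 1). */
void increment_with_write_prefetch(int *data, size_t n)
{
    assert(data != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (i + WRITE_PREFETCH_DISTANCE < n) {
            /* Second argument 1: the address is about to be written. */
            __builtin_prefetch(&data[i + WRITE_PREFETCH_DISTANCE], 1, 3);
        }
        data[i] += 1;
    }
}
```

Comparing the disassembly of this function built with and without the extension enabled is one way to check whether the second argument affects the generated code on a given toolchain.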
