Are RMW instructions considered harmful on modern x86?

Views: 143 | Posted: 2020/5/21 21:03:17 | Tags: assembly optimization x86 intel


Question

I recall that read-modify-write (RMW) instructions are generally to be avoided when optimizing x86 for speed. That is, you should avoid something like add [rsi], 10, which adds 10 to the memory location whose address is held in rsi. The recommendation was usually to split it into a read-modify instruction (a load folded into the add), followed by a store, so something like:

mov rax, 10
add rax, [rsp]
mov [rsp], rax

Alternatively, you might use an explicit load and store with a reg-reg add operation in between:

mov rax, [rsp]
add rax, 10
mov [rsp], rax
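
For anyone who wants to try this in a compiler explorer, the two styles can be written at the source level roughly as below. This is only a sketch: the function names `add_rmw` and `add_split` are hypothetical, and an optimizing compiler is free to emit identical instructions for both, so the source spelling does not force either encoding.

```cpp
#include <cstdint>

// Hypothetical helper: a compound assignment, which a compiler may turn
// into a single memory-destination add (the RMW form).
void add_rmw(std::int64_t* p) {
    *p += 10;
}

// Hypothetical helper: the same update spelled out as load, add, store.
// Optimizers usually see through this, so the generated code may match
// add_rmw exactly.
void add_split(std::int64_t* p) {
    std::int64_t tmp = *p;  // explicit load
    tmp += 10;              // add on the loaded value
    *p = tmp;               // explicit store
}
```

Comparing the assembly for the two functions at -O2 shows which form a given compiler prefers, independent of how the source was written.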

Is this still reasonable advice for modern x86 (and was it ever)?¹

Of course, in cases where a value from memory is used more than once, RMW is inappropriate, since you would incur redundant loads and stores. I'm interested in the case where a value is only used once.

Based on exploration in Godbolt, all of icc, clang and gcc prefer to use a single RMW instruction to compile something like:

void Foo::f() {
  x += 10;
}

into:

Foo::f():
    add     QWORD PTR [rdi], 10
    ret
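
The snippet above omits the class definition. A minimal self-contained version for pasting into a compiler explorer might look like the following. It assumes `x` is a 64-bit integer stored as the first member, which is what the `QWORD PTR [rdi]` access in the output suggests; the actual type used in the original test is not shown.

```cpp
#include <cstdint>

struct Foo {
    std::int64_t x;  // assumed type: a 64-bit member at offset 0
    void f();
};

// Defined out of line so the compiler emits a standalone Foo::f().
void Foo::f() {
    x += 10;
}
```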

So at least most compilers seem to think RMW is fine when the value is only used once.

Interestingly enough, the various compilers do not agree when the incremented value is a global, rather than a member, such as:

int global;

void g() {
  global += 10;
}

In this case, gcc and clang still use a single RMW instruction, while icc prefers a reg-reg add with an explicit load and store:

g():
        mov       eax, DWORD PTR global[rip]                    #5.3
        add       eax, 10                                       #5.3
        mov       DWORD PTR global[rip], eax                    #5.3
        ret

Perhaps it has something to do with RIP-relative addressing and micro-fusion limitations? However, icc13 still does the same thing with -m32, so perhaps it has more to do with the addressing mode requiring a 32-bit displacement.

¹I'm using the deliberately vague term "modern x86" to mean, basically, the last few generations of Intel and AMD laptop/desktop/server chips.
