Why would a compiler generate this assembly?


Question

While stepping through some Qt code I came across the following. The function QMainWindowLayout::invalidate() has the following implementation:

void QMainWindowLayout::invalidate()
{
    QLayout::invalidate();
    minSize = szHint = QSize();
}

It gets compiled to this:

<invalidate()>        push   %rbx
<invalidate()+1>      mov    %rdi,%rbx
<invalidate()+4>      callq  0x7ffff4fd9090 <QLayout::invalidate()>
<invalidate()+9>      movl   $0xffffffff,0x564(%rbx)
<invalidate()+19>     movl   $0xffffffff,0x568(%rbx)
<invalidate()+29>     mov    0x564(%rbx),%rax
<invalidate()+36>     mov    %rax,0x56c(%rbx)
<invalidate()+43>     pop    %rbx
<invalidate()+44>     retq

The assembly from invalidate+9 to invalidate+36 seems stupid. First the code writes -1 to %rbx+0x564 and %rbx+0x568, but then it loads that -1 from %rbx+0x564 back into a register just to write it out to %rbx+0x56c. This seems like something the compiler should easily be able to optimize into another pair of move-immediate instructions.

So is this stupid code (and if so, why wouldn't the compiler optimize it?), or is it somehow very clever and faster than using another move immediate?

(Note: This code is from the normal release library build shipped by Ubuntu, so it was presumably compiled by GCC with optimization enabled. The minSize and szHint variables are normal member variables of type QSize.)
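To make the question easier to follow, here is a minimal sketch of the source pattern, assuming QSize behaves like a pair of ints that its default constructor sets to -1 (which is what the stores of $0xffffffff suggest). The `Size` and `Layout` types are hypothetical stand-ins, not the Qt classes, and the mapping of the offsets 0x564 and 0x56c to szHint and minSize is an inference from the listing.

```cpp
#include <cassert>

// Hypothetical stand-in for QSize: two ints, default-constructed to -1.
struct Size {
    int wd;
    int ht;
    Size() : wd(-1), ht(-1) {}
    Size(int w, int h) : wd(w), ht(h) {}
};

// Hypothetical stand-in for the two members that invalidate() resets.
struct Layout {
    Size szHint{0, 0};
    Size minSize{0, 0};

    void invalidate() {
        // Assignment is right-associative: szHint = Size() runs first,
        // then minSize is copied from szHint, not from a fresh Size().
        // That matches the listing: two immediate stores, then an
        // 8-byte load from the first object and a store to the second.
        minSize = szHint = Size();
    }
};

// Small check that both members end up at (-1, -1).
void check() {
    Layout l;
    l.invalidate();
    assert(l.szHint.wd == -1 && l.szHint.ht == -1);
    assert(l.minSize.wd == -1 && l.minSize.ht == -1);
}
```

Read this way, the load at invalidate+29 is the copy `minSize = szHint` taken literally; the question is why the compiler did not fold that copy into two more immediate stores.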

Recommended answer

I'm not sure it's right to call this stupid. I think the compiler may be optimizing for code size here. There is no mov instruction that stores a 64-bit immediate to memory, so writing a fresh QSize takes two 32-bit immediate stores, just like the two at invalidate+9 and invalidate+19. Each of those is 10 bytes, so repeating them for the second member would cost 20 bytes, whereas the load/store pair the compiler generated is 14 bytes in total. The location being loaded has just been written, so there is most likely no memory latency, and I don't think you take any performance hit here.
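The byte counts in this answer can be checked against the offsets printed in the question's listing. The sketch below only does that arithmetic; the constant names are made up for illustration, and the numbers are the offsets from the listing, not the output of an assembler.

```cpp
#include <cassert>

// Byte offsets copied from the listing in the question.
constexpr int kFirstImmStore  = 9;   // movl $0xffffffff,0x564(%rbx)
constexpr int kSecondImmStore = 19;  // movl $0xffffffff,0x568(%rbx)
constexpr int kLoad           = 29;  // mov  0x564(%rbx),%rax
constexpr int kStore          = 36;  // mov  %rax,0x56c(%rbx)
constexpr int kPop            = 43;  // pop  %rbx

// An instruction's size is the distance to the next instruction's offset.
constexpr int kImmStoreSize = kSecondImmStore - kFirstImmStore;
constexpr int kLoadSize     = kStore - kLoad;
constexpr int kStoreSize    = kPop - kStore;

// What the compiler emitted for the second QSize: one load plus one store.
constexpr int kEmittedCopySize = kLoadSize + kStoreSize;

// The alternative from the question: two more 32-bit immediate stores.
constexpr int kTwoImmStoresSize = 2 * kImmStoreSize;

static_assert(kImmStoreSize == 10, "each immediate store is 10 bytes");
static_assert(kEmittedCopySize == 14, "load + store pair is 14 bytes");
static_assert(kTwoImmStoresSize == 20, "two immediate stores are 20 bytes");
```

On these numbers the emitted sequence is 6 bytes shorter than two additional immediate stores, which is consistent with the code-size explanation, though it does not prove that this was the compiler's reason.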
