Why does the 64-bit VC++ compiler add nop instructions after function calls?
Problem description
I've compiled the following using the Visual Studio C++ 2008 SP1 x64 C++ compiler:
I'm curious: why did the compiler add those nop instructions after those calls?
PS1. I would understand if the 2nd and 3rd nops were there to align the code on a 4-byte boundary, but the 1st nop breaks that assumption.
PS2. The C++ code that was compiled had no loops or special optimization tricks in it:
    CTestDlg::CTestDlg(CWnd* pParent /*=NULL*/)
        : CDialog(CTestDlg::IDD, pParent)
    {
        m_hIcon = AfxGetApp()->LoadIcon(IDR_MAINFRAME);

        //This makes no sense. I used it to set a debugger breakpoint
        ::GdiFlush();
        srand(::GetTickCount());
    }
PS3. Additional info: First off, thank you everyone for your input.
Here are some additional observations:
- My first guess was that incremental linking could have had something to do with it. But the Release build settings in Visual Studio for the project have incremental linking off.
- This seems to affect x64 builds only. The same code built as x86 (or Win32) does not have those nops, even though the instructions used are very similar:
- I tried to build it with a newer linker, and even though the x64 code produced by VS 2013 looks somewhat different, it still adds those nops after some calls:
- Also, dynamic vs. static linking to MFC made no difference to the presence of those nops. This one is built with dynamic linking to the MFC DLLs with VS 2013:
- Also note that those nops can appear after near and far calls as well, and they have nothing to do with alignment. Here's a part of the code that I got from IDA if I step a little bit further on:
As you can see, the nop is inserted after a far call, and it happens to "align" the next lea instruction on an address ending in B! That makes no sense if those nops were added for alignment only.
- I was originally inclined to believe that since near relative calls (i.e. those that start with E8) are somewhat faster than far calls (the ones that start with FF 15 in this case), the linker may try to go with near calls first. Since those are one byte shorter than far calls, if it succeeds, it may pad the remaining space with nops at the end. But the far-call example in the previous item defeats this hypothesis.
So I still don't have a clear answer to this.
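To make the byte counts in the last two observations concrete, here is a minimal sketch that recognizes the two call encodings named in the question and reports their lengths. The names `CallKind`, `CallInfo`, `classify_call` and `followed_by_nop` are made up for illustration, and this is not a real disassembler. Note that the `FF 15` form, which the question calls a "far" call, is strictly speaking an indirect call through a RIP-relative memory operand (typically an import table slot).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// The two x64 call encodings discussed in the question.
enum class CallKind { NearRelative, IndirectRipRelative, Other };

struct CallInfo {
    CallKind kind;
    std::size_t length;  // total instruction length in bytes, 0 if not recognized
};

// Classify the instruction starting at `code`.
// Illustrative helper only: it recognizes just the two forms below.
CallInfo classify_call(const std::uint8_t* code, std::size_t size) {
    if (size >= 5 && code[0] == 0xE8) {
        return {CallKind::NearRelative, 5};         // E8 rel32
    }
    if (size >= 6 && code[0] == 0xFF && code[1] == 0x15) {
        return {CallKind::IndirectRipRelative, 6};  // FF 15 disp32
    }
    return {CallKind::Other, 0};
}

// True if the byte right after a recognized call is a one-byte nop (0x90).
bool followed_by_nop(const std::uint8_t* code, std::size_t size) {
    const CallInfo info = classify_call(code, size);
    return info.length != 0 && size > info.length && code[info.length] == 0x90;
}
```

The `E8` form is 5 bytes and the `FF 15` form is 6 bytes, which matches the "one byte shorter" remark. A single padding nop after an `E8` call would therefore bring it to the same size as the `FF 15` form, which is what the size-padding hypothesis relied on.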
Solution
This is purely a guess, but it might be some kind of SEH optimization. I say optimization because SEH seems to work fine without the NOPs too. The NOP might help speed up unwinding.
In the following example (live demo with VC2017), there is a NOP inserted after a call to basic_string::assign in test1 but not in test2 (identical but declared as non-throwing [1]).
    #include <stdio.h>
    #include <string>

    int test1() {
        std::string s = "a"; // NOP inserted here
        s += getchar();
        return (int)s.length();
    }

    int test2() throw() {
        std::string s = "a";
        s += getchar();
        return (int)s.length();
    }

    int main()
    {
        return test1() + test2();
    }
Assembly:
    test1:
        . . .
        call std::basic_string<char,std::char_traits<char>,std::allocator<char> >::assign
        npad 1 ; nop
        call getchar
        . . .
    test2:
        . . .
        call std::basic_string<char,std::char_traits<char>,std::allocator<char> >::assign
        call getchar
Note that Visual Studio projects compile by default with the /EHsc flag (synchronous exception handling). Without that flag the NOPs disappear, and with /EHa (synchronous and asynchronous exception handling), throw() no longer makes a difference because SEH is always on.
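One way to picture how a NOP after a call could matter to unwinding is a range lookup keyed by the return address. The sketch below is a toy model under stated assumptions, not MSVC's actual table format: `ScopeEntry`, `lookup_state`, `return_offset` and `state_seen_by_unwinder` are hypothetical names, and the offsets are invented. It only shows that when a call is the last instruction of a region, its return address already lies in the next region unless a padding byte extends the first one.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified table entry: a half-open range of code offsets
// [begin, end) and an illustrative id for the cleanup state active there.
struct ScopeEntry {
    std::uint32_t begin;
    std::uint32_t end;
    int state;
};

// Linear search for the entry containing `offset`; returns -1 if none matches.
int lookup_state(const std::vector<ScopeEntry>& table, std::uint32_t offset) {
    for (const ScopeEntry& e : table) {
        if (offset >= e.begin && offset < e.end) {
            return e.state;
        }
    }
    return -1;
}

// A call pushes the address of the next instruction as its return address.
std::uint32_t return_offset(std::uint32_t call_offset, std::uint32_t call_length) {
    return call_offset + call_length;
}

// Models a 5-byte call at offset 0x10 that is the last instruction of state 1,
// with `padding` nop bytes placed after it and still counted as state 1.
// Returns the state a lookup by return address would find.
int state_seen_by_unwinder(std::uint32_t padding) {
    const std::uint32_t call_offset = 0x10;
    const std::uint32_t call_length = 5;
    const std::uint32_t region_end = call_offset + call_length + padding;
    const std::vector<ScopeEntry> table = {
        {call_offset, region_end, 1},  // region that contains the call
        {region_end, 0x40, 0},         // code that follows it
    };
    return lookup_state(table, return_offset(call_offset, call_length));
}
```

In this model `state_seen_by_unwinder(0)` finds state 0 (the wrong region) while `state_seen_by_unwinder(1)` finds state 1. Whether this is what the compiler is really doing here remains a guess, in line with the answer above.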
[1] For some reason only throw() seems to reduce the code size; using noexcept makes the generated code even bigger and summons even more NOPs. MSVC...
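To check the footnote's observation on your own compiler, a noexcept variant can be compiled next to an unannotated one and the listings compared. This is a minimal sketch: `test3` and `test1_like` are hypothetical names that mirror the answer's functions, and the character is passed as a parameter (instead of calling getchar) so the functions do not depend on stdin.

```cpp
#include <string>

// Hypothetical variant for comparison: same body as the answer's functions,
// but declared noexcept instead of throw().
int test3(int c) noexcept {
    std::string s = "a";
    s += static_cast<char>(c);
    return static_cast<int>(s.length());
}

// Same body with no exception specification, like test1 in the answer.
int test1_like(int c) {
    std::string s = "a";
    s += static_cast<char>(c);
    return static_cast<int>(s.length());
}
```

Compiling with something like `cl /c /O2 /EHsc /FA example.cpp` writes an assembly listing (example.asm) in which any `npad 1` lines after calls can be counted per function. Repeating the build with /EHa, or without any /EH option, gives the other configurations the answer describes. The results may differ between compiler versions.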