How does the compiler determine the needed stack size for a function with compiler generated temporaries?


Question

Consider the following code:

class cFoo {
    private:
        int m1;
        char m2;
    public:
        int doSomething1();
        int doSomething2();
        int doSomething3();
};

class cBar {
    private:
        cFoo mFoo;
    public:
        cFoo getFoo(){ return mFoo; }
};

void some_function_in_the_callstack_hierarchy(cBar aBar) {
    int test1 = aBar.getFoo().doSomething1();
    int test2 = aBar.getFoo().doSomething2();
    ...
}

In each line where getFoo() is called, the compiler generates a temporary cFoo object so that doSomething1() can be called on it. Does the compiler reuse the stack memory used for these temporary objects? How much stack memory will a call to "some_function_in_the_callstack_hierarchy" reserve? Does it reserve memory for every generated temporary?

My guess was that the compiler reserves memory for only one cFoo object and reuses that memory for the different calls, but if I add

    int test3 = aBar.getFoo().doSomething3();

I can see that the stack size needed for "some_function_in_the_callstack_hierarchy" is much larger, and it's not only because of the additional local int variable.

On the other hand, if I then replace

cFoo getFoo(){ return mFoo; }

with a version that returns a reference (only for testing purposes, because returning a reference to a private member is not good)

const cFoo& getFoo(){ return mFoo; }

it needs far less stack memory than the size of one cFoo.

So to me it seems that the compiler reserves extra stack memory for every generated temporary object in the function. But that would be very inefficient. Can someone explain this?

Recommended answer

An optimizing compiler transforms your source code into some internal representation and normalizes it.

With free software compilers (like GCC and Clang/LLVM), you can look into that internal representation (at the very least by patching the compiler code or by running the compiler in a debugger).

BTW, sometimes temporary values do not need any stack space at all, e.g. because they have been optimized away, or because they can sit in registers. Quite often they reuse some slot in the current call frame that is no longer needed. Also (particularly in C++), a lot of small functions are inlined, as your getFoo probably is, so they don't have a call frame of their own. Recent GCC versions are sometimes even capable of tail-call optimization (essentially, reusing the caller's call frame).
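Why a slot can be reused follows from the language rules: a temporary is destroyed at the end of the full-expression that created it, so the temporaries from consecutive statements are never alive at the same time. The sketch below only counts object lifetimes; it does not measure stack bytes, and the actual frame layout remains the compiler's choice. The names Counted, Holder and peak_temporaries_alive are hypothetical and are not part of the question's code.

```cpp
#include <cassert>

// Hypothetical helper type: tracks how many instances are alive right now
// and the highest number that were ever alive at the same time.
struct Counted {
    static int live;
    static int peak;
    int value;

    explicit Counted(int v) : value(v) { bump(); }
    Counted(const Counted& other) : value(other.value) { bump(); }
    ~Counted() { --live; }

    int get() const { return value; }

    static void bump() {
        if (++live > peak) peak = live;
    }
};
int Counted::live = 0;
int Counted::peak = 0;

// Plays the role of cBar: returns its member by value, so every call
// produces one temporary copy.
struct Holder {
    Counted member{7};
    Counted getCopy() const { return member; }
};

// Returns the largest number of temporaries that existed simultaneously.
int peak_temporaries_alive() {
    Holder h;                        // one long-lived Counted (the member)
    const int baseline = Counted::live;
    Counted::peak = baseline;

    int sum = 0;
    sum += h.getCopy().get();        // temporary dies at the end of this statement
    sum += h.getCopy().get();        // a new temporary, the previous one is gone
    sum += h.getCopy().get();
    (void)sum;

    return Counted::peak - baseline; // subtract the long-lived member
}
```

Calling peak_temporaries_alive() reports how many copies overlapped in time. Since the lifetimes do not overlap, a compiler is free to place all three temporaries in the same slot, although nothing obliges it to, especially without optimization.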

If you compile with GCC (i.e. g++), I would suggest playing with the optimization options and the developer options (and some others). Perhaps also use -Wstack-usage=48 (or some other value, in bytes per call frame) and/or -fstack-usage.

First, if you can read assembler code, compile yourcode.cc with g++ -S -fverbose-asm -O yourcode.cc and look at the emitted yourcode.s

(don't forget to play with the optimization flags, so replace -O with -O2 or -O3 ....)
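To try these flags, it helps to have a complete translation unit. The following is a minimal sketch of the question's code made compilable; the member initializers, the doSomething bodies, the int return type and the file name stack_probe.cc are assumptions added for illustration. The commands in the comments use only the GCC options named above; what they report depends on the compiler version, the target and the optimization level.

```cpp
// stack_probe.cc (hypothetical file name)
//
// Possible GCC invocations:
//   g++ -O2 -c -fstack-usage stack_probe.cc      (writes a .su file with per-function stack usage)
//   g++ -O2 -c -Wstack-usage=48 stack_probe.cc   (warns if a frame might exceed 48 bytes)
//   g++ -O2 -S -fverbose-asm stack_probe.cc      (writes stack_probe.s)
#include <cassert>

class cFoo {
    private:
        int m1 = 1;     // initializer is an assumption, not in the question
        char m2 = 'x';  // initializer is an assumption, not in the question
    public:
        // Bodies are placeholders so that the file is self-contained.
        int doSomething1() const { return m1 + 1; }
        int doSomething2() const { return m1 + 2; }
        int doSomething3() const { return m1 + (m2 == 'x' ? 3 : 0); }
};

class cBar {
    private:
        cFoo mFoo;
    public:
        cFoo getFoo() const { return mFoo; }  // returns a copy: one temporary per call
};

// Returns int (the question's version returns void) so the calls have an
// observable result and are not trivially removed.
int some_function_in_the_callstack_hierarchy(cBar aBar) {
    int test1 = aBar.getFoo().doSomething1();
    int test2 = aBar.getFoo().doSomething2();
    int test3 = aBar.getFoo().doSomething3();
    return test1 + test2 + test3;
}
```

Comparing the results at -O0 and at -O2, and with or without the test3 line, shows how the reported frame size of some_function_in_the_callstack_hierarchy changes. Because everything here is visible to the compiler and small, an optimized build may inline all calls and keep the temporaries out of memory entirely.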

Then, if you are more curious about how the compiler optimizes, try g++ -O -fdump-tree-all -c yourcode.cc and you'll get a lot of so-called "dump files", which contain a partial textual rendering of the internal representations used by GCC.

If you are even more curious, look into my GCC MELT and notably its documentation page (which contains a lot of slides and references).


"So for me it seems that the compiler reserves extra stack memory for every generated temporary object in the function."

Certainly not in the general case (assuming, of course, that you enable some optimizations). And even if some space is reserved, it is reused very quickly.

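BTW: note that the C++11 standard does not talk about the stack. One could imagine some C++ program being compiled without using any stack at all (e.g. a whole-program optimization could detect that a program has no recursion, so that its stack space and layout could be optimized to avoid any stack. I don't know of any such compiler, but I know that compilers can be quite clever....)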
