Should I use the stack for long-term variable storage?

Question

According to "Storage for Short Term", Chapter 8 in "Assembly Language Step by Step" (3rd Edition):

The stack should be considered a place to stash things for the short term. Items stored on the stack have no names, and in general must be taken off the stack in the reverse order in which they were put on. Last in, first out, remember. LIFO!

However, as far as I know, C compilers use the stack for basically everything. Does that mean the stack is the best way of storing variables, both short term and long term? Or is there a better way?

The alternatives that I can think of are:

  • The heap, but that is slow.
  • Static variables, but they last for the entire lifetime of the program, which could waste a lot of memory.

Answer

"C compilers use the stack for basically everything." Well, not really. Some popular instruction sets are stack heavy because they do not, or did not, have many registers, so this is partly a matter of instruction set design. A sane compiler design has a calling convention: the rules for passing parameters in and returning information. Depending on whether the ISA has many registers, a calling convention may be stack heavy, or it may pass parameters in some registers and rely on the stack only when there are many parameters.
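As a rough illustration of a convention that uses registers first and the stack only for overflow, here is a minimal sketch. It assumes a 32-bit ARM target following the AAPCS, where the first four word-sized arguments travel in r0 to r3 and any further ones go on the stack. The names sum4 and sum6 are made up for this example, and other targets will behave differently.

```c
/* Assumption: 32-bit ARM, AAPCS calling convention.
 * sum4 and sum6 are hypothetical names used only for illustration. */

/* All four arguments fit in r0-r3, so the caller typically
 * does not need the stack to pass them. */
unsigned int sum4(unsigned int a, unsigned int b,
                  unsigned int c, unsigned int d)
{
    return a + b + c + d;
}

/* a-d still arrive in r0-r3, but e and f exceed the four
 * argument registers, so the caller places them on the stack. */
unsigned int sum6(unsigned int a, unsigned int b,
                  unsigned int c, unsigned int d,
                  unsigned int e, unsigned int f)
{
    return a + b + c + d + e + f;
}
```

Looking at the generated assembly for a caller of each function is the easiest way to see which arguments your own compiler and target put on the stack.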

Then you get into what programmers are taught in school, for example that globals are bad. That produces stack-heavy habits. Add the notion that functions should be small (fit on a printed page in 12-point font, or fit on your screen) and you get a ton of functions passing more and more parameters through many levels of nesting. Sometimes it is a pointer to a structure declared high up in the nesting, sometimes the same value or variations of it passed over and over again. The result is massive overuse of the stack: some variables not only live a very long time, there may also be dozens or hundreds of copies of them, because of the nesting depth and the use of the stack for passing or storing variables. This has absolutely nothing to do with a particular programming language. It comes partly from educators' opinions (which in some cases have more to do with making papers easier to grade than with making better programs) and partly from habit.

If you have enough registers, the calling convention allows their use, and you have an optimizer, you can at times greatly reduce stack usage. The programmer's habits still matter here and can still cause unnecessary stack consumption. Nesting that cannot be inlined can still leave duplicates of items on the stack, or structures and other items that stay on the stack for the entire life of the program.

Globals and static locals, which I like to call local globals, live in .data (or .bss if zero-initialized), not on the stack. Some programmers create variables or structs at the main() level and pass them down through every level of nesting. That consumes parameter-passing resources that could have been used more efficiently, particularly with a stack-heavy calling convention. Even with pass by reference you are still burning a pointer at every level, where a static global would have been far cheaper. (A local global would still have cost the same amount as the non-static local at that top level.) You cannot simply state that globals or static locals cost you more; I would argue they consume far less. It depends on your programming habits and choice of variables: if you create a new variable with a new name for every little thing, sure, you could get into trouble.

For example, in microcontroller or other embedded work where resources are extremely constrained, using only globals gives you a far better chance of success. Your memory usage is almost fixed; you still need stack storage for the return addresses of nested functions that do not get inlined. That is a bit extreme. With practice you can use locals that have a pretty good chance of being optimized into registers and never touching the stack.

Whether heavy local use or heavy global use actually consumes less memory depends very much on the programmer, the processor, and the compiler. Heavy local use has the potential to be only temporary use, but on constrained systems the analysis required to ensure you do not crash the stack into the program or the heap takes a lot more work. When you lean heavily on local variables, every line of code you add or remove can have a dramatic effect on stack usage. And any scheme for detecting stack usage instantly costs you lots of resources, burning up more of that space without adding any new application-level code.
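To make the "burning a pointer at every level" point concrete, here is a minimal sketch of the two styles side by side. The struct config, its fields, and all function names are invented for this example. Whether style B actually saves stack depends on the compiler, the optimization level, and the calling convention, so treat it as something to check in the generated assembly rather than a guarantee.

```c
/* Hypothetical configuration record, used only for illustration. */
struct config {
    unsigned int scale;
    unsigned int offset;
};

/* Style A: a local created at the top level and threaded down.
 * Every level spends one parameter slot on the pointer. */
static unsigned int leaf_a(const struct config *cfg, unsigned int x)
{
    return x * cfg->scale + cfg->offset;
}

static unsigned int middle_a(const struct config *cfg, unsigned int x)
{
    return leaf_a(cfg, x) + 1;
}

unsigned int top_a(unsigned int x)
{
    struct config cfg = { 3, 4 };   /* lives in top_a's stack frame */
    return middle_a(&cfg, x);
}

/* Style B: a file-scope static. It lives in .data, and no level
 * needs a parameter to reach it. */
static struct config g_cfg = { 3, 4 };

static unsigned int leaf_b(unsigned int x)
{
    return x * g_cfg.scale + g_cfg.offset;
}

static unsigned int middle_b(unsigned int x)
{
    return leaf_b(x) + 1;
}

unsigned int top_b(unsigned int x)
{
    return middle_b(x);
}
```

Both entry points compute the same value; only where the configuration lives and how each level reaches it differ.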

Now, you are reading an assembly language book, not a compiler book. Compiler writers' habits are a bit more, let's say, confined or controlled. To make the output debuggable and keep your sanity, compilers usually adjust the stack once up front and once at the end of a function: a stack frame, basically. You do not often see them adding and removing things all through the function, which would change the offset of the same item, or burning yet another register as a frame pointer just so they can adjust the stack mid-function. Instead, some local variable x or passed-in variable y stays at the same offset from the stack pointer or frame pointer throughout the function. Assembly language programmers may choose to do that too, but they may also choose to just use the stack as a relatively short-term solution.
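A small sketch of the kind of C that usually gets a stack frame of this shape: a local array whose address is handed to another function. The names frame_demo and fill_and_sum are hypothetical. With optimization off, or when the helper is not inlined, compilers typically reserve space for the array once in the prologue and release it once in the epilogue, so the array sits at a fixed offset for the whole function. An aggressive optimizer may remove the array entirely.

```c
#include <stddef.h>

/* Hypothetical helper: writes 0..n-1 into buf and returns their sum. */
unsigned int fill_and_sum(unsigned int *buf, size_t n)
{
    unsigned int total = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        buf[i] = (unsigned int)i;
        total += buf[i];
    }
    return total;
}

unsigned int frame_demo(unsigned int a)
{
    /* The array's address escapes to fill_and_sum, so it normally
     * needs real memory in this function's stack frame. */
    unsigned int scratch[8];
    unsigned int total = fill_and_sum(scratch, 8);

    /* scratch is read again after the call; it is addressed at the
     * same offset from the stack or frame pointer as before. */
    return total + scratch[3] + a;
}
```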

So take this for example, code written to force the compiler to use the stack:

unsigned int more_fun ( unsigned int );
unsigned int fun ( unsigned int a )
{
    return(more_fun(a)+a+5);
}

producing

00000000 <fun>:
   0:   e92d4010    push    {r4, lr}
   4:   e1a04000    mov r4, r0
   8:   ebfffffe    bl  0 <more_fun>
   c:   e2844005    add r4, r4, #5
  10:   e0840000    add r0, r4, r0
  14:   e8bd4010    pop {r4, lr}
  18:   e12fff1e    bx  lr

The stack frame approach is used, sort of: up front, push a register onto the stack; on the back end, restore it and free the space. In between, that register is used for local storage. The calling convention here dictates that r4 must be preserved, so the next function down, and everything nested below it, preserves it too, and when we get back to this function r4 is as we left it. r0, which in this case carries both the incoming parameter and the return value, is volatile: any function may destroy it.

Although it violates the current convention for this instruction set, you could instead have:

push {lr}
push {r0}
bl more_fun
add r0,r0,#5
pop {r1}
add r0,r0,r1
pop {lr}
bx lr

Is one way cheaper than the other? Sure: one push and one pop of two registers each is cheaper than four individual single-register pushes and pops. For this instruction set we cannot get around doing two adds, and both versions use the same number of registers. The compiler's approach is "cheaper" in this case. But what if the function were written so that it did not need the stack for temporary storage (depending on the instruction set)?

unsigned int more_fun ( unsigned int );
unsigned int fun ( unsigned int a )
{
    return(more_fun(a)+5);
}

producing

   0:   e92d4010    push    {r4, lr}
   4:   ebfffffe    bl  0 <more_fun>
   8:   e8bd4010    pop {r4, lr}
   c:   e2800005    add r0, r0, #5
  10:   e12fff1e    bx  lr

And then you tell me: "but it did use the stack." Well, that is partly the calling convention and partly bus width. If the bus is 64 bits wide, which it often is for an ARM now (or even if it is not), the additional register adds one clock to a transaction that takes many to hundreds of clocks, which is not a big cost. With a 64-bit bus, pushing and popping a single register actually costs you rather than saves you. Likewise, staying aligned on a 64-bit boundary when you have a 64-bit-wide bus also saves you a lot. The compiler chose r4 here, but r4 is not being preserved for its value; it is simply some register the compiler picked to keep the stack aligned. As you can see in other Stack Overflow questions related to this, the compiler sometimes uses r3 or another register; in this case it chose r4.

But set stack alignment and convention aside (I could dig up an older compiler whose output pushes only lr, with no r4). This code did not need to preserve the input parameter in order to do math after the nested function call: once execution goes into more_fun(), the variable a can be discarded.

As an assembly language programmer, you probably want to strive to use registers a lot. I guess it depends on the instruction set and your habits: on a CISC like x86, where many instructions can take memory operands directly, you might develop a habit of doing that despite the performance cost. But if you strive to use registers as much as you can, you will eventually fall off the cliff: all the registers are in use and you need one more. So you do what the book is telling you to do:

push {r0}
ldr r0,[r2]
ldr r1,[r0]
pop {r0}

or something like that: you ran out of registers and needed to do a double indirection. Or maybe you need an intermediate variable and simply have no register left to spare, so you temporarily use the stack:

push {r0}
add r0,r1,r2
str r0,[r3]
pop {r0}

With compiled languages, stack use versus some alternative starts with the processor design. Is the instruction set starved of general-purpose registers? Does the instruction set use the stack by design for call and return instructions and for interrupts and interrupt returns, or does it use a register and let you choose whether to save it on the stack? Basically, does the instruction set force you into stack usage, or is it an option?

Next come programming habits, whether taught or developed on your own, which can result in heavier or lighter stack use. With too many functions and too much nesting, the return addresses alone take a few bytes of stack on each call. Add heavy use of local variables and stack use can grow a little or explode, depending on function size, the number and size of the variables, and the code in the functions. If you do not use the optimizer you get massive stack consumption, but you do not get the fall-off-the-cliff effect, where adding one more line to a function takes it from little or no stack use to a lot because that line pushed register usage over the edge. Unoptimized, the stack consumption is heavy but more linear. Using registers is the best way to reduce memory consumption, but it takes lots of practice at coding and examining the compiler output, and hoping that the next compiler works the same-ish way; they often do, but sometimes they do not. Still, you can write your code to be more conservative with memory and still get the task done. (Using smaller variables, such as a char instead of an int, DOES NOT necessarily save you anything: on instruction sets with 16-, 32-, or 64-bit registers it sometimes costs extra instructions to sign extend or mask off the rest of the register. It depends on the instruction set and your code.)

And then there are globals, which for some reason are frowned upon. Hard to read? That is silly. They have pros and cons. The pro is that your consumption is far more controlled. The con is that, yes, if you use a lot of variables and do not reuse them you will consume a lot more, and they are there for the life of the program; they do not free up like non-static locals. Static locals are just globals with limited scope. Only use them when you want a global but are afraid of being shunned for it, or when you have a very specific reason, of which there is a short list, mostly related to recursion.
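The remark above about a char not necessarily being cheaper than an int can be seen in a pair of tiny functions. This is a sketch under assumptions: the names add_u8 and add_u32 are invented, unsigned char is taken to be 8 bits, and unsigned int is taken to be the register width. On a 32-bit register machine the 8-bit version often needs one extra instruction to mask or zero-extend the result back down to 8 bits, while the 32-bit version is usually just the add.

```c
/* Hypothetical example functions; assumes an 8-bit unsigned char. */

/* a and b are promoted to int, added, then the result must be
 * truncated to 8 bits, which often costs a mask or extend instruction. */
unsigned char add_u8(unsigned char a, unsigned char b)
{
    return (unsigned char)(a + b);
}

/* When unsigned int matches the register width, no truncation
 * step is needed after the add. */
unsigned int add_u32(unsigned int a, unsigned int b)
{
    return a + b;
}
```

With these definitions, add_u8(250, 10) wraps to 4 while add_u32(250, 10) gives 260; the wrap is exactly the extra work the narrower type asks the processor to do.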

How is the heap slow? RAM is generally RAM: whether your variable is on the stack or on the heap, it takes the same loads and stores to get at it. The cache hits and misses either way; you can try to influence that, but accesses will still sometimes hit and sometimes miss. Some processors have special on-chip RAM for the stack, sure, but those are not the kinds of general-purpose processors we see today, and those stacks are generally pretty small. In some embedded or bare-metal designs you may put the stack in different RAM from .data or the heap because you want it in the fastest memory. But take a program on the machine you are reading this on: the program, the stack, and .data/heap are likely in the same slow DRAM, with some caching trying to make it faster, though not always succeeding.

The "heap", which is a compiler/operating-system construct for using memory anyway, has the cost of allocating and freeing, but once memory is allocated its performance is the same as .text, .data, and the stack on many of the target platforms we use. Using the stack, you are basically doing a malloc and free with less overhead than making a system call. But you can still use the heap efficiently, just as the compiler used the stack above, where one instruction pushed or popped two things and saved several to dozens to hundreds of clock cycles: you can malloc and free larger things less often. And folks do that when it does not make sense to use the stack because of the size of the struct, array, or array of structs.
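A minimal sketch of "malloc and free larger things less often": one allocation sized for the whole working set instead of one allocation per element. The function name sum_squares_one_alloc is made up, and returning 0 on allocation failure is a simplification chosen to keep the example short, not a recommended error-handling scheme.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical example: fills a table of squares 0*0 .. (n-1)*(n-1)
 * and returns their sum, using a single malloc/free pair.
 * Simplification: returns 0 if n is 0 or if the allocation fails. */
unsigned long sum_squares_one_alloc(size_t n)
{
    unsigned long *squares;
    unsigned long total = 0;
    size_t i;

    if (n == 0)
        return 0;

    /* One allocation for all n elements instead of n small ones. */
    squares = malloc(n * sizeof *squares);
    if (squares == NULL)
        return 0;

    for (i = 0; i < n; i++)
        squares[i] = (unsigned long)i * (unsigned long)i;

    /* After allocation, these are ordinary loads and stores,
     * the same kind used for stack or .data variables. */
    for (i = 0; i < n; i++)
        total += squares[i];

    free(squares);
    return total;
}
```

The allocator's cost is paid once per call rather than once per element, which is the same idea as pushing two registers with one instruction.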
