寄存器分配和溢出,简单的方法? [英] Register allocation and spilling, the easy way?

查看:238
本文介绍了寄存器分配和溢出,简单的方法?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在寻找一种将局部变量分配给寄存器的方法.我知道有几种严肃的方法可以做到这一点(即在Wikipedia上提到的 ) ,但我对如何完成起毛"感到困惑.而且,相关文献令人生畏.我希望有一些更简单的方法可以满足我的要求:

I'm looking for a way to allocate local variables to registers. I'm aware of a couple of serious methods for doing it (namely, those mentioned on Wikipedia), but I'm stuck on how "spilling" is accomplished. Also, the relevant literature is quite intimidating. I'm hoping there's something simpler that will satisfy my priorities:

  1. 正确性-一种算法,无论有多少局部变量,该算法都会生成正确的代码.
  2. 简单性-无需阅读过多文献就可以理解的东西.
  3. 效率-它需要比当前方法更好,即:

将操作x = y # z转换为:

movl y, %eax
movl z, %ebx
op %ebx, %eax
movl %eax, x

在我针对Intel 386的过程中,一些相关的限制条件是:

As I'm targeting Intel 386, some relevant constraints are:

  • 二进制操作带有两个参数,其中一个是源和目标.一元运算只有一个参数.
  • 操作只能访问一个内存位置;因此,二进制操作在寄存器中至少需要一个参数.
  • 最多有六个寄存器可用:%eax %ebx %ecx %edx %esi %edi. (%ebp也可以作为最后的选择.)
  • 在某些特殊情况下,例如整数除法和返回寄存器,但现在我可以忽略它们.
  • Binary operations take two arguments, one of which is a source and destination. Unary operations take a single argument.
  • Operations can only access one memory location; binary operations therefore need at least one argument in a register.
  • There is a maximum of six registers available: %eax %ebx %ecx %edx %esi %edi. (%ebp could also be included as a last resort.)
  • There are special cases such as for integer division and return registers, but I can ignore them for now.

编译器目前需要完成三个步骤:

There are three steps the compiler gets through at the moment:

  • i386验证:所有操作都转换为a = a # b格式(对于一元操作,则转换为a = #a).
  • 活动性分析:确定每次操作之前和之后的活动变量集.
  • 寄存器分配:建立干涉图并涂上颜色.
  • i386ification: all operations are converted to a form a = a # b (or a = #a for unary operations).
  • Liveness analysis: the sets of live variables before and after each operation are determined.
  • Register allocation: an interference graph is built and coloured.

然后编译器将蜡笔扔向空中,不知道下一步该怎么做.

And then the compiler throws its crayons in the air and doesn't know what to do next.

public int mf(int cr, int ci) {
    int i = 0;
    int zr = 0;
    int zi = 0;

    while (i < 100 && zr*zr + zi*zi < 4) {
        int t = zr * zr - zi * zi + cr;
        zi = 2 * zr * zi + ci;
        zr = t;

        i = i + 1;
    }
    return i;
}

这是该函数的相当漂亮的干扰图,以及带有活动信息的CFG.不幸的是,CFG图像确实需要一些垂直滚动.

Here's the rather pretty interference graph for the function, and the CFG with liveness information. The CFG image does require some vertical scrolling, unfortunately.

  • Interference graph for a function on 14 variables
  • Control-flow graph for a function, with liveness information

使用了七种颜色.我想洒一个(或分配给该颜色的变量集).选择哪种方法不太重要.棘手的是如何处理溢出的变量.

Seven colours were used. I would like to spill one of them (or the set of variables assigned that colour). The method of choosing which isn't too important. What gets tricky is how to deal with the spilt variables.

比方说,我溢出了粉红色",这是变量t$t4$t7的集合.这意味着引用这些变量之一的那些操作将从其在堆栈帧上的位置而不是通过寄存器对其进行访问.此示例适用于此示例.

Let's say I spill "pink", which is the set of variables t, $t4, $t7. This means that those operations referring to one of these variables will access it from its position on the stack frame, rather than through a register. This should work for this example.

但是如果程序为:

...
a = a + b
...

以及ab都必须溢出?我无法发出带有两个内存地址的指令addl b, a.我需要另一个备用寄存器来临时保存其中一个操作数,这意味着会洒出另一种颜色.这表明可以采用以下一般方法:

and both a and b had to be spilled? I can't emit an instruction addl b, a with two memory addresses. I would need another spare register to temporarily hold one of the operands, and that means spilling another colour. This suggests a general method of:

  1. 如果所有变量都可以用r颜色进行着色,那就太好了!
  2. 否则,洒一些颜色及其关联的变量.
  3. 如果存在访问两个溢出变量的操作,则溢出另一种颜色,并将备用寄存器用于所有此类操作的临时存储.
  1. If all variables can be coloured with r colours, great!
  2. Otherwise, spill some colours and their associated variables.
  3. If an operation exists that accesses two spilled variables, spill another colour and use the spare register for temporary storage for all such operations.

在这一点上,我会怀疑溢出的东西多于不必要的东西,并且想知道是否存在一些更聪明的溢出方法,例如,溢出变量生命周期的一部分,而不是整个变量本身.我可以在这里使用一些简单的技术吗?再说一次,我的目标不是特别高-当然还不够高,无法要求阅读任何内容. ;-)

At this point I would suspect that a lot more stuff is being spilled than necessary, and wonder if there is some smarter way to spill things, such as spilling part of a variable's lifetime, rather than the whole variable itself. Are there some simple(ish) techniques that I could use here? Again, I'm not aiming particularly high -- certainly not high enough to require reading anything too deep. ;-)

主要的特定问题是:当变量溢出时,这如何影响生成的指令?使用该变量的所有指令是否都需要直接在内存中访问(从其堆栈位置开始)?如果一个操作使用两个溢出的变量,这将如何工作? (该体系结构不允许指令访问两个不同的内存位置.)

The main specific problem is: when a variable is spilled, how does this affect the instructions generated? Do all instructions using that variable need to access it directly in memory (from its stack position) ? How will this work if an operation uses two spilled variables? (The architecture does not permit instructions to access two distinct memory locations.)

次要问题是:

  • 我如何确定加载/存储指令的插入位置,以确保正确性(而不是次要的效率)?
  • 是否可以在不立即使用变量的情况下仅在变量的整个生命周期内对其进行溢出,并在以后对其进行溢出?这样,所有指令都会作用于未缓存的寄存器.变量可能在不同的时间存在于不同的寄存器中.
  • 在特殊情况下,我可以提高效率吗?例如,%eax用作返回值,因此如果要返回的变量在遇到返回时恰好被分配给了该寄存器,那就太好了.同样,有些寄存器是被调用者保存"的,因此,如果在函数调用时有较少的变量处于活动状态,则将它们分配给非被调用者保存的寄存器将意味着我可以避免存储这些寄存器.
  • SSA表格对您有什么帮助(如果有的话)?能够消除常见的子表达式并评估常数可能会降低寄存器压力,但否则会产生任何效果吗?
  • How do I determine where to insert load/store instructions, for correctness (and less importantly, efficiency) ?
  • Can I spill a variable for only that part of its lifetime when it is not in immediate use, and unspill it later? So that all instructions act on unspilled registers. A variable might live in different registers at different times.
  • Can I be a little more efficient with special cases. For example, %eax is used for the return value, so it would be nice if the variable to be returned happened to be allocated to that register by the time the return was encountered. Similarly, some registers are "callee-save", so if fewer variables happened to be live at the time of a function call, having them allocated to non-callee-save registers would mean I can avoid storing those registers.
  • Would SSA form help much (if at all) ? Being able to eliminate common subexpressions and evaluate constants might reduce(?) register pressure, but otherwise would it have any effect?

我现在不关心的方面是:

The aspects I'm not concerned about (right now) are:

  • 堆栈分配和优化:它已经很幼稚地实现,并且可以根据需要使用干涉图进行优化.
  • 编译时效率,只要终止即可. (NP完整性并不意味着应避免使用给定的算法.)

很抱歉停机时间-我一直在思考给出的答案,并试图找到一种简单的方法来开始实施某些想法.老实说,我一直在拖延...:-\

Sorry about the downtime -- I've been thinking about the answers given and trying to find an easy approach to take to start implementing some of the ideas. To be honest, I've been procrastinating... :-\

我发现了很好的演示文稿(可悲的是,PPT):

I found the very nice presentation (PPT, sadly):

http://www.cs. princeton.edu/courses/archive/spr05/cos320/notes/Register%20Allocation.ppt

哪个回答有关如何处理特定操作需求的问题(例如,将相同的寄存器用于源和目标;或者需要特定的寄存器来进行某些操作).我不确定的是,活泼-消遣-分配"周期是否终止.

Which answers the question about how to deal with specific operation needs (like using the same register for source and destination; or needing a certain register for some operations). What I'm not sure about is whether the Liveness-Colouring-Allocation cycle terminates.

我会尽快做一些实际的工作,希望可以解决这个问题.

I'll try to do some actual work soon and hopefully close the question.

推荐答案

我曾经在JVM分配器中使用过贪婪方法,效果很好.基本上从基本块的顶部开始,所有值都存储在堆栈中.然后只需向前扫描指令,维护包含值的寄存器列表,以及该值是否为脏(需要回写).如果指令使用不在寄存器中(或不在正确寄存器中)的值,则在指令之前发出负载(或移动)以将其放入空闲寄存器中.如果一条指令写了一个值,请确保该值在寄存器中,并在指令后将其标记为脏.

I've used a greedy approach in a JVM allocator once, which worked pretty well. Basically start at the top of a basic block with all values stored on the stack. Then just scan the instructions forward, maintaining a list of registers which contain a value, and whether the value is dirty (needs to be written back). If an instruction uses a value which is not in a register (or not in the correct register), issue a load (or move) to put it in a free register before the instruction. If an instruction writes a value, ensure it is in a register and mark it dirty after the instruction.

如果您需要寄存器,请通过从中分配值来释放已使用的寄存器,如果脏的和有生命的,则将其写入堆栈.在基本块的末尾,写回所有脏的和活动的寄存器.

If you ever need a register, spill a used register by deallocating the value from it, and writing it to the stack if it is dirty and live. At the end of the basic block, write back any dirty and live registers.

此方案可以准确地弄清所有装载/存储的位置,并随即生成它们.它很容易适应于需要在内存中使用值或可以在内存中使用两个参数之一但不能同时使用两个参数的指令.

This scheme makes it clear exactly where all the loads/stores go, you generate them as you go. It is easily adaptable to instructions which take a value in memory, or which can take either of two arguments in memory, but not both.

如果可以在每个基本块边界将所有数据都存储在堆栈中,则此方案效果很好.它应该提供与基本块内的线性扫描相似的结果,因为它基本上完成了非常相似的事情.

If you're OK with having all data on the stack at every basic block boundary, this scheme works pretty well. It should give results similar to linear scan within a basic block, as it basically does very similar things.

对于如何确定要溢出的值和要分配的寄存器,您可能会变得很复杂.前瞻可能会很有用,例如,通过在基本块中的某个点用特定寄存器标记每个值(例如,将eax用作返回值,或将ecx用作移位量),并在值取整时优先使用该寄存器.首先分配(并避免对该寄存器进行其他分配).但是很容易从改进启发式算法中分离出算法的正确性.

You can get arbitrarily complicated about how to decide which values to spill and which registers to allocate. Some lookahead can be useful, for example by marking each value with a specific register it needs to be in at some point in the basic block (e.g. eax for a return value, or ecx for a shift amount) and preferring that register when the value is first allocated (and avoiding that register for other allocations). But it is easy to separate out the correctness of the algorithm from the improvement heuristics.

我已经在SSA编译器YMMV中使用了此分配器.

I've used this allocator in an SSA compiler, YMMV.

这篇关于寄存器分配和溢出,简单的方法?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆