用Java编写的编译器:Peephole优化器实现 [英] Compiler written in Java: Peephole optimizer implementation

查看:213
本文介绍了用Java编写的编译器:Peephole优化器实现的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在为Pascal的一个子集编写一个编译器。编译器为组装的机器生成机器指令。



Peephole优化器规范



我研究了几种不同的编写窥视孔优化器的方法,我已经采用了后端方法:




  • Encoder在每次生成机器指令时调用 emit()函数。

  • emit(Instruction currentInstr)检查窥视孔优化表:


    • 如果当前指令匹配


      1. 检查先前发出的匹配指令

      2. 如果所有指令都与模式匹配,则应用优化,修改代码存储的尾端


    • 如果未找到优化,则照常发布指令




目前的设计方法



足够了,这是我遇到麻烦的实现。在我的编译器中,机器指令存储在指令类中。我写了一个 InstructionMatch 类存储正则表达式,以匹配机器指令的每个组件。如果模式匹配一​​些机器指令,它的 equals(Instruction instr)方法返回 true instr



但是,我无法完全地应用规则。首先,我觉得,根据我目前的方法,我最终会得到一些乱七八糟的对象。考虑到窥视孔优化数字的完整列表可以编号大约400模式,这将快速失控。



替代方法



我读过的一篇论文将以前的指令折叠成一个长字符串,使用正则表达式进行匹配和替换,并将字符串转换回机器指令。



示例模式,模式语法



  x:JUMP x + 1; x + 1:JUMP y  - > x:JUMP y 
LOADL x; LOADL y; add - > LOADL x + y
LOADA d [r]; STOREI(n) - > STORE(n)d [r]

注意,这些示例模式只是一个人类可读表示以下机器指令模板:

  op_code register nd 

(n通常表示字数,d表示地址位移)。语法 x:< instr> 表示指令存储在代码存储器中的地址 x 因此,指令 LOADL 17 相当于整个机器指令 5 0 0 17



r code>未在此指令中使用)



我的问题



问题是这是:当我需要包括以前的指令的一部分作为变量在我的替换时,如何有效地匹配和替换模式?例如,我可以简单地替换 LOADL 1的所有实例;添加与增量机器指令 - 我不需要任何部分的前面的指示这样做。但我失去了如何有效地使用替换模式中我的第二个例子的'x'和'y'值。



/ strong>:我应该提到一个指令类的每个字段只是一个整数(正常的机器指令)。

解决方案

在模式表中使用'x'或'y'是任何整数值的变量。一个简单的方法是将你的窥视孔优化器实现为一个有限状态机。



我们假设你有一个生成 指令,但不会发出 它们,以及发送实际代码到对象流的 emit 例程。



状态机捕获代码生成器生成的指令,并通过在状态之间转换来记住0或更多生成的指令的序列。因此,状态隐含地记住所生成但未发出的指令的(短)序列;它还必须记住它捕获的指令的关键参数,诸如寄存器名称,常数值和/或寻址模式和抽象目标存储器位置。特殊的开始状态会记住空字符串的指令。在任何时候,你需要能够发出未发出的指令(flush);如果你这样做,所有的时间,你的窥孔生成器捕获下一条指令,然后发出它,没有有用的工作。



要做有用的工作,我们希望机器尽可能多地捕获序列。由于通常有很多种机器指令,实际上你不能记得太多的行或状态机会变得巨大。但是记住最后两个或三个最常见的机器指令(加载,添加,cmp,分支,存储)是实用的。机器的大小实际上由我们所关心的最长窥视孔优化的长度决定,但如果该长度是P,整个机器不需要是P状态深。



每个状态都会根据代码生成器生成的next指令转换到下一个状态。想象一个状态表示N个指令的捕获。
转换选项是:




  • 刷新此状态表示的最左边的0个或更多(称为此k)指令,

  • 刷新该状态表示的最左边的k个指令,转换到状态
    表示剩余的Nk个指令,并重新处理指令I.

  • 完全刷新状态,并发出指令I。



当刷新k条指令时,实际上发射是那些k的窥视孔优化版本。你可以计算任何你想要发出的这样的指令。您还需要记住适当地移动其余指令的参数。



这是很容易实现与窥探优化器状态变量,并在每个代码生成器生成其下一条指令的点。



假设我们的机器是一个增强的堆栈机器,有

  PUSHVAR x 
PUSHK i
ADD
POPVAR x
MOVE x,k

指令,但原始代码生成器仅生成纯堆栈机器指令,例如,它不发出MOV指令。



我们关心的窥视孔是:

  PUSHK i,PUSHK j,ADD ==> PUSHK i + j 
PUSHK i,POPVAR x ==> MOVE x,i

我们的状态变量是:

  PEEPHOLESTATE(枚举符号,初始化为EMPTY)
FIRSTCONSTANT(an int)
SECONDCONSTANT(an int)

我们的案例陈述:

  GeneratePUSHK:
开关(PEEPHOLESTATE){
EMPTY:PEEPHOLESTATE = PUSHK;
FIRSTCONSTANT = K;
break;
PUSHK:PEEPHOLESTATE = PUSHKPUSHK;
SECONDCONSTANT = K;
break;
PUSHKPUSHK:
#IF consumeEmitLoadK // flush state,transition and consume generated instruction
emit(PUSHK,FIRSTCONSTANT);
FIRSTCONSTANT = SECONDCONSTANT;
SECONDCONSTANT = K;
PEEPHOLESTATE = PUSHKPUSHK;
break;
#ELSE //刷新状态,转换和重新处理生成的指令
emit(PUSHK,FIRSTCONSTANT);
FIRSTCONSTANT = SECONDCONSTANT;
PEEPHOLESTATE = PUSHK;
goto GeneratePUSHK; // Java不能这样做,但是其他langauges可以。
#ENDIF
}

GenerateADD:
开关(PEEPHOLESTATE){
EMPTY:emit(ADD);
break;
PUSHK:emit(PUSHK,FIRSTCONSTANT);
emit(ADD);
PEEPHOLESTATE = EMPTY;
break;
PUSHKPUSHK:
PEEPHOLESTATE = PUSHK;
FIRSTCONSTANT + = SECONDCONSTANT;
break:
}

GeneratePOPX:
switch(PEEPHOLESTATE){
EMPTY:emit(POP,X);
break;
PUSHK:emit(MOV,X,FIRSTCONSTANT);
PEEPHOLESTATE = EMPTY;
break;
PUSHKPUSHK:
emit(MOV,X,SECONDCONSTANT);
PEEPHOLESTATE = PUSHK;
break:
}

GeneratePUSHVARX:
开关(PEEPHOLESTATE){
EMPTY:emit(PUSHVAR,X);
break;
PUSHK:emit(PUSHK,FIRSTCONSTANT);
PEEPHOLESTATE = EMPTY;
goto GeneratePUSHVARX;
PUSHKPUSHK:
PEEPHOLESTATE = PUSHK;
emit(PUSHK,FIRSTCONSTANT);
FIRSTCONSTANT = SECONDCONSTANT;
goto GeneratePUSHVARX;
}

#IF显示两种不同的转换样式,一种消耗生成的
指令,一个不指令;或者适用于这个例子。
当你最终有几百个这样的case语句,
你会发现这两种类型方便,与不消费版本帮助
你保持你的代码更小。 / p>

我们需要一个例程来刷新窥视孔优化器:

  flush ){
switch(PEEPHOLESTATE){
EMPTY:break;
PUSHK:emit(PUSHK,FIRSTCONSTANT);
break;
PUSHKPUSHK:
emit(PUSHK,FIRSTCONSTANT),
emit(PUSHK,SECONDCONSTANT),
break:
}
PEEPHOLESTATE = EMPTY;
return; }

有趣的是考虑这个窥视孔优化器对下面的 >代码:

  PUSHK 1 
PUSHK 2
ADD
PUSHK 5
POPVAR X
POPVAR Y

这整个FSA方案将隐藏您的模式匹配状态转换,以及对情况中的匹配模式的响应。你可以手工编码,它是快速,相对容易的代码和调试。但是当case的数量变大时,你不想手动构建这样的状态机。你可以编写一个工具为你生成这个状态机;好的背景这将是FLEX或LALR解析器状态机生成。我不打算在这里解释: - }


I'm writing a compiler for a subset of Pascal. The compiler produces machine instructions for a made-up machine. I want to write a peephole optimizer for this machine language, but I'm having trouble substituting some of the more complicated patterns.

Peephole optimizer specification

I've researched several different approaches to writing a peephole optimizer, and I've settled on a back-end approach:

  • The Encoder makes a call to an emit() function every time a machine instruction is to be generated.
  • emit(Instruction currentInstr) checks a table of peephole optimizations:
    • If the current instruction matches the tail of a pattern:

      1. Check previously emitted instructions for matching
      2. If all instructions matched the pattern, apply the optimization, modifying the tail end of the code store

    • If no optimization was found, emit the instruction as usual

Current design approach

The method is easy enough, it's the implementation I'm having trouble with. In my compiler, machine instructions are stored in an Instruction class. I wrote an InstructionMatch class stores regular expressions meant to match each component of a machine instruction. Its equals(Instruction instr) method returns true if the patterns match some machine instruction instr.

However, I can't manage to fully apply the rules I have. First off, I feel that given my current approach, I'll end up with a mess of needless objects. Given that a complete list of peephole optimizations numbers can number around 400 patterns, this will get out of hand fast. Furthermore, I can't actually get more difficult substitutions working with this approach (see "My question").

Alternate approaches

One paper I've read folds previous instructions into one long string, using regular expressions to match and substitute, and converting the string back to machine instructions. This seemed like a bad approach to me, please correct me if I'm wrong.

Example patterns, pattern syntax

x: JUMP x+1; x+1: JUMP y  -->  x: JUMP y
LOADL x; LOADL y; add     -->  LOADL x+y
LOADA d[r]; STOREI (n)    -->  STORE (n) d[r]

Note that each of these example patterns is just a human-readable representation of the following machine instruction template:

op_code register n d

(n usually indicates the number of words, and d an address displacement). The syntax x: <instr> indicates that the instruction is stored at address x in the code store.

So, the instruction LOADL 17 is equivalent to the full machine instruction 5 0 0 17 when the LOADL opcode is 5 (n and r are unused in this instruction)

My question

So, given that background, my question is this: How do I effectively match and replace patterns when I need to include parts of previous instructions as variables in my replacement? For example, I can simply replace all instances of LOADL 1; add with the increment machine instruction - I don't need any part of the previous instructions to do this. But I'm at a loss of how to effectively use the 'x' and 'y' values of my second example in the substitution pattern.

edit: I should mention that each field of an Instruction class is just an integer (as is normal for machine instructions). Any use of 'x' or 'y' in the pattern table is a variable to stand in for any integer value.

解决方案

An easy way to do this is to implement your peephole optimizer as a finite state machine.

We assume you have a raw code generator that generates instructions but does not emit them, and an emit routine that sends actual code to the object stream.

The state machine captures instructions that your code generator produces, and remembers sequences of 0 or more generated instructions by transitioning between states. A state thus implicitly remembers a (short) sequence of generated but un-emitted instructions; it also has to remember the key parameters of the instructions it has captured, such as a register name, a constant value, and/or addressing modes and abstract target memory locations. A special start state remembers the empty string of instructions. At any moment, you need to be able to emit the unemitted instructions ("flush"); if you do this all the time, your peephole generator captures the next instruction and then emits it, doing no useful work.

To do useful work, we want the machine to capture as long a sequence as possible. Since there are typically many kinds of machine instructions, as practical matter you can't remember too many in a row or the state machine will become enormous. But it is practical to remember the last two or three for the most common machine instructions (load, add, cmp, branch, store). The size of the machine will really be determined by lenght of the longest peephole optimization we care to do, but if that length is P, the entire machine need not be P states deep.

Each state has transitions to a next state based on the "next" instruction I produced by your code generator. Imagine a state represents the capture of N instructions. The transition choices are:

  • flush the leftmost 0 or more (call this k) instructions that this state represents, and transition to a next state, representing N-k+1, instructions that represents the additional capture of machine instruction I.
  • flush the leftmost k instructions this state represents, transition to the state that represents the remaining N-k instructions, and reprocess instruction I.
  • flush the state completely, and emit instruction I, too. [You can actually do this on the just the start state].

When flushing the k instructions, what actually gets emitted is the peephole optimized version of those k. You can compute anything you want in emitting such instructions. You also need to remember "shift" the parameters for the remaining instructions appropriately.

This is all pretty easily implemented with a peephole optimizer state variable, and a case statement at each point where your code generator produces its next instruction. The case statement updates the peephole optimizer state and implements the transition operations.

Assume our machine is an augmented stack machine, has

 PUSHVAR x
 PUSHK i
 ADD
 POPVAR x
 MOVE x,k

instructions, but the raw code generator generates only pure stack machine instructions, e.g., it does not emit the MOV instruction at all. We want the peephole optimizer to do this.

The peephole cases we care about are:

 PUSHK i, PUSHK j, ADD ==> PUSHK i+j
 PUSHK i, POPVAR x ==> MOVE x,i 

Our state variables are:

 PEEPHOLESTATE (an enum symbol, initialized to EMPTY)
 FIRSTCONSTANT (an int)
 SECONDCONSTANT (an int)

Our case statements:

GeneratePUSHK:
    switch (PEEPHOLESTATE) {
        EMPTY: PEEPHOLESTATE=PUSHK;
               FIRSTCONSTANT=K;
               break;
        PUSHK: PEEPHOLESTATE=PUSHKPUSHK;
               SECONDCONSTANT=K;
               break;
        PUSHKPUSHK:
        #IF consumeEmitLoadK // flush state, transition and consume generated instruction
               emit(PUSHK,FIRSTCONSTANT);
               FIRSTCONSTANT=SECONDCONSTANT;
               SECONDCONSTANT=K;
               PEEPHOLESTATE=PUSHKPUSHK;
               break;
        #ELSE // flush state, transition, and reprocess generated instruction
               emit(PUSHK,FIRSTCONSTANT);
               FIRSTCONSTANT=SECONDCONSTANT;
               PEEPHOLESTATE=PUSHK;
               goto GeneratePUSHK;  // Java can't do this, but other langauges can.
        #ENDIF
     }

  GenerateADD:
    switch (PEEPHOLESTATE) {
        EMPTY: emit(ADD);
               break;
        PUSHK: emit(PUSHK,FIRSTCONSTANT);
               emit(ADD);
               PEEPHOLESTATE=EMPTY;
               break;
        PUSHKPUSHK:
               PEEPHOLESTATE=PUSHK;
               FIRSTCONSTANT+=SECONDCONSTANT;
               break:
     }  

  GeneratePOPX:
    switch (PEEPHOLESTATE) {
        EMPTY: emit(POP,X);
               break;
        PUSHK: emit(MOV,X,FIRSTCONSTANT);
               PEEPHOLESTATE=EMPTY;
               break;
        PUSHKPUSHK:
               emit(MOV,X,SECONDCONSTANT);
               PEEPHOLESTATE=PUSHK;
               break:
     }

GeneratePUSHVARX:
    switch (PEEPHOLESTATE) {
        EMPTY: emit(PUSHVAR,X);
               break;
        PUSHK: emit(PUSHK,FIRSTCONSTANT);
               PEEPHOLESTATE=EMPTY;
               goto GeneratePUSHVARX;
        PUSHKPUSHK:
               PEEPHOLESTATE=PUSHK;
               emit(PUSHK,FIRSTCONSTANT);
               FIRSTCONSTANT=SECONDCONSTANT;
               goto GeneratePUSHVARX;
     }

The #IF shows two different styles of transitions, one that consumes the generated instruction, and one that does not; either works for this example. When you end up with a few hundred of these case statements, you'll find both types handy, with the "don't consume" version helping you keep your code smaller.

We need a routine to flush the peephole optimizer:

  flush() {
    switch (PEEPHOLESTATE) {
        EMPTY: break;
        PUSHK: emit(PUSHK,FIRSTCONSTANT);
               break;
        PUSHKPUSHK:
               emit(PUSHK,FIRSTCONSTANT),
               emit(PUSHK,SECONDCONSTANT),
               break:
      }
      PEEPHOLESTATE=EMPTY;
      return; }

It is interesting to consider what this peephole optimizer does with the following generated code:

      PUSHK  1
      PUSHK  2
      ADD
      PUSHK  5
      POPVAR X
      POPVAR Y

What this whole FSA scheme does is hide your pattern matching in the state transitions, and the response to matched patterns in the cases. You can code this by hand, and it is fast and relatively easy to code and debug. But when the number of cases gets large, you don't want to build such a state machine by hand. You can write a tool to generate this state machine for you; good background for this would be FLEX or LALR parser state machine generation. I'm not going to explain this here :-}

这篇关于用Java编写的编译器:Peephole优化器实现的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆