在字节码编译器中计算跳转地址的智能解决方案? [英] Intelligent solution to computing jump addresses in a bytecode compiler?

查看:139
本文介绍了在字节码编译器中计算跳转地址的智能解决方案?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

假设我正在实现一个字节码编译器,类似于Lua / Python的...等等。

Let's say I'm implementing a bytecode compiler, similar to Lua's/Python's... so on.

我遍历AST,生成字节码指令,我在里面的 if-else 里面碰到 break code> loop:

I'm traversing the AST, generating bytecode instructions, and I run into a break inside of an if-else inside of a while loop:

while (cond1)
    if (cond2)
        ...
    else
        break

(我试着写出等效的字节码,看起来太有用了。)

(I tried writing out equivalent bytecode but it didn't look too helpful.)

关键是,在这个例子中至少有4个跳转指令,而I
想找到一个优雅的解决方案在编译AST时填充跳转地址...
我不知道的跳转地址,而循环或

The point is, there are at least 4 "jump" instructions in that example, and I want to find an elegant solution to filling in the jump addresses as I compile the AST... I don't know the jump address for the while loop or the break until after I've fully "compiled" the inner statements.


  1. 一个伪代码

  2. 一个解决方案不应该取决于我是否实现了一个基于寄存器或基于堆栈的字节码编译器(我在玩两者)

我还没读过这本龙书。

如果我递归编译AST,当我在某些任意数目的循环和 if-else 块中达到 break 语句时,编译器知道要跳到哪个空标签?我猜在某种类型的标签名栈外部的递归AST走功能。

If I'm recursively compiling the AST, when I reach a break statement inside of some arbitrary number of loops and if-else blocks, how should the compiler know which empty label to jump to? I'm guessing some type of label-name stack external to the recursive AST walking function.

推荐答案

backpatching:为正向跳转填充一个虚拟值,为语句的主体生成代码,然后返回并使用结尾处的真实值替换虚拟值。

The principle you need is called "backpatching": fill in a dummy value for the forward jump, generate the code for the body of the statement, and then go back and replace the dummy value with the real one at the end.

例如

# start of codegen for 'while'
L1:
  [evaluate cond1]
  jne L2   # forward reference: use a dummy value for now

# start of codegen for 'if ... else'
L3:
  [evaluate cond2]
  jne L4   # another forward reference: use a dummy value for now
  [code in 'if' branch]
  j L5     # another forward reference: use a dummy value for now
L4:
  [code in 'else' branch before 'break']
  j L2
  [code in 'else' branch after 'break']
L5:   # end of 'if ... else'
  # now go back and fill in references to L4, L5 inside the 'if ... else' block
  # end of codegen for 'if ... else'

  # codegen for 'while' continues...
  j L1   # loop
L2:   # end of 'while' loop
  # now go back and fill in references to L2 inside the 'while' block
  # end of codegen for 'while'






关于您的编辑:


Regarding your edit:


如果我递归编译AST, c $ c> break 语句里面的某些任意数量的循环和 if-else 块,编译器应该如何知道要跳转到哪个空标签?我猜一些类型的标签名栈外部的递归AST走功能。

If I'm recursively compiling the AST, when I reach a break statement inside of some arbitrary number of loops and if-else blocks, how should the compiler know which empty label to jump to? I'm guessing some type of label-name stack external to the recursive AST walking function.

实现 break 语句的跳转目标是最内圈的循环;

The target of the jump which implements the break statement is the end of the innermost enclosing loop; yes, you could track that with an explicit external stack.

但是,如果你有一个递归AST行走函数,你已经有一个隐式 stack - 调用的递归函数调用的框架 - 所以你可能不需要一个显式的。

But, if you have a recursive AST walking function, you already have an implicit stack - the call frames of the recursive function invocations - so you probably don't need an explicit one as well.

例如

...
while (outer)
    ...
    if (outer_done)
        break
    ...

    while (inner)
        ...
        if (inner_done)
            break
        ...
    [break from inner 'while' ends up here]

    ...
    if (outer_done_2)
        break
    ...
[break from outer 'while' ends up here]
...

整个内部 while 循环的代码生成发生在外部的正文的AST的递归遍历中,而循环。在这个递归调用中,你只需要关心内部循环,而不是外部循环。

the entirety of the code generation for the inner while loop happens within a recursive walk of the AST for the body of the outer while loop. Inside this recursive call, you only need to care about the inner loop, not the outer one.

所以,你需要做的是:


  • 在开始处理时保存任何当前的反馈状态,而

  • 开始一个新的空列表,用于跟踪出现在正文中的 break ,而

  • 处理正文(可能导致递归调用)

  • 应用背面

  • ,然后在退出之前恢复先前的状态。

  • save any current backpatching state at the start of processing a while
  • start a new empty list for tracking any break that appears within the body of this while
  • handle the body (which may result in a recursive call)
  • apply the backpatches
  • then restore the previous state before exiting.

例如有点像这样:

codegen for while:
    save previous break backpatch list
    initialise break backpatch list as empty
    perform codegen for evaluating condition
    perform codegen for body statements
    apply backpatches
    restore previous break backpatch list

当前 break backpatch列表必须是传递给所有codegen函数的一些状态的一部分 break 需要附加到它)。但是保存的列表可以作为 codegen函数的局部变量来跟踪。

The current break backpatch list must be part of some state which is passed to all the codegen functions (the codegen for break needs to append to it). But the saved list could just be tracked as a local variable of the while codegen function.

这篇关于在字节码编译器中计算跳转地址的智能解决方案?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆