Transcompiling to another language

Question

What are the typical ways in which code can be transcompiled? I'm currently writing a simple programming language, and its code generation is recursive. A list of nodes is looped through and, if the current node is, say, a variable node, an emit_variable_node function is called, which literally appends some code to a string. For example:

The following code is pseudo-ish; I'm writing my project in C and compiling to C.

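#include <stdio.h>

typedef struct { const char *type, *name; } VariableNode; /* assumed layout */

static char file_contents[4096]; /* generated C source; fixed size for brevity */
static size_t file_length;       /* number of characters emitted so far */

/* Append "<type> <name>" for a variable node to the generated source. */
void emit_variable_node(const VariableNode *var) {
    size_t room = sizeof file_contents - file_length;
    int n = snprintf(file_contents + file_length, room, "%s %s", var->type, var->name);
    if (n > 0 && (size_t)n < room) /* advance only if the text fit */
        file_length += (size_t)n;
    /* etc */
}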

I'm also assuming that the code we're given has been semantically analyzed and is correct. The file_contents string is then written to a temporary file, which is deleted after it has been compiled by a C compiler.

Is that bad practice, or are there better, cleaner ways to do this?

Recommended answer

You can write a parser by any means you like and generate code as it parses, with no AST nodes necessary ("syntax-directed translation"). That will generally produce pretty awful code, because the code generator has no opportunity to take context into account to generate better code.
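As a rough illustration of generating code while parsing, here is a minimal sketch of a recursive-descent translator that turns infix arithmetic into postfix text with no AST in between. The grammar, the postfix target, and the names translate, parse_expr, parse_term, parse_factor and emit are all assumptions made for this example, not anything from the question.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical one-pass translator: infix arithmetic in, postfix text out. */
static const char *src;   /* cursor into the input */
static char out[256];     /* emitted code accumulates here */

/* Append s to the output, silently truncating if the buffer is full. */
static void emit(const char *s) {
    strncat(out, s, sizeof out - strlen(out) - 1);
}

static void skip_spaces(void) {
    while (*src == ' ') src++;
}

static void parse_expr(void);

/* factor := NUMBER | '(' expr ')' */
static void parse_factor(void) {
    skip_spaces();
    if (*src == '(') {
        src++;
        parse_expr();
        skip_spaces();
        if (*src == ')') src++;
    } else {
        char num[32];
        size_t n = 0;
        while (isdigit((unsigned char)*src) && n < sizeof num - 2)
            num[n++] = *src++;
        num[n++] = ' ';
        num[n] = '\0';
        emit(num);        /* code is emitted the moment the token is seen */
    }
}

/* term := factor ('*' factor)* */
static void parse_term(void) {
    parse_factor();
    for (skip_spaces(); *src == '*'; skip_spaces()) {
        src++;
        parse_factor();
        emit("* ");
    }
}

/* expr := term ('+' term)* */
static void parse_expr(void) {
    parse_term();
    for (skip_spaces(); *src == '+'; skip_spaces()) {
        src++;
        parse_term();
        emit("+ ");
    }
}

/* Translate input and return the emitted text (static storage, not reentrant). */
const char *translate(const char *input) {
    src = input;
    out[0] = '\0';
    parse_expr();
    return out;
}
```

Calling translate("1 + 2 * 3") yields "1 2 3 * + ". The emitter never sees more than the current token, so it cannot, for instance, notice that 2 * 3 is a constant and emit 6 instead.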

You can build a parser that builds abstract syntax trees (ASTs) as a first pass, and then as a second pass walks over the tree generating code without looking at any neighboring nodes. This is just the previous answer with ASTs in it. Unoptimized transpilers that work this way can produce stunningly bad output.
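A minimal sketch of such a context-free tree walk might look like the following. The Node layout and the names emit_naive and append are hypothetical, chosen only for this example. Each node defensively wraps itself in parentheses because it never checks whether they are needed.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical expression AST: a leaf holds a number, an inner node an operator. */
typedef struct Node {
    char op;                       /* '+', '*', or 0 for a number leaf */
    int value;                     /* used only when op == 0 */
    const struct Node *lhs, *rhs;  /* operands; NULL for leaves */
} Node;

/* Append s to buf (which must already hold a string), truncating if full. */
static void append(char *buf, size_t cap, const char *s) {
    strncat(buf, s, cap - strlen(buf) - 1);
}

/* Context-free emitter: every node parenthesizes itself because it never
   looks at its parent or its children to see whether that is necessary. */
void emit_naive(const Node *n, char *buf, size_t cap) {
    if (n->op == 0) {
        char num[32];
        snprintf(num, sizeof num, "(%d)", n->value);
        append(buf, cap, num);
        return;
    }
    char op[4] = { ' ', n->op, ' ', '\0' };
    append(buf, cap, "(");
    emit_naive(n->lhs, buf, cap);
    append(buf, cap, op);
    emit_naive(n->rhs, buf, cap);
    append(buf, cap, ")");
}
```

For the tree of 1 + 2 * 3 this produces "((1) + ((2) * (3)))": correct, but far noisier than a human would write.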

Better is to generate code from the AST with a local code generator for each AST node that inspects the node's neighbors to decide what to do. This will give you somewhat better code.
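One way to picture "inspecting neighbors" is an emitter that looks at each child's operator before deciding whether parentheses are needed. This is only a sketch: the Node layout, the two-operator precedence table, and the names emit_expr, emit_child, prec and append are assumptions for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical expression AST: a leaf holds a number, an inner node an operator. */
typedef struct Node {
    char op;                       /* '+', '*', or 0 for a number leaf */
    int value;                     /* used only when op == 0 */
    const struct Node *lhs, *rhs;  /* operands; NULL for leaves */
} Node;

/* Append s to buf (which must already hold a string), truncating if full. */
static void append(char *buf, size_t cap, const char *s) {
    strncat(buf, s, cap - strlen(buf) - 1);
}

/* Binding strength of the two operators this sketch knows about. */
static int prec(char op) {
    return op == '*' ? 2 : 1;
}

void emit_expr(const Node *n, char *buf, size_t cap);

/* Emit one operand, adding parentheses only when dropping them would change
   the grouping: the child binds more loosely than the parent, or it is a
   right operand with the same precedence. */
static void emit_child(const Node *parent, const Node *child, int is_rhs,
                       char *buf, size_t cap) {
    int wrap = child->op != 0 &&
               (prec(child->op) < prec(parent->op) ||
                (is_rhs && prec(child->op) == prec(parent->op)));
    if (wrap) append(buf, cap, "(");
    emit_expr(child, buf, cap);
    if (wrap) append(buf, cap, ")");
}

/* Context-aware emitter: each inner node inspects its children. */
void emit_expr(const Node *n, char *buf, size_t cap) {
    if (n->op == 0) {
        char num[32];
        snprintf(num, sizeof num, "%d", n->value);
        append(buf, cap, num);
        return;
    }
    char op[4] = { ' ', n->op, ' ', '\0' };
    emit_child(n, n->lhs, 0, buf, cap);
    append(buf, cap, op);
    emit_child(n, n->rhs, 1, buf, cap);
}
```

With this, the tree for (1 + 2) * 3 prints as "(1 + 2) * 3" and the tree for 1 + 2 * 3 prints as "1 + 2 * 3". The check is purely local, which is why the improvement is only modest.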

Better still is to follow the lead of conventional compilers: build a good front end for your language, including symbol tables and control and data flow analysis. You can then use this information to generate much better code.

Regarding actual code generation: yes, you can print text strings. String templates are a little more convenient, but they are just a fancy way to print text strings, so they don't add any power or improve the resulting code quality.
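To make the "fancy way to print text strings" point concrete, here is a small sketch of a template expander. The {0} to {9} placeholder syntax and the expand_template function are invented for this example; real template libraries differ.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical template expander: copies tmpl into out, replacing each
   "{N}" (N a single digit below nargs) with args[N]. Output is truncated
   to fit cap bytes and is always NUL-terminated when cap > 0. */
void expand_template(const char *tmpl, const char *const args[], size_t nargs,
                     char *out, size_t cap) {
    size_t len = 0;
    if (cap == 0) return;
    while (*tmpl && len + 1 < cap) {
        if (tmpl[0] == '{' && isdigit((unsigned char)tmpl[1]) &&
            tmpl[2] == '}' && (size_t)(tmpl[1] - '0') < nargs) {
            const char *a = args[tmpl[1] - '0'];
            while (*a && len + 1 < cap)
                out[len++] = *a++;   /* splice in the argument text */
            tmpl += 3;               /* skip past "{N}" */
        } else {
            out[len++] = *tmpl++;    /* ordinary character: copy through */
        }
    }
    out[len] = '\0';
}
```

Expanding the template "{0} {1} = {2};" with the arguments "int", "count" and "0" gives "int count = 0;". The template keeps the shape of the output readable in one place, but the result is exactly what a series of string appends would have produced.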

A better solution is to transform ASTs in your source language into ASTs in your target language, including all the local checks and using information from the symbol table and flow analysis. The nice consequence of this is that by producing ASTs in the target language, you can now apply optimizations in the target language that are not possible in the source language. [Real compilers do something like this, but the terms they use are "translate AST to IR (internal representation)", and they do optimizations on the IR.] After all the optimizations on the target AST are complete, you have to pretty-print the final AST... using something like string templates.
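The pipeline described here (translate to a target AST, optimize it, then pretty-print) could be sketched roughly as follows. Everything in it is hypothetical: the toy source language with a "double" construct, the SNode and TNode types, the fixed node pool, and the function names translate, fold_constants and print_tnode. Constant folding stands in for "optimizations on the target AST".

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical source AST: has a "double" construct the target lacks. */
typedef enum { S_NUM, S_VAR, S_ADD, S_DOUBLE } SKind;
typedef struct SNode {
    SKind kind;
    int value;                  /* S_NUM */
    const char *name;           /* S_VAR */
    const struct SNode *a, *b;  /* operands; b is unused by S_DOUBLE */
} SNode;

/* Hypothetical target AST: plain C-like expressions. */
typedef enum { T_NUM, T_VAR, T_ADD, T_MUL } TKind;
typedef struct TNode {
    TKind kind;
    int value;
    const char *name;
    struct TNode *a, *b;
} TNode;

/* Fixed node pool instead of malloc/free, to keep the sketch short. */
static TNode pool[64];
static size_t pool_used;

static TNode *new_tnode(TKind kind, int value, const char *name,
                        TNode *a, TNode *b) {
    if (pool_used == sizeof pool / sizeof pool[0]) abort();
    TNode *n = &pool[pool_used++];
    n->kind = kind; n->value = value; n->name = name; n->a = a; n->b = b;
    return n;
}

/* Source AST -> target AST: "double e" becomes "2 * e". */
TNode *translate(const SNode *s) {
    switch (s->kind) {
    case S_NUM: return new_tnode(T_NUM, s->value, NULL, NULL, NULL);
    case S_VAR: return new_tnode(T_VAR, 0, s->name, NULL, NULL);
    case S_ADD: return new_tnode(T_ADD, 0, NULL, translate(s->a), translate(s->b));
    default:    /* S_DOUBLE */
        return new_tnode(T_MUL, 0, NULL,
                         new_tnode(T_NUM, 2, NULL, NULL, NULL), translate(s->a));
    }
}

/* Optimization on the target AST: fold operators whose operands are both
   constants (integer overflow is ignored in this sketch). */
void fold_constants(TNode *t) {
    if (t->kind != T_ADD && t->kind != T_MUL) return;
    fold_constants(t->a);
    fold_constants(t->b);
    if (t->a->kind == T_NUM && t->b->kind == T_NUM) {
        t->value = t->kind == T_ADD ? t->a->value + t->b->value
                                    : t->a->value * t->b->value;
        t->kind = T_NUM;
        t->a = t->b = NULL;
    }
}

/* Append s to buf (which must already hold a string), truncating if full. */
static void append(char *buf, size_t cap, const char *s) {
    strncat(buf, s, cap - strlen(buf) - 1);
}

/* Pretty-printer for the target AST (fully parenthesized for brevity). */
void print_tnode(const TNode *t, char *buf, size_t cap) {
    char num[32];
    switch (t->kind) {
    case T_NUM: snprintf(num, sizeof num, "%d", t->value); append(buf, cap, num); break;
    case T_VAR: append(buf, cap, t->name); break;
    default:
        append(buf, cap, "(");
        print_tnode(t->a, buf, cap);
        append(buf, cap, t->kind == T_ADD ? " + " : " * ");
        print_tnode(t->b, buf, cap);
        append(buf, cap, ")");
    }
}
```

For the source tree x + double(3), translation gives a target tree that prints as "(x + (2 * 3))", and after fold_constants it prints as "(x + 6)". The multiplication only exists once the tree is in target form, which is the kind of opportunity the answer is pointing at.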

Most people don't have the energy to build a good transpiler from scratch, so they do some hacky thing like the first suggestion (just sayin'). But if you want a really good foundation for transforming code from one language to another, check out our DMS Software Reengineering Toolkit. DMS has parsers for many languages, can implement parsers for custom languages, automatically builds ASTs, provides a lot of support for Life After Parsing (e.g., building symbol tables and flow analysis), does AST-to-AST transformation, and has pretty-printers. DMS is designed to be a platform to support this kind of task. This means you can concentrate on building the high-quality translation part of the task, rather than trying to build all that useful infrastructure.
