General strategy for designing a flexible language application using ANTLR4


Question

I am trying to develop a language application using ANTLR4. The language in question is not important. What matters is that the grammar is very large (easily more than 2000 rules). I want to perform a number of operations:

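  • Extract a large amount of information, such as call graphs, variable names, constant expressions, etc.
  • Apply any number of transformations:
    • If a loop can be expanded, we go ahead and expand it
    • If we can eliminate dead code, we may choose to do so
    • We may choose to rename all variables to conform to some convention.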

    Each of these operations can be applied independently of the others. After applying these steps, I want to write the result back out in a form as close as possible to the original input.

    For example, we might want to eliminate loops and rename variables, and then output the result in the original language's format.

    1. I see a need to build a custom tree (read: AST) for this, so that I can modify the tree with each of the transformations. However, when I want to generate the output, I lose the nice abilities of the TokenStreamRewriter. I have to specify how to write each node of the tree, and I lose the original input formatting in the places where I did not apply any transformation. Does ANTLR4 provide a good way to get around this problem?
    2. Is an AST the best way to go? Or should I build my own object representation? If so, how do I create that object efficiently? Creating an object representation is a very big pain for such a large language, but it may be better in the long run. Again, how do I get back the original formatting?
    3. Is it possible to work just on the parse tree?
    4. Are there similar language applications that do the same thing? If so, what strategy do they use?
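
The formatting trade-off raised in point 1 can be illustrated with a minimal sketch. It uses Python's standard `tokenize` module as a stand-in for an ANTLR4 token stream (an assumption made purely for illustration; this is not ANTLR4 code, and `rename_in_place` is a hypothetical helper). The idea is the one behind token-level rewriting: record edits against token positions and copy everything else from the original text unchanged, so comments and spacing survive.

```python
import io
import tokenize

def rename_in_place(source, old, new):
    """Rename identifier tokens while copying all other text verbatim."""
    # Absolute character offset at which each line starts.
    starts = [0]
    for line in source.splitlines(keepends=True):
        starts.append(starts[-1] + len(line))

    # Collect the character span of every matching NAME token.
    spans = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and tok.string == old:
            begin = starts[tok.start[0] - 1] + tok.start[1]
            end = starts[tok.end[0] - 1] + tok.end[1]
            spans.append((begin, end))

    # Apply edits back to front so earlier offsets stay valid.
    result = source
    for begin, end in reversed(spans):
        result = result[:begin] + new + result[end:]
    return result

SOURCE = "total = 0   # running sum\nfor i in range(3):\n    total  =  total + i\n"
REWRITTEN = rename_in_place(SOURCE, "total", "acc")
print(REWRITTEN)
```

Only the renamed tokens change; the comment and the irregular spacing around `=` are preserved. The sketch ignores scoping entirely, so it would also rename unrelated variables that happen to share the name.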

    Any input is welcome. Thanks in advance.

    Recommended answer

    In general, what you want is called a Program Transformation System (PTS).

    PTSs generally have parsers, build ASTs, and can prettyprint the ASTs to recover compilable source text. More importantly, they have standard ways to navigate, inspect, and modify the ASTs so that you can change them programmatically.
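
As a rough illustration of that parse, modify, prettyprint pipeline, here is a minimal sketch using Python's standard `ast` module (Python 3.9+ for `ast.unparse`). This is an assumption made for illustration: `ast` is neither ANTLR4 nor a full PTS, and `RenameVariable` and `rename` are hypothetical names.

```python
import ast

class RenameVariable(ast.NodeTransformer):
    """Rename every ast.Name node whose identifier equals `old`."""

    def __init__(self, old, new):
        self.old = old
        self.new = new

    def visit_Name(self, node):
        if node.id == self.old:
            renamed = ast.Name(id=self.new, ctx=node.ctx)
            return ast.copy_location(renamed, node)
        return node

def rename(source, old, new):
    tree = ast.parse(source)                      # parse text into an AST
    tree = RenameVariable(old, new).visit(tree)   # modify the AST
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)                      # prettyprint back to text

SOURCE = "total = 0   # running sum\nfor i in range(3):\n    total = total + i\n"
RESULT = rename(SOURCE, "total", "acc")
print(RESULT)
```

The regenerated text is valid source, but the comment and the original spacing are gone, because this particular AST does not store them. That is the formatting loss the question worries about; a prettyprinter can only reproduce the original layout if the tree retains it.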

    Many offer these capabilities in the form of pattern-matching code fragments written in the surface syntax of the language being transformed; this avoids having to know excruciatingly fine details about which nodes are in your AST and how they relate to their children. This is incredibly useful when you have big, complex grammars, as most of our modern languages (and our legacy ones) seem to have.
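
A toy version of surface-syntax patterns might look like the following sketch, again using Python's `ast` module as a stand-in (Python 3.9+). Everything here is a hypothetical illustration rather than the notation of any real PTS: names starting with `__` act as pattern variables, and the rule `__x + 0` → `__x` is written as ordinary source text, not as node constructors.

```python
import ast
import copy

def match(pattern, node, bindings):
    """Structurally match node against pattern, binding '__' variables."""
    if isinstance(pattern, ast.Name) and pattern.id.startswith("__"):
        bindings[pattern.id] = node   # simplification: no repeated-variable check
        return True
    if type(pattern) is not type(node):
        return False
    for field in pattern._fields:
        if field == "ctx":            # load/store context is irrelevant here
            continue
        p, n = getattr(pattern, field, None), getattr(node, field, None)
        if isinstance(p, ast.AST):
            if not isinstance(n, ast.AST) or not match(p, n, bindings):
                return False
        elif isinstance(p, list):
            if not isinstance(n, list) or len(p) != len(n):
                return False
            if not all(match(a, b, bindings) for a, b in zip(p, n)):
                return False
        elif p != n:
            return False
    return True

class Substitute(ast.NodeTransformer):
    """Replace pattern variables in a replacement tree with bound subtrees."""
    def __init__(self, bindings):
        self.bindings = bindings

    def visit_Name(self, node):
        return self.bindings.get(node.id, node)

class Rule(ast.NodeTransformer):
    def __init__(self, pattern_src, replacement_src):
        self.pattern = ast.parse(pattern_src, mode="eval").body
        self.replacement = ast.parse(replacement_src, mode="eval").body

    def visit(self, node):
        node = self.generic_visit(node)   # rewrite children first
        bindings = {}
        if isinstance(node, ast.expr) and match(self.pattern, node, bindings):
            return Substitute(bindings).visit(copy.deepcopy(self.replacement))
        return node

def apply_rule(source, pattern_src, replacement_src):
    tree = Rule(pattern_src, replacement_src).visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)

RESULT = apply_rule("y = (a * b) + 0\nz = y + 1\n", "__x + 0", "__x")
print(RESULT)
```

The person writing the rule never mentions `BinOp`, `Add`, or `Constant`; the parser derives the tree shape from the pattern text. With a grammar of thousands of rules, that is the main appeal of the approach.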

    More sophisticated PTSs (very few) provide additional facilities for teasing out the semantics of the source code. It is pretty hard to analyze/transform most code without knowing what scopes individual symbols belong to, or their type, and many other details such as data flow. Full disclosure: I build one of these.
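
To make the point about scopes concrete, here is a small sketch using Python's standard `symtable` module as a stand-in for the symbol-table facility of a PTS (an assumption for illustration only; `classify` is a hypothetical helper). It shows that the same name `x` means different things in different scopes, which a purely syntactic rename cannot tell apart.

```python
import symtable

SOURCE = '''\
x = 1

def f():
    x = 2      # a new local x that shadows the module-level one
    return x

def g():
    return x   # refers to the module-level x
'''

def classify(source, name):
    """Map each scope that mentions `name` to 'local' or 'global'."""
    result = {}

    def walk(table):
        if name in table.get_identifiers():
            symbol = table.lookup(name)
            result[table.get_name()] = "local" if symbol.is_local() else "global"
        for child in table.get_children():
            walk(child)

    walk(symtable.symtable(source, "<example>", "exec"))
    return result

SCOPES = classify(SOURCE, "x")
print(SCOPES)
```

A transformation that renames the module-level `x` must also rename the use inside `g`, but must leave `f` alone. Deciding that requires exactly this kind of scope information, in addition to the tree.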
