未解析的AST< O(exp(n))? [英] Unparse AST < O(exp(n))?

查看:125
本文介绍了未解析的AST< O(exp(n))?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

抽象问题描述:

我认为,未解析意味着从AST创建令牌流,再次对其进行解析产生相等的AST。

The way I see it, unparsing means to create a token stream from an AST, which when parsed again produces an equal AST.

所以 parse(unparse(AST))= AST 成立。

这等同于找到可以产生相同AST的有效分析树。

This is the equal to finding a valid parse tree which would produce the same AST.

该语言由上下文无关描述 S属性语法使用 eBNF 变体。

The language is described by a context free S-attributed grammar using a eBNF variant.

因此,解析器必须通过遍历所有语法约束的节点找到有效的路径。这从根本上讲是指为语法生成规则找到有效的 AST 节点的分配。这通常是约束满足问题(CSP),可以像解析一样通过< O(exp(n))中的href = http://en.wikipedia.org/wiki/Backtracking rel = noreferrer>回溯。

So the unparser has to find a valid 'path' through the traversed nodes in which all grammar constraints hold. This bascially means to find a valid allocation of AST nodes to grammar production rules. This is a constraint satisfaction problem (CSP) in general and could be solved, like parsing, by backtracking in O(exp(n)).

幸运的是,可以使用 GLR (或更好地限制语法)。因为 AST 结构非常接近语法生成规则结构,所以看到一个实现真的让我感到惊讶运行时比解析差的地方: XText 使用 ANTLR 用于解析,而回溯则用于未解析。

Fortunately for parsing, this can be done in O(n³) using GLR (or better restricting the grammar). Because the AST structure is so close to the grammar production rule structure, I was really surprised seeing an implementation where the runtime is worse than parsing: XText uses ANTLR for parsing and backtracking for unparsing.

问题


  1. 是上下文无关的S -为语法分配解析器和非解析器需要共享的所有内容,或者存在进一步的约束,例如

  2. 我感觉到这个问题通常不是O(exp(n))-某些天才可以帮助我吗?
  3. li>
  4. 这基本上是上下文相关的语法吗?

  1. Is a context free S-attribute grammar everything a parser and unparser need to share or are there further constraints, e.g. on the parsing technique / parser implementation?
  2. I've got the feeling this problem isn't O(exp(n)) in general - could some genius help me with this?
  3. Is this basically a context-sensitive grammar?

示例1:

Area    returns AnyObject   -> Pedestrian | Highway
Highway returns AnyObject   -> "Foo" Car
Pedestrian  returns AnyObject   -> "Bar" Bike
Car     returns Vehicle     -> anyObjectInstance.name="Car"
Bike    returns Vehicle     -> anyObjectInstance.name="Bike" 

因此,如果我的AST包含

So if I have an AST containing

AnyObject-> AnyObject-> Vehicle [name = Car] ,我知道根可以是Area,我可以将其解析为

AnyObject -> AnyObject -> Vehicle [name="Car"] and I know the root can be Area, I could resolve it to

Area    -> Highway  -> Car

因此,(高速公路|行人)决策取决于子树决策。当叶子乍一看可能是几种类型中的一种,但后来必须是特定的叶子才能形成有效路径时,问题变得更加严重。

So the (Highway | Pedestrian) decision depends on the subtree decisions. The problem get's worse when a leaf might be, at first sight, one of several types, but has to be a specific one to form a valid path later on.

Example2:

因此,如果我有S属性规则返回无类型对象,则只需分配一些属性即可,例如

So if I have S-attribute rules returning untyped objects, just assigning some attributes, e.g.

A -> B C   {Obj, Obj}
X -> Y Z   {Obj, Obj}
B -> "somekeyword"  {0}
Y -> "otherkeyword" {0}
C -> "C" {C}
Z -> "Z" {Z}

因此,如果AST包含

   Obj
  /  \
"0"  "C"

我可以将其解析为

   A
  / \
 B   C

就在我可以将 C解析为C之后。

just after I could resolve "C" to C.

遍历AST时,我可以从语法生成的所有约束都满足A和X两个规则,直到我按下 C。这意味着对于

While traversing the AST, all constraints I can generate from the grammar are satisfied for both rules, A and X, until I hit "C". This means that for

A -> B | C 
B -> "map"  {MagicNumber_42}
C -> "foreach" {MagicNumber_42}

树的两种解决方案

     Obj
      |
 MagicNumber_42

是有效的,并且认为它们具有相同的语义,例如

are valid and it is considered that they have equal semantics ,e.g. syntactic sugar.

其他信息:

解析语法约束,请参见序列化程序:具体语法验证

推荐答案

问题1:不,语法本身可能不足够。以一个模棱两可的语法为例。如果以给定字符串的唯一最左(最右)派生词(AST)结尾,那么您将不得不以某种方式知道解析器如何消除歧义。只需考虑表达式'E:= E + E | E * E | ...'的天真语法的字符串'a + b * c'。

Question 1: no, the grammar itself may not be enough. Take the example of an ambiguous grammar. If you ended up with a unique leftmost (rightmost) derivation (the AST) for a given string, you would somehow have to know how the parser eliminated the ambiguity. Just think of the string 'a+b*c' with the naive grammar for expressions 'E:=E+E|E*E|...'.

问题3:您提供的语法示例都不是上下文相关的。产品的左侧是单个非终端,没有上下文。

Question 3: none of the grammar examples you give is context sensitive. The lefthand-side of the productions are a single non-terminal, there is no context.

这篇关于未解析的AST&lt; O(exp(n))?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆