未解析的 AST <O(exp(n))? [英] Unparse AST < O(exp(n))?
问题描述
抽象问题描述:
在我看来,解析意味着从 AST 创建一个令牌流,当再次解析时会产生一个相等的 AST.
The way I see it, unparsing means to create a token stream from an AST, which when parsed again produces an equal AST.
所以 parse(unparse(AST)) = AST
成立.
这等同于找到一个有效的解析树,它会产生相同的 AST.
This is the equal to finding a valid parse tree which would produce the same AST.
该语言由 无上下文 S-attributed 语法使用 eBNF 变体.
The language is described by a context free S-attributed grammar using a eBNF variant.
因此,解析器必须通过所有语法约束都存在的遍历节点找到有效的路径".这基本上意味着为语法产生式规则找到 AST 节点的有效分配.这是一个约束满足问题 (CSP),可以像解析一样通过回溯,O(exp(n)).
So the unparser has to find a valid 'path' through the traversed nodes in which all grammar constraints hold. This bascially means to find a valid allocation of AST nodes to grammar production rules. This is a constraint satisfaction problem (CSP) in general and could be solved, like parsing, by backtracking in O(exp(n)).
幸运的是,对于解析,这可以使用 GLR(或更好限制语法).因为 AST 结构非常接近语法产生式规则结构,所以看到实现时我真的很惊讶运行时比解析更糟糕的地方:XText 使用 ANTLR 用于解析和回溯以进行解解析.
Fortunately for parsing, this can be done in O(n³) using GLR (or better restricting the grammar). Because the AST structure is so close to the grammar production rule structure, I was really surprised seeing an implementation where the runtime is worse than parsing: XText uses ANTLR for parsing and backtracking for unparsing.
问题
- 是一种上下文无关的 S 属性语法,解析器和解解析器需要共享的所有内容还是有进一步的限制,例如关于解析技术/解析器实现?
- 我感觉这个问题一般来说不是 O(exp(n)) - 有天才可以帮我解决这个问题吗?
- 这基本上是一种上下文相关的语法吗?
示例 1:
Area returns AnyObject -> Pedestrian | Highway
Highway returns AnyObject -> "Foo" Car
Pedestrian returns AnyObject -> "Bar" Bike
Car returns Vehicle -> anyObjectInstance.name="Car"
Bike returns Vehicle -> anyObjectInstance.name="Bike"
所以如果我有一个 AST 包含
So if I have an AST containing
AnyObject ->任何对象 ->Vehicle [name="Car"]
并且我知道根可以是 Area,我可以将其解析为
AnyObject -> AnyObject -> Vehicle [name="Car"]
and I know the root can be Area, I could resolve it to
Area -> Highway -> Car
所以 (Highway | Pedestrian) 的决定取决于子树的决定.如果叶子乍一看可能是几种类型中的一种,但必须是特定的类型才能在以后形成有效路径,则问题会变得更糟.
So the (Highway | Pedestrian) decision depends on the subtree decisions. The problem get's worse when a leaf might be, at first sight, one of several types, but has to be a specific one to form a valid path later on.
示例 2:
因此,如果我有返回无类型对象的 S 属性规则,只需分配一些属性,例如
So if I have S-attribute rules returning untyped objects, just assigning some attributes, e.g.
A -> B C {Obj, Obj}
X -> Y Z {Obj, Obj}
B -> "somekeyword" {0}
Y -> "otherkeyword" {0}
C -> "C" {C}
Z -> "Z" {Z}
所以如果 AST 包含
So if an AST contains
Obj
/
"0" "C"
我可以将其解析为
A
/
B C
就在我可以将C"解析为 C 之后.
just after I could resolve "C" to C.
在遍历 AST 时,我可以从语法生成的所有约束都满足规则 A 和 X,直到我点击C".这意味着对于
While traversing the AST, all constraints I can generate from the grammar are satisfied for both rules, A and X, until I hit "C". This means that for
A -> B | C
B -> "map" {MagicNumber_42}
C -> "foreach" {MagicNumber_42}
树的两种解决方案
Obj
|
MagicNumber_42
是有效的,并且认为它们具有相同的语义,例如语法糖.
are valid and it is considered that they have equal semantics ,e.g. syntactic sugar.
更多信息:
推荐答案
问题 1:不,语法本身可能不够.以歧义语法为例.如果您最终得到给定字符串的唯一最左(最右)派生(AST),您将不得不知道解析器如何消除歧义.想想字符串 'a+b*c' 和表达式 'E:=E+E|E*E|...' 的朴素语法.
Question 1: no, the grammar itself may not be enough. Take the example of an ambiguous grammar. If you ended up with a unique leftmost (rightmost) derivation (the AST) for a given string, you would somehow have to know how the parser eliminated the ambiguity. Just think of the string 'a+b*c' with the naive grammar for expressions 'E:=E+E|E*E|...'.
问题 3:您提供的语法示例都不是上下文相关的.产生式的左侧是单个非终结符,没有上下文.
Question 3: none of the grammar examples you give is context sensitive. The lefthand-side of the productions are a single non-terminal, there is no context.
这篇关于未解析的 AST <O(exp(n))?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!