Antlr AST 生成(可能)疯狂 [英] Antlr AST generating (possible) madness
问题描述
以下甚至可能吗?我想反转"给 antlr 的输入并使每个标记成为前一个标记的子标记.
Is the following even possible? I want to "reverse" the input given to antlr and make each token a child of the previous one.
因此,对于输入(假设每个标记由 '.' 字符分隔):
So, for the input (Assume each token is separated by the '.' char) :
Stack.Overflow.Horse
我希望我的语法生成以下 AST:
I would like my grammar to produce the following AST:
Horse
|---Overflow
|---Stack
到目前为止,我已经设法反转节点,但我无法让它们成为彼此的子节点:
So far, I've managed to reverse the nodes, but I'm unable to make them children of each other:
function
: ID PERIOD function
-> function ID
| ID
;
ID : 'a'..'z'*
;
推荐答案
我认为没有什么简单的方法可以做到这一点.你可以这样制定你的规则:
I don't think there's an easy way to do that. You could make your rule like this:
function
: ID PERIOD function
-> ^(function ID)
| ID
;
但这只会使最后一个节点成为根节点,而所有其他节点都成为其子节点.例如,以下来源:
but that only makes the last node the root and all other nodes its children. For example, the following source:
a.b.c.d.e
将产生以下树:
e
/ / \ \
d c b a
我找不到简单的解决方法,因为当您第一次解析 abcde
时,a
将是 ID
和 bcde
对 function
的递归调用:
I can't see an easy fix since when you first parse a.b.c.d.e
, a
will be the ID
and b.c.d.e
the recursive call to function
:
a.b.c.d.e
| +-----+
| |
| `-----> function
|
`----------> ID
导致 b.c.d.e
将 a
作为它的孩子.当 b
成为 ID
时,它也被添加为 a
旁边的子项.在您的情况下, a
应该作为子项删除,然后添加到 b
的子项列表中.但是 AFAIK,这在 ANLTR 中是不可能的(至少,在语法中不是以干净的方式).
resulting in the fact that b.c.d.e
will have a
as its child. When then b
becomes the ID
, it too is added as a child next to a
. In your case, a
should be removed as a child and then added to the list of b
's children. But AFAIK, that is not possible in ANLTR (at least, not in a clean way inside the grammar).
编辑
好的,作为一种变通办法,我想到了一些优雅的东西,但这并没有像我希望的那样奏效.因此,作为一个不太优雅的解决方案,您可以将 last
节点匹配为重写规则中的根:
Okay, as a work-around I had something elegant in mind, but that didn't work as I had hoped. So, as a less elegant solution, you could match the last
node as the root in your rewrite rule:
function
: (id '.')* last=id -> ^($last)
;
然后使用 +=
运算符收集 List
中所有可能的前面节点(children
):
and then collect all possible preceding nodes (children
) in a List
using the +=
operator:
function
: (children+=id '.')* last=id -> ^($last)
;
并在解析器中使用自定义成员方法将这些 children
注入"到树的根部(在 List
中从右到左!):
and use a custom member-method in the parser to "inject" these children
into the root of your tree (going from right to left in your List
!):
function
: (children+=id '.')* last=id {reverse($children, (CommonTree)$last.tree);} -> ^($last)
;
一个小演示:
grammar ReverseTree;
options {
output=AST;
}
tokens {
ROOT;
}
@members {
private void reverse(List nodes, CommonTree root) {
if(nodes == null) return;
for(int i = nodes.size()-1; i >= 0; i--) {
CommonTree temp = (CommonTree)nodes.get(i);
root.addChild(temp);
root = temp;
}
}
}
parse
: function+ EOF -> ^(ROOT function+)
;
function
: (children+=id '.')* last=id {reverse($children, (CommonTree)$last.tree);} -> ^($last)
;
id
: ID
;
ID
: ('a'..'z' | 'A'..'Z')+
;
Space
: ' ' {skip();}
;
还有一个小测试类:
import org.antlr.runtime.*;
import org.antlr.runtime.tree.*;
import org.antlr.stringtemplate.*;
public class Main {
public static void main(String[] args) throws Exception {
ANTLRStringStream in = new ANTLRStringStream("a.b.c.d.e Stack.Overflow.Horse singleNode");
ReverseTreeLexer lexer = new ReverseTreeLexer(in);
CommonTokenStream tokens = new CommonTokenStream(lexer);
ReverseTreeParser parser = new ReverseTreeParser(tokens);
ReverseTreeParser.parse_return returnValue = parser.parse();
CommonTree tree = (CommonTree)returnValue.getTree();
DOTTreeGenerator gen = new DOTTreeGenerator();
StringTemplate st = gen.toDOT(tree);
System.out.println(st);
}
}
这将产生一个如下所示的 AST:
which will produce an AST that looks like:
- (使用 http://graph.gafol.net 生成的图像)
- (image generated using http://graph.gafol.net)
对于输入字符串:
"a.b.c.d.e Stack.Overflow.Horse singleNode"
这篇关于Antlr AST 生成(可能)疯狂的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!