ANTLR 4: Should all of these tokens be showing up in the AST?

Views: 26 | Published: 2021/11/11 4:10:08 | Tags: parsing, antlr, antlr4

Question

My ultimate goal is to parse a structured file into a tree of in-memory objects that I can then manipulate. The file format I'm using is fairly sophisticated, with about 200 keywords/tags, and this seemed like a good reason to learn about parser/lexer frameworks.

Unfortunately, there are so many concepts (and hundreds of tutorials and guides) that the learning process so far feels like trying to drink from a fire hose. So I'm taking some very meager baby steps, starting with this example.

I modified the grammar to create the following test, Nano.g4:

grammar Nano;

r  : root ;
root : START ROOT ID END ROOT;
START : 'StartBlock' ;
END : 'EndBlock' ;
ROOT : 'RootItem' ;
ID : [a-z]+ ;             // match lower-case identifiers
WS : [ \t\r\n]+ -> skip ; // skip spaces, tabs, newlines

Next, I created a simple input file, nano.txt:

StartBlock RootItem
   foo
EndBlock RootItem

I then generated, compiled, and ran the parser using the following commands:

del *.class
del *.java
java org.antlr.v4.Tool Nano.g4
javac Nano*.java
java org.antlr.v4.runtime.misc.TestRig Nano r -gui < nano.txt

This gave me the following result:

The tree above is my first conceptual hangup about what to expect from a lexer and parser. The "StartBlock RootItem" and "EndBlock RootItem" tokens are necessary to make the input file legal, but conceptually I don't need them once I've proven that the file is properly formatted. The only thing I care about from that point on is that there's a RootItem that contains "foo", as shown here:

Again, I'm painfully new to parser/lexer concepts. Should I (or is it even possible to) write the grammar so that the output tree matches the image above? Or should I take care of that in some subsequent step that traverses the AST and extracts only the relevant data fields?
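To make the second option concrete, here is a minimal sketch of a post-parse step that walks a tree and keeps only the ID text. It deliberately does not use the ANTLR runtime: the `Node` class, `sampleTree()`, and `collectIds()` are hypothetical stand-ins invented for illustration, and the hand-built tree assumes the parse of nano.txt has the shape r -> root -> five tokens, as the grammar above implies.

```java
import java.util.ArrayList;
import java.util.List;

public class PruneSketch {

    // Hypothetical stand-in for a parse-tree node. This is NOT an ANTLR runtime class.
    static final class Node {
        final String label;        // rule name (e.g. "root") or token type (e.g. "ID")
        final String text;         // matched text for tokens, null for rule nodes
        final List<Node> children; // empty for token (leaf) nodes

        Node(String label, String text, List<Node> children) {
            this.label = label;
            this.text = text;
            this.children = children;
        }

        static Node token(String type, String text) {
            return new Node(type, text, List.of());
        }

        static Node rule(String name, Node... children) {
            return new Node(name, null, List.of(children));
        }
    }

    // Depth-first walk that keeps only the text of ID tokens and
    // ignores structural tokens such as START, END, and ROOT.
    static List<String> collectIds(Node node) {
        List<String> ids = new ArrayList<>();
        if (node.children.isEmpty()) {
            if ("ID".equals(node.label)) {
                ids.add(node.text);
            }
        } else {
            for (Node child : node.children) {
                ids.addAll(collectIds(child));
            }
        }
        return ids;
    }

    // Hand-built tree mirroring the assumed parse of nano.txt under Nano.g4.
    static Node sampleTree() {
        return Node.rule("r",
            Node.rule("root",
                Node.token("START", "StartBlock"),
                Node.token("ROOT", "RootItem"),
                Node.token("ID", "foo"),
                Node.token("END", "EndBlock"),
                Node.token("ROOT", "RootItem")));
    }

    public static void main(String[] args) {
        List<String> ids = collectIds(sampleTree());
        System.out.println("RootItem(" + ids.get(0) + ")"); // prints RootItem(foo)
    }
}
```

With a real ANTLR 4 parser, the same idea would presumably be expressed through a generated listener or visitor rather than a hand-rolled walk; the sketch only shows that the structural tokens can stay in the parse tree and be filtered out afterwards.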
