ANTLR 4: Should all of these tokens be showing up in the AST?

Views: 89 | Posted: 2020/9/3 0:29:25 | Tags: parsing, antlr, antlr4

Problem description

My ultimate goal is to parse a structured file as a tree of in-memory objects that I can then manipulate. The file format that I'm using is fairly sophisticated with about 200 keywords/tags, and this seemed like a good reason to learn about parser/lexer frameworks.

Unfortunately, there are so many concepts (and hundreds of tutorials and guides) that the learning process so far feels like trying to drink from a fire hose. So I'm taking some very meager baby steps, starting with this example.

I modified the grammar to create the following test, Nano.g4:

grammar Nano;

r  : root ;
root : START ROOT ID END ROOT;
START : 'StartBlock' ;
END : 'EndBlock' ;
ROOT : 'RootItem' ;
ID : [a-z]+ ;             // match lower-case identifiers
WS : [ \t\r\n]+ -> skip ; // skip spaces, tabs, newlines

Next, I created a simple input file, nano.txt:

StartBlock RootItem
   foo
EndBlock RootItem

I then loaded the code using the following commands:

del *.class
del *.java
java org.antlr.v4.Tool Nano.g4
javac Nano*.java
java org.antlr.v4.runtime.misc.TestRig Nano r -gui < nano.txt

This gave me the following result:

[image: parse tree displayed by the TestRig GUI]

The tree above is my first conceptual hangup about what to expect from a lexer and parser. The "StartBlock RootItem" and "EndBlock RootItem" tokens are necessary in terms of making the input file legal, but conceptually I don't need them after I've proven that the file is properly formatted. The only thing that I care about from that point on is that there's a RootItem that contains "foo", as shown here:

[image: desired tree, a RootItem node containing "foo"]

Again, I'm painfully new to parser/lexer concepts. Should I (or, is it even possible to) write the grammar so the output tree matches the image above? Or should I take care of that in some subsequent step that traverses the AST and only extracts the relevant data fields?
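
To make the second option concrete (a later step that walks the tree and keeps only the relevant fields), here is a minimal, self-contained sketch in Java. It does not use the ANTLR runtime. `Node`, `RootItem`, `extract`, and `sampleTree` are hypothetical names made up for illustration, and the tree is built by hand to mirror what the parser would produce for nano.txt. With ANTLR itself, a pass like this would more likely be written against the generated listener or visitor classes, so treat this only as an outline of the idea.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NanoTreeSketch {

    // Hypothetical stand-in for a parse-tree node (NOT an ANTLR class).
    // Inner nodes carry a rule name; leaves carry the matched token text.
    static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();

        Node(String label, Node... kids) {
            this.label = label;
            children.addAll(Arrays.asList(kids));
        }

        boolean isLeaf() {
            return children.isEmpty();
        }
    }

    // Hypothetical domain object: the only data kept once the input is validated.
    static final class RootItem {
        final String id;

        RootItem(String id) {
            this.id = id;
        }

        @Override
        public String toString() {
            return "RootItem(" + id + ")";
        }
    }

    // Keyword literals from Nano.g4; they delimit structure but carry no data.
    static final List<String> KEYWORDS =
            Arrays.asList("StartBlock", "EndBlock", "RootItem");

    // Depth-first search for a 'root' rule node; its first non-keyword leaf
    // becomes the RootItem's id. Returns null if no such node is found.
    static RootItem extract(Node node) {
        if (!node.isLeaf() && node.label.equals("root")) {
            for (Node child : node.children) {
                if (child.isLeaf() && !KEYWORDS.contains(child.label)) {
                    return new RootItem(child.label);
                }
            }
            return null;
        }
        for (Node child : node.children) {
            RootItem found = extract(child);
            if (found != null) {
                return found;
            }
        }
        return null;
    }

    // Hand-built tree mirroring the parse of nano.txt: r -> root -> five tokens.
    static Node sampleTree() {
        return new Node("r",
                new Node("root",
                        new Node("StartBlock"), new Node("RootItem"),
                        new Node("foo"),
                        new Node("EndBlock"), new Node("RootItem")));
    }

    public static void main(String[] args) {
        System.out.println(extract(sampleTree())); // prints RootItem(foo)
    }
}
```

The grammar stays responsible only for deciding whether the file is well formed, and the keyword tokens are discarded in the extraction step instead of being hidden from the tree.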
