How can I parse nested source files with ANTLR4?

Problem Description

I've asked this question before (slightly differently) but didn't understand the answers enough at the time to give intelligent feedback (sigh).

I need to be able to include files inside other files at arbitrary points so I need to be able to have a stack of files with a single parse tree.

If I was writing this myself (and I have done this in the past), my parser would recognize the "Include xyz" or "Import abc", and would cause the lexer to suspend reading from the current file, push that file on a stack, and continue reading characters from the new file until exhausted.

However, when using ANTLR4 (where so far I've avoided inserting any code into the grammar file itself) and using the visitor pattern, all I see is the created tree which of course is too late.

I've found references to PUSHSTREAM as something that can be done in the lexer but I cannot find an actual example and would really appreciate some help (either a pointer to an actual example that I perhaps missed when searching or a short code sample if someone has one).

Note that I'm writing code in C++, not Java.

Thanks in advance.

Answer

Years ago I developed a solution for ANTLR 2.7, to parse Windows resource files (*.rc). Such files are structured very much like C/C++ header files and support preprocessor directives like #if/#end/#pragma/#include.

For that I created a special character input stream (with nested char input streams) which implements a stack-based approach for include files. Whenever a new include directive is found in the char input, a new stack entry is created holding the current input stream, its position, and its line/column information (so that local source locations can still be reported if a parsing problem is found). That entry is pushed onto the stack and a new input stream is created for the included file. Once that stream is exhausted, the top-of-stack entry is popped and characters are served again from the saved position (just after the #include statement). The lexer only ever sees one continuous stream of characters.
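
Below is a minimal, self-contained C++ sketch of that idea, written for this answer rather than taken from the original ANTLR 2.7 code. The class and member names (StackedCharSource, SourceFrame, extractPath) and the #include "file" directive syntax are assumptions made purely for illustration; nothing here is part of the ANTLR runtime API.

#include <fstream>
#include <sstream>
#include <stack>
#include <stdexcept>
#include <string>

struct SourceFrame {
    std::string path;   // file being read, kept for error reporting
    std::string text;   // full buffered contents of that file
    size_t pos = 0;     // current offset into text
    size_t line = 1;    // 1-based line number (local source location)
    size_t column = 0;  // 0 means "at the start of a line"
};

class StackedCharSource {
public:
    explicit StackedCharSource(const std::string &rootFile) { push(rootFile); }

    // Returns the next character of the flattened stream, or -1 when every
    // file on the stack has been exhausted.
    int next() {
        while (!frames_.empty()) {
            SourceFrame &top = frames_.top();

            // Current file exhausted: pop it and resume the including file
            // right after its include directive.
            if (top.pos >= top.text.size()) {
                frames_.pop();
                continue;
            }

            // Include directive at the start of a line: skip the directive in
            // the current file and push the included file onto the stack.
            if (top.column == 0 && top.text.compare(top.pos, 9, "#include ") == 0) {
                size_t eol = top.text.find('\n', top.pos);
                std::string included = extractPath(top.text.substr(top.pos, eol - top.pos));
                top.pos = (eol == std::string::npos) ? top.text.size() : eol + 1;
                ++top.line;
                push(included);
                continue;
            }

            char c = top.text[top.pos++];
            if (c == '\n') { ++top.line; top.column = 0; } else { ++top.column; }
            return c;
        }
        return -1;
    }

private:
    void push(const std::string &path) {
        std::ifstream in(path);
        if (!in) throw std::runtime_error("cannot open " + path);
        std::ostringstream buffer;
        buffer << in.rdbuf();
        SourceFrame frame;
        frame.path = path;
        frame.text = buffer.str();
        frames_.push(std::move(frame));
    }

    // Extracts the file name from a directive of the form: #include "name"
    static std::string extractPath(const std::string &directive) {
        size_t first = directive.find('"');
        size_t last = directive.rfind('"');
        if (first == std::string::npos || last <= first) return "";
        return directive.substr(first + 1, last - first - 1);
    }

    std::stack<SourceFrame> frames_;
};

The flattened text produced this way can be handed to the ANTLR4 C++ runtime as ordinary input (for example by collecting it into a std::string and constructing an antlr4::ANTLRInputStream from it), or the class could be adapted to implement the runtime's CharStream interface directly if buffering everything up front is undesirable. Keeping the stacking below the lexer means neither the grammar nor the generated parser has to know anything about includes, while error reporting still works because each frame carries its own path and line/column state.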
