How to use the grammar files generated by ANTLR?

Question

I think this is a stupid question, but I'm just starting out with ANTLR. I put together the "SimpleCalc" grammar from their tutorials and generated it with C as the target language. I got SimpleCalcParser.c/.h and SimpleCalcLexer.c/.h as the output, and I was able to compile these and build successfully. But now, how do I actually use the code that's generated? I'm having trouble finding anything helpful in the docs.

Below is my main() function. This is also from the tutorial.

 #include "SimpleCalcLexer.h"
 #include "SimpleCalcParser.h"   // declares pSimpleCalcParser and SimpleCalcParserNew

 int main(int argc, char * argv[])
 {

    pANTLR3_INPUT_STREAM           input;
    pSimpleCalcLexer               lex;
    pANTLR3_COMMON_TOKEN_STREAM    tokens;
    pSimpleCalcParser              parser;

    input  = antlr3AsciiFileStreamNew          ((pANTLR3_UINT8)argv[1]);
    lex    = SimpleCalcLexerNew                (input);
    tokens = antlr3CommonTokenStreamSourceNew  (ANTLR3_SIZE_HINT, TOKENSOURCE(lex));
    parser = SimpleCalcParserNew               (tokens);

    parser  ->expr(parser);

    // Must manually clean up
    //
    parser ->free(parser);
    tokens ->free(tokens);
    lex    ->free(lex);
    input  ->close(input);

    return 0;
 }

Following up on the first response: I ran the program as "./testantlr test.txt", where test.txt contained "4+1". There was no output.

From here, how would I, for example, access the "4" in the generated syntax tree, or print out the entire syntax tree? Basically, how do I access the contents of the syntax tree that ANTLR generates?

Recommended answer

I faced the same confusion when I first took a crack at it. It's a pretty obvious question, which makes it all the stranger that tutorials don't seem to address it explicitly and directly.

The way out that I found is the 'returns' keyword:

token returns [TreeNode value]
    :    WORD { $value = new TreeNode( "word", $WORD.Text ); }
    |    INT { $value = new TreeNode( "int", $INT.Text ); }
    ;

WORD:    ('a'..'z'|'A'..'Z')+;
INT :    ('0'..'9')+;

TreeNode is a class that I made. Where it got tricky was how to do this with a sequence of, say, multiple tokens. The solution I came up with was recursion:

expr returns [Accumulator value]
    :   a=token  (WS+ b=expr)?
    {
        if( b != null )
        {
            $value = new Accumulator( "expr", a.value, b.value );
        } else
        {
            $value = new Accumulator( "expr", a.value );
        }
    }
    ;

Accumulator is a class that I made that has two different constructors. One constructor encapsulates a single token, and the other encapsulates a single token and another Accumulator instance. Notice that the rule itself is defined recursively, and that b.value is an Accumulator instance. Why? Because b is an expr, and the definition of expr has returns [Accumulator value].

The final resulting tree is a single Accumulator instance that has grouped up all the tokens. To actually use that tree, you do some setup and then call the parser method that has the same name as the rule you want to parse your content with:

Antlr.Runtime.ANTLRStringStream stringstream =  new Antlr.Runtime.ANTLRStringStream( script );
TokenLexer lexer = new TokenLexer( stringstream );
Antlr.Runtime.CommonTokenStream tokenstream = new Antlr.Runtime.CommonTokenStream( lexer );
TokenParser parser = new TokenParser( tokenstream );

Accumulator grandtree = parser.expr().value;
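
The answer does not show TreeNode or Accumulator themselves, so the following is only a sketch of what they might look like, written in Java rather than the C# used above. The field names (kind, text, label, head, rest), the TreeDemo class, and the helpers sample() and flatten() are all assumptions made for illustration. The tree is built by hand in the shape the expr rule would presumably produce for an input such as "4 plus 1", then walked to reach the first token and to list every token in order:

```java
import java.util.ArrayList;
import java.util.List;

// Demo class comes first so that `java TreeDemo.java` finds main().
class TreeDemo {
    // Hand-build the right-nested tree that expr would return for "4 plus 1".
    static Accumulator sample() {
        Accumulator last = new Accumulator("expr", new TreeNode("int", "1"));
        Accumulator mid = new Accumulator("expr", new TreeNode("word", "plus"), last);
        return new Accumulator("expr", new TreeNode("int", "4"), mid);
    }

    // Follow the chain of nested Accumulators and collect each token in order.
    static List<TreeNode> flatten(Accumulator tree) {
        List<TreeNode> out = new ArrayList<>();
        for (Accumulator cur = tree; cur != null; cur = cur.rest) {
            out.add(cur.head);
        }
        return out;
    }

    public static void main(String[] args) {
        Accumulator grandtree = sample();
        // The first token is directly reachable from the root.
        System.out.println(grandtree.head.text);
        // Print the whole tree as a flat list of tokens.
        for (TreeNode n : flatten(grandtree)) {
            System.out.println(n.kind + ": " + n.text);
        }
    }
}

// Hypothetical stand-in for the answer's TreeNode: a token kind plus its text.
class TreeNode {
    final String kind;
    final String text;

    TreeNode(String kind, String text) {
        this.kind = kind;
        this.text = text;
    }
}

// Hypothetical stand-in for the answer's Accumulator, with its two constructors.
class Accumulator {
    final String label;
    final TreeNode head;
    final Accumulator rest; // null when head is the last token

    // Wraps a single token.
    Accumulator(String label, TreeNode head) {
        this(label, head, null);
    }

    // Wraps a token followed by the Accumulator for the remaining tokens.
    Accumulator(String label, TreeNode head, Accumulator rest) {
        this.label = label;
        this.head = head;
        this.rest = rest;
    }
}
```

Because the rule recurses on its right-hand side, the result is effectively a linked list, which is why a simple loop over rest is enough to visit every token.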

Hope this helps people who run into the same confusion.

There's a more straightforward way to collect items into lists, because the system lets you intersperse target-language code at seemingly arbitrary positions in a pattern. The idiom is:

sequence returns [String k]
    :   (e=atom { $k = $e.k; })
        (e=atom { $k += ", " + $e.k; })*
        { $k = "sequence (" + $k + ")"; } ;

A string k is initialized to the k value of the first atom, and each subsequent atom is appended to k with +=. The snippet $e.k refers to an atom returns [String k] rule defined elsewhere. If there is no such rule, you can use the text property (i.e. $e.text), which tokens have. I'm not sure whether non-tokens have this property. If not, you can just do:

nonToken returns [String whatever] : e=TOKEN { $whatever = $e.text; } ;

Which you would then use in higher-level rules, e.g.

e=nonToken { System.out.println($e.whatever); }
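
To make the accumulation in the sequence rule concrete, here is a rough plain-Java equivalent of what its three embedded actions do as the parser matches each atom. This is not the code ANTLR generates; SequenceDemo and its sequence() method are hypothetical names, and the list of strings stands in for the k values returned by successive atom matches. Like the rule, it assumes at least one atom is present:

```java
import java.util.List;

class SequenceDemo {
    // Mirrors the three actions of the sequence rule, in the order they run.
    static String sequence(List<String> atoms) {
        // (e=atom { $k = $e.k; }) : the first atom initializes k.
        String k = atoms.get(0);
        // (e=atom { $k += ", " + $e.k; })* : each further atom is appended.
        for (int i = 1; i < atoms.size(); i++) {
            k += ", " + atoms.get(i);
        }
        // Final action: wrap the collected text.
        return "sequence (" + k + ")";
    }

    public static void main(String[] args) {
        System.out.println(sequence(List.of("a", "b", "c")));
    }
}
```

The point of the idiom is that each action runs at the moment its part of the pattern is matched, so the loop body corresponds to one iteration of the starred subrule.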
