How can I modify the text of tokens in a CommonTokenStream with ANTLR?

Question

I'm trying to learn ANTLR and, at the same time, use it for a current project.

I've gotten to the point where I can run the lexer on a chunk of code and feed its output into a CommonTokenStream. This works fine, and I've verified that the source text is being broken up into the appropriate tokens.

Now I would like to modify the text of certain tokens in this stream and display the modified source code.

For example, I tried:

import org.antlr.runtime.*;
import java.util.*;

public class LexerTest
{
    public static final int IDENTIFIER_TYPE = 4;

    public static void main(String[] args)
    {
        String input = "public static void main(String[] args) { int myVar = 0; }";
        CharStream cs = new ANTLRStringStream(input);

        JavaLexer lexer = new JavaLexer(cs);
        CommonTokenStream tokens = new CommonTokenStream();
        tokens.setTokenSource(lexer);

        int size = tokens.size();
        for(int i = 0; i < size; i++)
        {
            Token token = (Token) tokens.get(i);
            if(token.getType() == IDENTIFIER_TYPE)
            {
                token.setText("V");
            }
        }
        System.out.println(tokens.toString());
    }  
}

I'm trying to set the text of every identifier token to the string literal "V".

  1. Why are my changes to the tokens' text not reflected when I call tokens.toString()?

  2. How am I supposed to know the various token type IDs? I stepped through with my debugger and saw that the ID for IDENTIFIER tokens was 4 (hence the constant at the top). But how would I have known that otherwise? Is there some other way of mapping token type IDs to token names?

One thing that is important to me: I want the tokens to keep their original start and end character positions. That is, I don't want the positions to shift to reflect the variable names having been changed to "V". This is so that I know where the tokens were in the original source text.

Recommended answer

In ANTLR 4 there is a new facility that uses parse tree listeners and TokenStreamRewriter (note the name difference) to observe or transform trees. (The replies suggesting TokenRewriteStream apply to ANTLR 3 and will not work with ANTLR 4.)

In ANTLR 4, an XXXBaseListener class is generated for you, with callbacks for entering and exiting each non-terminal node in the grammar (e.g. enterClassDeclaration()).

You can use the listener in two ways:

  1. As an observer: override the callback methods to produce arbitrary output related to the input text. For example, override enterClassDeclaration() and output a line for each class declared in your program.

  2. As a transformer: use TokenStreamRewriter to modify the original text as it passes through. In the callback methods you use the rewriter to make modifications to tokens (insert, delete, replace), and at the end you ask the rewriter to output the modified text.
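To make the transformer idea concrete without requiring the ANTLR runtime or a generated grammar, here is a minimal, self-contained sketch of the underlying technique: edits are recorded against token indexes and applied only when the text is rendered, so the tokens themselves (including their original start and stop offsets) are never changed. ANTLR 4's TokenStreamRewriter works on the same principle, but every class, method, and constant below (RewriteSketch, Tok, Rewriter, lex, rename, IDENTIFIER, OTHER) is a hypothetical stand-in written for illustration, not part of the ANTLR API, and the toy lexer does not distinguish keywords from identifiers.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in classes for illustration only; this is NOT the ANTLR API.
class RewriteSketch {

    static final int IDENTIFIER = 1; // made-up token type IDs
    static final int OTHER = 2;

    /** An immutable token that remembers where it sat in the original input. */
    static final class Tok {
        final int type;
        final String text;
        final int start; // inclusive char offset in the original input
        final int stop;  // inclusive char offset in the original input

        Tok(int type, String text, int start, int stop) {
            this.type = type;
            this.text = text;
            this.start = start;
            this.stop = stop;
        }
    }

    /** Records replacements by token index; the token list is never modified. */
    static final class Rewriter {
        private final List<Tok> tokens;
        private final Map<Integer, String> replacements = new HashMap<>();

        Rewriter(List<Tok> tokens) {
            this.tokens = tokens;
        }

        void replace(int tokenIndex, String newText) {
            replacements.put(tokenIndex, newText);
        }

        /** Renders the text, substituting any recorded replacement. */
        String getText() {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < tokens.size(); i++) {
                String r = replacements.get(i);
                sb.append(r != null ? r : tokens.get(i).text);
            }
            return sb.toString();
        }
    }

    /** Toy lexer: identifier-like runs become IDENTIFIER, every other char is OTHER. */
    static List<Tok> lex(String input) {
        List<Tok> out = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            int start = i;
            if (Character.isJavaIdentifierStart(input.charAt(i))) {
                while (i < input.length() && Character.isJavaIdentifierPart(input.charAt(i))) {
                    i++;
                }
                out.add(new Tok(IDENTIFIER, input.substring(start, i), start, i - 1));
            } else {
                i++;
                out.add(new Tok(OTHER, input.substring(start, i), start, i - 1));
            }
        }
        return out;
    }

    /** Replaces every IDENTIFIER token's text in the rendered output. */
    static String rename(List<Tok> tokens, String replacement) {
        Rewriter rewriter = new Rewriter(tokens);
        for (int i = 0; i < tokens.size(); i++) {
            if (tokens.get(i).type == IDENTIFIER) {
                rewriter.replace(i, replacement);
            }
        }
        return rewriter.getText();
    }

    public static void main(String[] args) {
        List<Tok> tokens = lex("myVar = count + 1;");
        System.out.println(rename(tokens, "V"));      // prints "V = V + 1;"
        Tok count = tokens.get(4);                    // still the original token
        System.out.println(count.text + " @ " + count.start + ".." + count.stop); // prints "count @ 8..12"
    }
}
```

Because the replacements live in the rewriter rather than in the tokens, the original positions asked about in the question remain available after the rewrite. With the real ANTLR 4 classes, the equivalent calls would be made on a TokenStreamRewriter from inside the listener callbacks, as in the book examples linked below.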

For an example of how to do transformations, see the following files from the ANTLR 4 book:

https://github.com/mquinn/ANTLR4/blob/master/book_code/tour/InsertSerialIDListener.java

https://github.com/mquinn/ANTLR4/blob/master/book_code/tour/InsertSerialID.java
