ANTLR4: dynamically inject token

Question

So I'm writing a Python parser and I need to dynamically generate INDENT and DEDENT tokens (because Python doesn't use explicit delimiters) according to the Python grammar specification.

Basically I have a stack of integers representing indentation levels. In an embedded Java action in the INDENT token, I check if the current level of indentation is higher than the level on top of the stack; if it is, I push it on; if not, I call skip().

The problem is, if the current indentation level matches a level multiple levels down in the stack, I have to generate multiple DEDENT tokens, and I can't figure out how to do that.

My current code: (note that within_indent_block and current_indent_level are managed elsewhere)

fragment DENT: {within_indent_block}? (SPACE|TAB)+;

INDENT: {within_indent_block}? DENT
        {if(current_indent_level > whitespace_stack.peek().intValue()){
                 whitespace_stack.push(new Integer(current_indent_level));
                 within_indent_block = false;
         }else{
                 skip();
         }
         }
         ;    

DEDENT: {within_indent_block}? DENT
        {if(current_indent_level < whitespace_stack.peek().intValue()){
            while(current_indent_level < whitespace_stack.peek().intValue()){
                      whitespace_stack.pop();
                      <<injectDedentToken()>>; //how do I do this
            }
         }else{
               skip();
         }
         }
         ;

How do I do this and / or is there a better way?

Recommended answer

There are a few problems with the code you have posted.

  1. The INDENT and DEDENT rules are semantically identical (considering predicates and rule references, but ignoring actions). Since INDENT appears first, this means you can never have a token produced by the DEDENT rule in this grammar.
  2. The {within_indent_block}? predicate appears before you reference DENT as well as inside the DENT fragment rule itself. This duplication serves no purpose but will slow down your lexer.

The actual handling of post-matching actions is best placed in an override of Lexer.nextToken(). For example, you could start with something like the following.

// Lexer member code; needs imports for java.util.ArrayDeque and java.util.Deque.
private final Deque<Token> pendingTokens = new ArrayDeque<>();

@Override
public Token nextToken() {
    while (pendingTokens.isEmpty()) {
        Token token = super.nextToken();
        switch (token.getType()) {
        case INDENT:
            // handle indent here. to skip this token, simply don't add
            // anything to the pendingTokens queue and super.nextToken()
            // will be called again.
            break;

        case DEDENT:
            // handle dedent here. to emit several DEDENT tokens, add one
            // per popped indentation level to the pendingTokens queue. to
            // skip this token, add nothing and super.nextToken() will be
            // called again.
            break;

        default:
            pendingTokens.add(token);
            break;
        }
    }

    return pendingTokens.poll();
}
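
To make the multiple-DEDENT case concrete, here is a minimal sketch of the stack bookkeeping on its own, without any ANTLR dependency. The class IndentTracker and its methods are hypothetical names invented for this illustration, and tokens are represented as plain strings. It assumes the width of each line's leading whitespace has already been measured, and that column 0 is always on the stack.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical helper (not part of ANTLR): turns indentation widths into INDENT/DEDENT events. */
public class IndentTracker {
    // Open indentation levels; the top of the stack is the current level.
    private final Deque<Integer> indentStack = new ArrayDeque<>();
    // Events waiting to be handed out, playing the role of pendingTokens above.
    private final Deque<String> pendingTokens = new ArrayDeque<>();

    public IndentTracker() {
        indentStack.push(0); // assumption: column 0 is always an open level
    }

    /** Call once per line with the width of its leading whitespace. */
    public void onLineStart(int width) {
        if (width > indentStack.peek()) {
            indentStack.push(width);
            pendingTokens.add("INDENT");
            return;
        }
        // Queue one DEDENT for every level that this line closes.
        while (width < indentStack.peek()) {
            indentStack.pop();
            pendingTokens.add("DEDENT");
        }
        if (width != indentStack.peek()) {
            throw new IllegalStateException("inconsistent dedent to column " + width);
        }
    }

    /** Returns the next queued event, or null if none is pending. */
    public String poll() {
        return pendingTokens.poll();
    }

    /** Convenience driver: feeds the widths in order and collects every event. */
    public static List<String> tokensFor(int... widths) {
        IndentTracker tracker = new IndentTracker();
        List<String> out = new ArrayList<>();
        for (int w : widths) {
            tracker.onLineStart(w);
            for (String tok = tracker.poll(); tok != null; tok = tracker.poll()) {
                out.add(tok);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Lines indented 0, 4, 8 and then 0 columns: the last line closes two levels.
        System.out.println(tokensFor(0, 4, 8, 0));
    }
}
```

In the nextToken() override, the same loop would presumably add real Token objects to pendingTokens instead of strings, for example by constructing a CommonToken with the DEDENT type. How those tokens get their text and position is left open here.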
