ANTLR4: dynamically inject token
Question
So I'm writing a Python parser and I need to dynamically generate INDENT and DEDENT tokens (because Python doesn't use explicit delimiters) according to the Python grammar specification.
Basically I have a stack of integers representing indentation levels. In an embedded Java action in the INDENT token, I check if the current level of indentation is higher than the level on top of the stack; if it is, I push it on; if not, I call skip().
The problem is, if the current indentation level matches a level multiple levels down in the stack, I have to generate multiple DEDENT tokens, and I can't figure out how to do that.
My current code (note that within_indent_block and current_indent_level are managed elsewhere):
fragment DENT: {within_indent_block}? (SPACE|TAB)+;

INDENT: {within_indent_block}? DENT
    {
        if (current_indent_level > whitespace_stack.peek().intValue()) {
            whitespace_stack.push(new Integer(current_indent_level));
            within_indent_block = false;
        } else {
            skip();
        }
    }
    ;

DEDENT: {within_indent_block}? DENT
    {
        if (current_indent_level < whitespace_stack.peek().intValue()) {
            while (current_indent_level < whitespace_stack.peek().intValue()) {
                whitespace_stack.pop();
                <<injectDedentToken()>>; // how do I do this
            }
        } else {
            skip();
        }
    }
    ;
How do I do this, and/or is there a better way?
Recommended answer
There are a few problems with the code you have posted.
- The INDENT and DEDENT rules are semantically identical (considering predicates and rule references, but ignoring actions). Since INDENT appears first, this means you can never have a token produced by the DEDENT rule in this grammar.
- The {within_indent_block}? predicate appears before you reference DENT as well as inside the DENT fragment rule itself. This duplication serves no purpose but will slow down your lexer.
The actual handling of post-matching actions is best placed in an override of Lexer.nextToken(). For example, you could start with something like the following.
// Requires java.util.ArrayDeque and java.util.Deque.
private final Deque<Token> pendingTokens = new ArrayDeque<>();

@Override
public Token nextToken() {
    while (pendingTokens.isEmpty()) {
        Token token = super.nextToken();
        switch (token.getType()) {
        case INDENT:
            // handle indent here. to skip this token, simply don't add
            // anything to the pendingTokens queue and super.nextToken()
            // will be called again.
            break;
        case DEDENT:
            // handle dedent here. to skip this token, simply don't add
            // anything to the pendingTokens queue and super.nextToken()
            // will be called again.
            break;
        default:
            pendingTokens.add(token);
            break;
        }
    }
    return pendingTokens.poll();
}
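
The part the question asks about, emitting several DEDENT tokens for one change in indentation, follows from the pending-token queue above: each level popped from the stack adds one entry to the queue. Below is a minimal sketch of that bookkeeping on its own, without the ANTLR runtime. The class IndentTracker, its methods, and the use of plain strings in place of Token objects are all hypothetical names chosen for illustration; they are not part of ANTLR or of the original answer.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical stand-in for the lexer's indentation handling; not an ANTLR class. */
class IndentTracker {
    // Indentation widths of the currently open blocks; 0 is the top level.
    private final Deque<Integer> indentStack = new ArrayDeque<>();
    // Tokens produced but not yet handed to the caller (strings stand in for Token objects).
    private final Deque<String> pendingTokens = new ArrayDeque<>();

    IndentTracker() {
        indentStack.push(0);
    }

    /** Queues the INDENT/DEDENT tokens implied by the indentation width of a new line. */
    void onLineStart(int level) {
        if (level > indentStack.peek()) {
            indentStack.push(level);
            pendingTokens.add("INDENT");
            return;
        }
        // Pop every open block deeper than the new level, queuing one DEDENT per block.
        while (indentStack.size() > 1 && level < indentStack.peek()) {
            indentStack.pop();
            pendingTokens.add("DEDENT");
        }
        // The new level must match an enclosing block exactly.
        if (level != indentStack.peek()) {
            throw new IllegalStateException("inconsistent dedent to width " + level);
        }
    }

    /** Returns the next queued token, or null when the queue is empty. */
    String nextToken() {
        return pendingTokens.poll();
    }

    /** Convenience helper: feeds a sequence of line indentations and collects every token. */
    static List<String> tokensFor(int... levels) {
        IndentTracker tracker = new IndentTracker();
        List<String> out = new ArrayList<>();
        for (int level : levels) {
            tracker.onLineStart(level);
            for (String t = tracker.nextToken(); t != null; t = tracker.nextToken()) {
                out.add(t);
            }
        }
        return out;
    }
}
```

In the real lexer, the same while loop would presumably sit in the DEDENT case of nextToken(), adding one token object to pendingTokens per popped level, so that the parser receives them one at a time on subsequent calls.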