What is the ANTLR4 equivalent of a ! in a lexer rule?
Question
I'm working on converting an old ANTLR 2 grammar to ANTLR 4, and I'm having trouble with the string rule.
STRING :
'\''!
(
~('\'' | '\\' | '\r' | '\n')
)*
'\''!
;
This creates a STRING token whose text contains the contents of the string but not the starting and ending quotes, because of the ! symbol after each quote literal.
ANTLR 4 chokes on the ! symbol ('!' came as a complete surprise to me (AC0050)), but if I leave it off, I end up with tokens that contain the quotes, which is not what I want. What's the correct way to port this to ANTLR 4?
Recommended answer
ANTLR 4 generally treats tokens as immutable, at least in the sense that there is no support for a language-neutral equivalent of !.
Perhaps the simplest way to accomplish the equivalent is:
string : str=STRING { Strings.unquote($str); } ;
STRING : SQuote ~[\r\n\\']* SQuote ;
fragment SQuote : '\'' ;
where Strings.unquote is:
public static void unquote(Token token) {
CommonToken ct = (CommonToken) token;
String text = ct.getText();
text = .... unquote it ....
ct.setText(text);
}
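The body of the unquoting step is elided above. As a minimal sketch of that one step, assuming the token text always begins and ends with a single quote (which the STRING rule guarantees) and that no escape processing is wanted, the text transformation might look like the following. The class name QuoteStripper and the method stripQuotes are hypothetical names chosen for illustration; they are not part of the ANTLR runtime.

```java
/** Hypothetical helper for illustration; not part of the ANTLR runtime. */
final class QuoteStripper {
    private QuoteStripper() {}

    /**
     * Returns the text without its first and last character when both are
     * single quotes. Any other input (including null) is returned unchanged.
     */
    static String stripQuotes(String text) {
        if (text != null
                && text.length() >= 2
                && text.charAt(0) == '\''
                && text.charAt(text.length() - 1) == '\'') {
            return text.substring(1, text.length() - 1);
        }
        return text;
    }
}
```

Under these assumptions, the elided line in Strings.unquote could call QuoteStripper.stripQuotes(text) before passing the result to ct.setText(text).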
The reason for using a parser rule is that attribute references are not (currently) supported in the lexer. Still, it could be done in the lexer rule; it would just take slightly more effort to get at the token.
An alternative to modifying the token text is to implement a custom token with custom fields and methods. See this answer if interested.