What is the ANTLR4 equivalent of a ! in a lexer rule?
Question
I'm working on converting an old ANTLR 2 grammar to ANTLR 4, and I'm having trouble with the string rule.
STRING :
'\''!
(
~('\'' | '\\' | '\r' | '\n')
)*
'\''!
;
This creates a STRING token whose text contains the contents of the string, but does not contain the starting and ending quotes, because of the ! symbol after the quote literals.
ANTLR 4 chokes on the ! symbol ('!' came as a complete surprise to me (AC0050)), but if I leave it off, I end up with tokens that contain the quotes, which is not what I want. What's the correct way to port this to ANTLR 4?
Recommended answer
ANTLR 4 generally treats tokens as being immutable, at least in the sense that there is no support for a language-neutral equivalent of !.
Perhaps the simplest way to accomplish the equivalent is:
string : str=STRING { Strings.unquote($str); } ;
STRING : SQuote ~[\r\n\\']* SQuote ;
fragment SQuote : '\'' ;
where Strings.unquote is:
public static void unquote(Token token) {
    CommonToken ct = (CommonToken) token;
    String text = ct.getText();
    text = .... unquote it ....
    ct.setText(text);
}
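The unquoting step itself is elided in the method above. As a minimal sketch, assuming the token text always has the shape matched by the STRING rule (one single quote on each side), it might look like the following. The class name Unquoter and the method unquoteText are hypothetical names used only for illustration; they are not part of the original answer or of the ANTLR runtime.

```java
// Hypothetical helper: strips one leading and one trailing single quote.
final class Unquoter {
    private Unquoter() {}

    /** Returns the text without its surrounding single quotes, or unchanged if it is not quoted. */
    static String unquoteText(String text) {
        boolean quoted = text.length() >= 2
                && text.charAt(0) == '\''
                && text.charAt(text.length() - 1) == '\'';
        // substring(1, length - 1) drops exactly the first and last character
        return quoted ? text.substring(1, text.length() - 1) : text;
    }
}
```

Inside Strings.unquote, the result of such a helper would then be passed to ct.setText(...). The length check is there only so that malformed input is returned unchanged instead of throwing.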
A parser rule is used because attribute references are not (currently) supported in the lexer. The same thing could still be done in the lexer rule; it would just take slightly more effort to get at the token.
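For the lexer-side variant, one approach commonly used with the Java target is an embedded action that overrides the text of the token being built, using the lexer's getText() and setText() methods. This is a sketch rather than the answer author's code, and the action body is Java, so the grammar is no longer target-neutral:

```antlr
// Sketch, Java target assumed: the action runs after the rule has matched,
// and setText() replaces the text that the emitted token will carry.
STRING : SQuote ~[\r\n\\']* SQuote
         { setText(getText().substring(1, getText().length() - 1)); }
       ;
fragment SQuote : '\'' ;
```

With this variant the parser rule no longer needs its own action, since the STRING token already arrives without quotes.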
An alternative to modifying the token text is to implement a custom token with custom fields and methods. See this answer if of interest.