Antlr Lexer: exclude a certain pattern
Question
In an Antlr lexer, how can I match a token like this:
A word made of any non-whitespace characters, but without the sequence '.{' anywhere inside it. The best I can come up with is a semantic predicate:
WORD: WL+ {!getText().contains(".{")}?;
WL: ~[ \n\r\t];
I'm a bit worried about using a semantic predicate, though: WORD will be lexed millions of times, and I expect the predicate to hurt performance.
This comes from a requirement to parse input such as:
TOKEN_ONE.{TOKEN_TWO}
where TOKEN_ONE may itself contain '.' and '{' among its characters.
I'm using Antlr 4.
Recommended answer
You need to limit predicate evaluation to the case immediately following a '.' in the input. That way the check runs only when a '.' has just been consumed, not on every character of every word:
WORD
    : ( ~[. \t\r\n]
      | '.' {_input.LA(1)!='{'}?
      )+
    ;
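To make the rule's behavior concrete, here is a minimal plain-Java sketch that mimics what the WORD rule accepts. It is not ANTLR-generated code and does not use the ANTLR runtime. The class name WordRuleSketch and the method matchWord are hypothetical, introduced only for illustration. The sketch shows that the one-character lookahead is needed only when the current character is '.', which is the point of the answer.

```java
// Plain-Java sketch of the WORD rule's matching logic (an illustration,
// not what ANTLR generates). All names here are hypothetical.
final class WordRuleSketch {

    // Returns the index just past the longest WORD starting at `start`,
    // or `start` itself if no WORD can begin there.
    static int matchWord(String input, int start) {
        int i = start;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == ' ' || c == '\t' || c == '\r' || c == '\n') {
                break; // whitespace never belongs to a WORD
            }
            if (c == '.') {
                // Lookahead happens only after a dot; this mirrors
                // the predicate {_input.LA(1)!='{'}? in the grammar.
                boolean nextIsBrace =
                        i + 1 < input.length() && input.charAt(i + 1) == '{';
                if (nextIsBrace) {
                    break; // ".{" ends the WORD before the dot
                }
            }
            i++; // any other character, or a dot not followed by '{'
        }
        return i;
    }

    public static void main(String[] args) {
        String input = "TOKEN.ONE.{TOKEN_TWO}";
        int end = matchWord(input, 0);
        System.out.println(input.substring(0, end)); // prints TOKEN.ONE
    }
}
```

For every character other than '.', the loop does no extra work, so the cost of the check scales with the number of dots in the input rather than with the number of words.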