Antlr: how to match everything between the other recognized tokens?
Question
How do I match all of the leftover text between the other tokens in my lexer?
Here is my code:
grammar UserQuery;
expr: expr AND expr
| expr OR expr
| NOT expr
| TEXT+
| '(' expr ')'
;
OR : 'OR';
AND : 'AND';
NOT : 'NOT';
LPAREN : '(';
RPAREN : ')';
TEXT: .+?;
When I run the lexer on "xx AND yy", I get these tokens:
x type:TEXT
x type:TEXT
type:TEXT
AND type:'AND'
type:TEXT
y type:TEXT
y type:TEXT
This sort-of works, except that I don't want each character to be a token. I'd like to consolidate all of the leftover text into a single TEXT token.
Recommended answer
I don't think this is possible without a delimiter. Otherwise a greedy lexer rule would match all of your input, including your explicit tokens, because the longest match wins among lexer rules.
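To make the longest-match point concrete, here is a minimal Python sketch of a toy lexer. It is not ANTLR, and the function and rule names are made up for illustration. It assumes the usual convention that the longest match wins and that ties go to the rule listed first. With a greedy catch-all, the whole input collapses into one TEXT token. With a non-greedy one, each leftover character becomes its own token, as in the question.

```python
import re


def compile_rules(rules):
    """Compile (name, pattern) pairs; DOTALL lets '.' match newlines too."""
    return [(name, re.compile(pattern, re.DOTALL)) for name, pattern in rules]


def longest_match_tokenize(text, rules):
    """Toy lexer: at each position try every rule and keep the longest match.

    The strict '>' comparison means a tie goes to the rule listed first.
    """
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, regex in rules:
            m = regex.match(text, pos)
            if m and len(m.group()) > best_len:
                best_name, best_len = name, len(m.group())
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens


# Hypothetical rule tables mirroring the token names in the grammar.
KEYWORDS = [("OR", r"OR"), ("AND", r"AND"), ("NOT", r"NOT"),
            ("LPAREN", r"\("), ("RPAREN", r"\)")]

# Greedy catch-all: always the longest candidate, so it swallows everything.
GREEDY_RULES = compile_rules(KEYWORDS + [("TEXT", r".+")])

# Non-greedy catch-all: stops after one character, so keywords can still win.
PER_CHAR_RULES = compile_rules(KEYWORDS + [("TEXT", r".+?")])

print(longest_match_tokenize("xx AND yy", GREEDY_RULES))
print(longest_match_tokenize("xx AND yy", PER_CHAR_RULES))
```

The second call yields seven tokens (x, x, space, AND, space, y, y), which is the same shape as the output shown in the question.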
Now, if you can accept that a delimiter is needed to delineate the text, and the addition of a simple whitespace rule to handle the spaces in between, then you get something like this:
[@0,0:14=''longest token'',<TEXT>,1:0]
[@1,16:18='AND',<'AND'>,1:16]
[@2,20:23=''yy'',<TEXT>,1:20]
[@3,24:23='<EOF>',<EOF>,1:24]
From this grammar:
grammar UserQuery;
expr: expr AND expr
| expr OR expr
| NOT expr
| TEXT
| '(' expr ')'
;
OR : 'OR';
AND : 'AND';
NOT : 'NOT';
LPAREN : '(';
RPAREN : ')';
TEXT : '\'' .*? '\'';
WS: [ \t\r\n] -> skip;
With this input:
'longest token' AND 'yy'
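For comparison, the same delimiter idea can be approximated outside ANTLR. The following is only a rough Python sketch with a hypothetical `tokenize_quoted` helper, not what ANTLR does internally. It treats everything between single quotes as one TEXT token and skips whitespace, and it reports start and stop offsets in the same style as the token dump above.

```python
import re

# One named group per token type. Regex alternation picks the first
# alternative that matches (not the longest), which is fine here because
# the alternatives cannot overlap at the same starting character.
TOKEN_RE = re.compile(r"""
    (?P<TEXT>'.*?')        # quoted text, non-greedy up to the next quote
  | (?P<OR>OR)
  | (?P<AND>AND)
  | (?P<NOT>NOT)
  | (?P<LPAREN>\()
  | (?P<RPAREN>\))
  | (?P<WS>[ \t\r\n])      # whitespace, skipped below
""", re.VERBOSE | re.DOTALL)


def tokenize_quoted(text):
    """Return (type, text, start, stop) tuples, with stop inclusive."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at position {pos}")
        if m.lastgroup != "WS":  # mimic '-> skip' for whitespace
            tokens.append((m.lastgroup, m.group(), m.start(), m.end() - 1))
        pos = m.end()
    return tokens


for token in tokenize_quoted("'longest token' AND 'yy'"):
    print(token)
```

Unquoted text such as xx raises an error in this sketch, which reflects the trade-off of the approach: leftover text has to be quoted to be recognized.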
It's very similar to the way comments and strings are often handled in programming languages, where there's a starting and ending delimiter and everything in between is tokenized as one big token. Often with comments we'd discard them, but here we keep them as we would a string. Hope this helps.