Antlr (lexer): matching the right token

Problem description

In my Antlr3 grammar, I have several "overlapping" lexer rules, like this:

NAT: ('0' .. '9')+ ;
INT: ('+' | '-')? ('0' .. '9')+ ;
BITVECTOR: ('0' | '1')* ;

Although tokens like 100110 and 123 can be matched by more than one of those rules, it is always determined by context which of them it has to be. Example:

s: a | b | c ;
a: '<' NAT '>' ;
b: '{' INT '}' ;
c: '[' BITVECTOR ']' ;

The input {17} should then be tokenized as {, INT, and }, but the lexer has already decided that 17 is a NAT token. How can I prevent this behavior? The backtrack option is already set to true, but it only seems to affect parser rules.

Recommended answer

There might be a complex way to make the lexer context-sensitive, but in general that's what you want the parser to take care of, and you want your lexer to just provide a stream of tokens. My recommendation is to refactor your lexer to return DIGITS and SIGN and let your parser work out what kind of number the digits represent by the context.
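
As a rough illustration of that refactoring, the grammar from the question could be reshaped along the following lines. This is only a sketch, not a tested grammar: the grammar name Numbers and the parser rules nat, integer, and bitvector are names made up here, and only DIGITS and SIGN come from the answer itself.

```antlr
grammar Numbers;   // hypothetical grammar name

// Parser rules: the surrounding brackets now decide what the digits mean.
s : a | b | c ;
a : '<' nat '>' ;
b : '{' integer '}' ;
c : '[' bitvector ']' ;

nat       : DIGITS ;
integer   : SIGN? DIGITS ;
// The original BITVECTOR rule also allowed an empty match, hence the '?'.
// Restricting the digits to 0 and 1 is NOT enforced here; it would need a
// semantic predicate or a check after parsing.
bitvector : DIGITS? ;

// Lexer rules: no overlap any more, so 17 is always a single DIGITS token.
SIGN   : '+' | '-' ;
DIGITS : ('0' .. '9')+ ;
```

With this shape, the input {17} would reach the parser as '{', DIGITS, '}' and be matched by rule b through integer. One trade-off of the design is that the sign and the digits become separate tokens, so if a whitespace-skipping rule is added later, an input such as {+ 17} would also be accepted unless that case is rejected explicitly.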
