yacc-没有运算符的规则的优先顺序? [英] yacc - Precedence of a rule with no operator?

查看:217
本文介绍了yacc-没有运算符的规则的优先顺序?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

考虑使用yacc解析正则表达式(我实际上使用的是PLY),其中一些规则如下所示:

Thinking about parsing regular expressions using yacc (I'm actually using PLY), some of the rules would be like the following:

expr : expr expr
expr : expr '|' expr
expr : expr '*'

问题在于,第一个规则(串联)必须优先于第二个规则,而不是第三个规则.

The problem is, the first rule(concatenation) must take precedence over the second rule, but not the third one.

但是,串联规则中没有运算符.

However, the concatenation rule has no operator in it.

在这种情况下如何正确指定优先级?

How can I specify the precedence correctly in this case?

谢谢!

我修改了规则以避免出现此问题,但我仍然好奇问题出在什么地方.

I modified the rules to avoid the issue, but I'm still curious what was the problem.

这是源代码:

tokens = ['PLEFT', 'PRIGHT', 'BAR', 'ASTERISK', 'NORMAL']

t_PLEFT = r'\('
t_PRIGHT = r'\)'
t_BAR = r'\|'
t_ASTERISK = '\*'
t_NORMAL = r'[a-zA-Z0-9]'

lex.lex()

precedence = (
  ('left', 'BAR'),
  ('left', 'CONCAT'),
  ('left', 'ASTERISK'),
)

def p_normal(p):
    '''expr : NORMAL'''
    p[0] = p[1]

def p_par(p):
    '''expr : PLEFT expr PRIGHT'''
    p[0] = p[2]

def p_or(p):
    '''expr : expr BAR expr'''
    p[0] = ('|', p[1], p[3])

def p_concat(p):
    '''expr : expr expr %prec CONCAT'''
    p[0] = ('CONCAT', p[1], p[2])

def p_repeat(p):
    '''expr : expr ASTERISK'''
    p[0] = ('*', p[1])

yacc.yacc()

它对'ab|cd*'的解析结果是('CONCAT', ('|', ('CONCAT', 'a', 'b'), 'c'), ('*', 'd')).

推荐答案

您没有义务使用优先权来消除歧义;您只需编写明确的语法即可:

You are under no obligation to use precedence to disambiguate; you can simply write an unambiguous grammar:

term : CHAR | '(' expr ')'
rept : term | term '*' | term '+' | term '?'
conc : rept | conc rept
expr : conc | expr '|' conc

如果您真的想使用优先级,则可以使用带有%prec批注的虚拟"令牌.有关详细信息,请参见手册. (此功能来自yacc,因此您也可以在任何yacc/bison文档中阅读它.)

If you really want to use precedence, you can use a "fictitious" token with a %prec annotation. See the manual for details. (This feature comes from yacc, so you could read about it in any yacc/bison documentation as well.)

请记住,优先级始终是生产(在解析器堆栈的顶部)和超前标记之间的比较.通常,生产的优先级取自生产中最后一个终端的优先级(通常,每个适用的生产中只有一个终端),因此似乎是终端之间的比较.但是,为了获得使用不可见"运算符的优先级,您需要分别考虑生产优先级和前瞻性令牌优先级.

Bear in mind that precedence is always a comparison between a production (at the top of the parser stack) and the lookahead token. Normally, the precedence of productions is taken from the precedence of the last terminal in the production (and normally there is only one terminal in each applicable production), so it appears to be a comparison between terminals. But in order to get precedence to work with "invisible" operators, you need to separately consider both the production precedence and the lookahead token precedence.

如上所述,可以使用虚拟"令牌设置生产的优先级.但是没有对应于不可见运算符的超前标记.前瞻标记将是以下操作数中的第一个标记.换句话说,它可以是expr FIRST 集中的任何令牌,在本例中为{NORMAL, PRIGHT};必须将此集合添加到优先级声明中,就像它们是串联运算符一样:

The precedence of the production can be set with a "fictitious" token, as described above. But there is no lookahead token corresponding to an invisible operator; the lookahead token will be the first token in the following operand. In other words, it could be any token in the FIRST set of expr, which in this case is {NORMAL, PRIGHT}; this set must be added to the precedence declaration as though they were concatenation operators:

precedence = (
  ('left', 'BAR'),
  ('left', 'CONCAT', 'NORMAL', 'PLEFT'),
  ('left', 'ASTERISK'),
)

一旦这样做,您就可以节省虚拟的CONCAT令牌,因为您可以使用任何FIRST(expr)令牌作为代理,但这可能被认为可读性较低.

Once you do that, you could economize on the fictitious CONCAT token, since you could use any of the FIRST(expr) tokens as a proxy, but that might be considered less readable.

这篇关于yacc-没有运算符的规则的优先顺序?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆