Python/YACC:解决转变/减少冲突 [英] Python/YACC: Resolving a shift/reduce conflict
问题描述
我正在使用PLY.这是我来自 parser.out 的状态之一:
I'm using PLY. Here is one of my states from parser.out:
state 3
(5) course_data -> course .
(6) course_data -> course . course_list_tail
(3) or_phrase -> course . OR_CONJ COURSE_NUMBER
(7) course_list_tail -> . , COURSE_NUMBER
(8) course_list_tail -> . , COURSE_NUMBER course_list_tail
! shift/reduce conflict for OR_CONJ resolved as shift
$end reduce using rule 5 (course_data -> course .)
OR_CONJ shift and go to state 7
, shift and go to state 8
! OR_CONJ [ reduce using rule 5 (course_data -> course .) ]
course_list_tail shift and go to state 9
我想将其解析为:
if OR_CONJ is followed by COURSE_NUMBER:
shift and go to state 7
else:
reduce using rule 5 (course_data -> course .)
如何解决我的解析器文件以反映这一点?我需要通过回溯并尝试其他规则来处理语法错误吗?
How can I fix my parser file to reflect this? Do I need to handle a syntax error by backtracking and trying a different rule?
文档说:
然后使用这些值附加一个 数值优先值和 各个方向的关联性 语法规则.这总是 通过查看确定 最右边终端的优先级 符号.
These values are then used to attach a numerical precedence value and associativity direction to each grammar rule. This is always determined by looking at the precedence of the right-most terminal symbol.
如果规则没有终端怎么办?
What if the rule has no terminals?
更新:完整的语法:
Grammar
Rule 0 S' -> statement
Rule 1 statement -> course_data
Rule 2 or_phrase -> statement OR_CONJ statement
Rule 3 or_phrase -> course OR_CONJ COURSE_NUMBER
Rule 4 statement -> or_phrase
Rule 5 course_data -> course
Rule 6 course_data -> course course_list_tail
Rule 7 course_list_tail -> , COURSE_NUMBER
Rule 8 course_list_tail -> , COURSE_NUMBER course_list_tail
Rule 9 course -> DEPT_CODE COURSE_NUMBER
推荐答案
您的基本问题是,需要两个先行标记来完成所需的操作-到目前为止看到的输入是course
,而先行输入是OR_CONJ
,您不知道是将course
减小为course_data
还是进行移位,而无需在OR_CONJ
之后的标记中预找两个标记.您可以通过多种方式处理此问题
Your basic problem is that you need two tokens of lookahead to do what you want -- when the input seen so far is a course
and the lookahead is a OR_CONJ
you don't know whether to reduce the course
to a course_data
or shift without looking ahead two tokens to the token after the OR_CONJ
. There are a number of ways you can deal with this
-
使用LR(2)或LR(k)或GLR解析器生成器-任何人都可以处理.
use an LR(2) or LR(k) or GLR parser generator -- any can deal with this.
使用词法分析器技巧进行前瞻-基本上是让词法分析器根据以下令牌是否为COURSE_NUMBER
来返回两个不同的OR_CONJ
令牌.
use a lexer hack to do the lookahead -- basically have the lexer return two different OR_CONJ
tokens depending on whether the following token is a COURSE_NUMBER
or not.
分解语法以消除冲突,这可能会导致语法解析出与所需内容略有不同的内容(需要进行额外的解析后检查以拒绝某些无效的构造),并且通常会使语法很难理解.
factor the grammar to get rid of the conflict, which may result in a grammar that parses something slightly different from what you want (need some extra post-parse checks to reject some invalid constructs) and will generally make the grammar much harder to understand.
请注意,给定的语法也与模棱两可地关联单个语句中的三个或更多课程有关.通过将语法重写为更清晰的左递归形式,可以轻松解决此问题:
Note that your grammar as given is also ambiguous related to which way three or more courses connected in a single statement associate. This is easily fixed by rewriting the grammar into a clearer left-recursive form:
Rule 1 statement -> course
Rule 2 statement -> statement OR_CONJ course
Rule 3 course -> DEPT_CODE course_list
Rule 4 course -> DEPT CODE course_list OR_CONJ COURSE_NUMBER
Rule 5 course_list -> COURSE_NUMBER
Rule 6 course_list -> course_list , COURSE_NUMBER
对于LL解析器生成器,也可以将其重写为右递归,但是仍然存在2令牌超前问题.重构它以使其消失的一种方法是使COURSE_NUMBER
本身成为有效的course
,并在通过后将其与先前的course
重组(或者如果第一个course
给出错误,则给出错误)在statement
中).然后规则4变为:
This could also be rewritten as right-recursive for an LL parser generator, but it still has the 2-token lookahead problem. One way of refactoring it to make that go away would be to make COURSE_NUMBER
by itself a valid course
and recombine it with the previous course
in a post-pass (or give an error if its the first course
in a statement
). Then rule 4 becomes:
Rule 4 course -> COURSE_NUMBER
,您没有冲突.
这篇关于Python/YACC:解决转变/减少冲突的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!