如何重写上下文无关的语法,使其成为LR(1)? [英] How do I rewrite a context free grammar so that it is LR(1)?

查看:112
本文介绍了如何重写上下文无关的语法,使其成为LR(1)?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

对于给定的上下文自由语法:

For the given context free grammar:

S -> G $
G -> PG | P
P -> id : R
R -> id R | epsilon

如何重写语法使其为LR(1)?

在解析输入 id:.id(其中。)时,当前语法具有移位/减少冲突。是解析器的输入指针。

此语法产生满足正则表达式(id:(id)*)+

How do I rewrite the grammar so that it is LR(1)?
The current grammar has shift/reduce conflicts when parsing the input "id : .id", where "." is the input pointer for the parser.
This grammar produces the language satisfying the regular expression (id:(id)*)+

推荐答案

很容易为同一语言生成LR(1)语法。诀窍是找到一个具有类似解析树的树,或者至少可以从中轻松地恢复原始解析树。

It's easy enough to produce an LR(1) grammar for the same language. The trick is finding one which has a similar parse tree, or at least from which the original parse tree can be recovered easily.

这里是一个人工生成的语法,略从通用算法简化。实际上,我们重写了正则表达式:

Here's a manually generated grammar, which is slightly simplified from the general algorithm. In effect, we rewrite the regular expression:

(id:id*)+

至:

id(:id+)*:id*

引出语法:

S  → id G $ 
G  → P G | P'
P' → : R'
P  → : R
R' → ε | id R'
R  → ε | id R

这是LALR(1)。

实际上,我们只是将所有作品的令牌向右移,并且有一个通用算法可用于创建 LR(1) k≥ 1 LR(k + 1)语法的语法。 (我使用的该算法的版本来自S. Sippu& E. Soisalon-Soininen,第二卷,第6.7节的解析理论。)

In effect, we've just shifted all the productions one token to the right, and there is a general algorithm which can be used to create an LR(1) grammar from an LR(k+1) grammar for any k≥1. (The version of this algorithm I'm using comes from Parsing Theory by S. Sippu & E. Soisalon-Soininen, Vol II, section 6.7.)

新语法的非结尾形式为(x,V,y)其中 V 是原始语法的符号(终端或非终端),而 x y 是最大长度 k 的终端序列,例如:

The non-terminals of the new grammar will have the form (x, V, y) where V is a symbol from the original grammar (either a terminal or a non-terminal) and x and y are terminal sequences of maximum length k such that:

y ∈ FOLLOWk(V)
x ∈ FIRSTk(Vy)

(长度 y ,因此,如果包括输入末尾,则 x 可能小于 k 在下面的示例中,有些人通过添加 k 结束符号来避免此问题,但我认为此版本非常简单。)

(The lengths of y and consequently x might be less than k if the end of input is included in the follow set. Some people avoid this issue by adding k end symbols, but I think this version is just as simple.)

一个非终结的(x,V,y)将生成 x 的从原始语法的 Vy 派生的字符串。非正式地,整个语法右移 k 标记。每个非终结符都匹配一个字符串,该字符串缺少第一个 k 令牌,但随后的 k 令牌进行了扩充。

A non-terminal (x, V, y) will generate the x-derivative of the strings derived from Vy from the original grammar. Informally, the entire grammar is shifted k tokens to the right; each non-terminal matches a string which is missing the first k tokens but is augmented with the following k tokens.

这些产品是从原始产品中机械生成的。首先,我们添加一个新的开始符号 S'并使用以下产品:

The productions are generated mechanically from the original productions. First, we add a new start symbol, S' with productions:

S' → x (x, S, ε)

x ∈ FIRST k (S)。然后,对于每个产品

for every x ∈ FIRSTk(S). Then, for every production

T → V0 V1 … Vm

我们生成生产集:

(x0,T,xm+1) → (x0,V0,x1) (x1,V1,x2) … (xm,Vm,xm+1)

,并为每个终端 A 生成一组生产

and for every terminal A we generate the set of productions

(Ax,A,xB) → B    if |x| = k
(Ax,A,x) → ε     if |x| ≤ k

由于从新语法的产生式到旧语法的产生式都有明显的同态性,我们可以直接创建原始的解析树,尽管我们需要对语义值进行一些技巧才能将它们正确地附加到解析树上。

Since there is an obvious homomorphism from the productions in the new grammar to the productions in the old grammar, we can directly create the original parse tree, although we need to play some tricks with the semantic values in order to correctly attach them to the parse tree.

这篇关于如何重写上下文无关的语法,使其成为LR(1)?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆