How do I expand the item set for this grammar?


Question

I have this grammar:

E -> E + i
E -> i

The augmented grammar:

E' -> E
E -> E + i
E -> i

Now I try to expand the item set 0

I0)
 E' -> .E
+E  -> .E + i
+E  -> .i

Then, since I have .E in I0 I would expand it, but then I will get another E rule, and so on. This is my first doubt.

Assuming that this is alright, the next item sets are:

I0)
 E' -> .E
+E  -> .E + i
+E  -> .i

I1) (I moved the dot from I0, no variables at rhs of dot)
E' -> E.
E -> E. + i
E -> i.

I2) (I moved the dot from I1, no vars at rhs of dot)
E -> E +. i

I3) (I moved the dot from I2, also no vars)
E -> E + i.

Then I will have this DFA

I0 -(E, i)-> I1 -(+)-> I2 -(i)-> I3
              |                   |
              +-(∅)-> acpt <-(∅)--+

I'm missing something, because E -> E + i must accept i + i + .. but the DFA doesn't go back to I0, so it seems wrong to me. My guess is that it should have an I0 to I0 transition, but then I don't know what to do with the dot.

Answer

What you call the "expansion" of the item set is actually a closure; that's how it's described in all the descriptions of the algorithm I've seen (at least in textbooks). Like any closure operation, you just keep on doing the transformation until you reach a fixed-point; once you've included the productions for E, they're included.
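
Here is a minimal sketch of that fixed-point computation in Python, using the augmented grammar above; the item representation and all the names are mine, not part of any particular parser generator.

    # Productions of the augmented grammar from the question.
    GRAMMAR = {
        "E'": [["E"]],
        "E":  [["E", "+", "i"], ["i"]],
    }

    def closure(items):
        """Keep applying the expansion until nothing new is added (a fixed point)."""
        items = set(items)
        changed = True
        while changed:
            changed = False
            for _lhs, rhs, dot in list(items):
                # Dot in front of a non-terminal: add all of its productions
                # with the dot at the start, but only if they are not already there.
                if dot < len(rhs) and rhs[dot] in GRAMMAR:
                    for prod in GRAMMAR[rhs[dot]]:
                        new_item = (rhs[dot], tuple(prod), 0)
                        if new_item not in items:
                            items.add(new_item)
                            changed = True
        return items

    # I0 is the closure of { E' -> .E }: the two E productions are added on the
    # first pass, and the loop stops because they are already included.
    print(closure({("E'", ("E",), 0)}))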

But the essential point is that you're not building a DFA. You're building a pushdown automaton, and the DFA is just one part of it. The DFA is used for shift operations; when you shift a new terminal (because the current parse stack is not a handle), you do a state transition according to the DFA. But you also push the current state onto the PDA's stack.
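
As a sketch (the table name here is hypothetical, not the answer's code), the shift step is just:

    def shift(stack, state, token, table):
        # Save the current DFA state on the parse stack, then follow the
        # DFA transition on the shifted terminal.
        stack.append(state)
        return table[state][token]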

The interesting part is what happens when the automaton decides to perform a reduction, which replaces the right-hand side of a production with its left-hand side non-terminal. (The right-hand side at the top of the stack is called a "handle".) To do the reduction, you unwind the stack, popping each right-hand side symbol (and the corresponding DFA state) until you reach the beginning of the production. What that does is rewind the DFA to the state it was in before it shifted the first symbol from the right-hand side. (Note that it is only at this point that you know for sure which production was used.) With the DFA thus reset, you can now shift the non-terminal which was encountered, do the corresponding DFA transition, and continue with the parse.
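
Continuing the same sketch, a reduction pops one saved state per right-hand-side symbol and then takes the DFA transition on the recognised non-terminal (again, hypothetical names):

    def reduce_by(stack, state, lhs, rhs_len, table):
        # Pop one saved state per symbol of the right-hand side; the last
        # state popped is the one the DFA was in before the handle began.
        for _ in range(rhs_len):
            state = stack.pop()
        # Now "shift" the non-terminal we just recognised.
        stack.append(state)
        return table[state][lhs]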

The basis for this procedure is the fact that the parser stack is at all times a "viable prefix"; that is, a sequence of symbols which are the prefix of some right sentential form which can be derived from the start symbol. What's interesting about the set of viable prefixes for a context-free grammar is that it is a regular language, and consequently can be recognised by a DFA. The reduction procedure given above precisely represents this recognition procedure when handles are "pruned" (to use Knuth's original vocabulary).
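
To make that concrete for the grammar in the question (this enumeration is mine, not part of the original answer), the viable prefixes are exactly

    ε,   i,   E,   E +,   E + i

a finite, and therefore trivially regular, set; for most grammars the set is infinite but still regular.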

In that sense, it doesn't really matter what procedure is used to determine which handle is to be pruned, as long as it provides a valid answer. You could, for example, fork the parse every time a potential handle is noticed at the top of the stack, and continue in parallel with both forks. With clever stack management, this parallel search can be done in worst-case O(n³) time for any context-free grammar (and this can be reduced if the grammar is not ambiguous). That's a very rough description of Earley parsers.

But in the case of an LR(k) parser, we require that the grammar be unambiguous, and we also require that we can identify a reduction by looking at no more than k more symbols from the input stream, which is an O(1) operation since k is fixed. If at each point in the parse we know whether to reduce or not, and if so which reduction to choose, then the reductions can be implemented as I outlined above. Each reduction can be performed in O(1) time for a fixed grammar (since the maximum size of a right-hand side in a particular grammar is fixed), and since the number of reductions in a parse is linear in the size of the input, the entire parse can be done in linear time.
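
To tie this back to the question, here is a sketch of the conventional LR(0) construction for this grammar and a trace of the input i + i. Note that goto(I0, E) and goto(I0, i) are two different states, unlike the merged I1 in the question; the names S0–S4 are mine, and state S1 uses one symbol of lookahead to choose between shifting + and accepting.

    S0: E' -> .E   E -> .E + i   E -> .i
    S1 = goto(S0, E): E' -> E.   E -> E. + i
    S2 = goto(S0, i): E -> i.
    S3 = goto(S1, +): E -> E +. i
    S4 = goto(S3, i): E -> E + i.

    stack          state  input     action
    []             S0     i + i $   shift i
    [S0]           S2     + i $     reduce E -> i      (pop 1 state, rewind to S0, goto on E)
    [S0]           S1     + i $     shift +
    [S0, S1]       S3     i $       shift i
    [S0, S1, S3]   S4     $         reduce E -> E + i  (pop 3 states, rewind to S0, goto on E)
    [S0]           S1     $         accept (E' -> E)

Notice that nothing ever loops back to I0 in the DFA itself: each reduction pops the stack back to S0, and the goto on E lands in S1 again, so the repetition in i + i + .. comes from the S1 -> S3 -> S4 -> reduce -> S1 cycle, not from a transition out of I0.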

That was all a bit informal, but I hope it serves as an intuitive explanation. If you're interested in the formal proof, Donald Knuth's original 1965 paper (On the Translation of Languages from Left to Right) is easy to find and highly readable as these things go.
