LR(0)和SLR分析之间的区别是什么? [英] What is the difference between LR(0) and SLR parsing?

查看:1474
本文介绍了LR(0)和SLR分析之间的区别是什么?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我在我的编译器的概念,但是工作的我有点迷茫...... 谷歌搜索让我无处一个明确的答复。

I am working on my compilers concepts however I am a little confused... Googling got me nowhere to a definite answer.

是SLR和LR(0)分析器一个相同的?如果没有,什么区别?

Is SLR and LR(0) parsers one and same? If not, whats the difference?

推荐答案

这两个LR(0)和SLR(1)语法分析器的自下而上,定向,predictive解析器的。这意味着

Both LR(0) and SLR(1) parsers are bottom-up, directional, predictive parsers. This means that

  • 的解析器尝试应用制作反向减少输入句子回到开始符号(自下而上的)
  • 的解析器从扫描输入左到右(定向的)
  • 的解析器试图predict而不必看到输入的所有申请什么减少( predictive 的)
  • The parsers attempt to apply productions in reverse to reduce the input sentence back to the start symbol (bottom-up)
  • The parsers scan the input from left-to-right (directional)
  • The parsers attempt to predict what reductions to apply without necessarily seeing all of the input (predictive)

这两个LR(0)和SLR(1)移/通过将它们放在一摞减少解析器,这意味着它们处理输入数据流的标记,并在每个点要么<强>通过推入堆栈或移标记的减少终端和终结符号栈找回一些非终结符上面的一些序列。它可以表明,任何语法可以使用​​移位/减少解析器解析自下而上,但是分析器可能不确定性的。也就是说,解析器可能要猜测是否把移位或减少,并可能最终不得不回溯到认识到,它做出了错误的选择。无论多么强大的确定性移进/归约解析器您构建,它将永远无法解析所有的语法。

Both LR(0) and SLR(1) are shift/reduce parsers, meaning that they process the tokens of the input stream by placing them on a stack, and at each point either shifting a token by pushing it onto the stack or reducing some sequence of terminals and nonterminals atop the stack back to some nonterminal symbol. It can be shown that any grammar can be parsed bottom-up using a shift/reduce parser, but that parser might not be deterministic. That is, the parser may have to "guess" whether to apply a shift or reduction, and may end up having to backtrack to realize that it made the wrong choice. No matter how powerful a deterministic shift/reduce parser you construct, it will never be able to parse all grammars.

在一个确定的移进/归约分析器用于分析语法,它不能处理,它导致的转变/减少冲突减少/减少冲突,其中解析器可以进入一种状态,它不能告诉采取什么样的行动。在移进/归约冲突,它不能告诉它是否应该添加一个符号栈或执行上的堆栈顶部的符号有所减少。在一个降低/减少冲突,解析器知道它需要与一些非终结替换堆栈顶部的符号,但它不能告诉使用什么还原

When a deterministic shift/reduce parser is used to parse a grammar that it cannot handle, it results in shift/reduce conflicts or reduce/reduce conflicts, where the parser may enter a state in which it cannot tell what action to take. In a shift/reduce conflict, it cannot tell whether it should add another symbol to the stack or perform some reduction on the top symbols of the stack. In a reduce/reduce conflict, the parser knows that it needs to replace the top symbols of the stack with some nonterminal, but it can't tell what reduction to use.

我道歉,如果这是一个漫长的论述,但我们需要这是能够解决的LR(0)和SLR(1)分析之间的差异。一个LR(0)分析器的转变/减少使用前瞻零令牌来决定采取什么样的行动(因此0)分析器。这意味着,在分析器的任何配置,解析器必须有确定的动作来选择 - 要么它转移的特定符号或施加特定降低。如果有以往的两个或更多的选择,以使,所述解析器将失败,我们说的语法不是LR(0)

I apologize if this is a lengthy exposition, but we need this to be able to address the difference between LR(0) and SLR(1) parsing. An LR(0) parser is a shift/reduce parser that uses zero tokens of lookahead to determine what action to take (hence the 0). This means that in any configuration of the parser, the parser must have an unambiguous action to choose - either it shifts a specific symbol or applies a specific reduction. If there are ever two or more choices to make, the parser fails and we say that the grammar is not LR(0).

回想一下,这两个可能的LR冲突转移/减少,减少/减少。在这两种情况下,也有在LR(0)自动机可采取至少两个动作,并且它不能告诉使用哪个它们。由于冲突动作中的至少一个是一个减少,攻击的一个合理线将是尝试有解析器更注意,当它执行特定降低。更具体地,让我们假设解析器允许看输入的下一个标记,以确定是否应改变或降低。如果我们只允许解析器减少时有意义这样做的(关于很有意义一些定义),那么我们可以能够消除冲突通过具有自动机特别选择要么移位或减少一特定步骤。

Recall that the two possible LR conflicts are shift/reduce and reduce/reduce. In both of these cases, there are at least two actions that the LR(0) automaton could be taking, and it can't tell which of them to use. Since at least one of the conflicting actions is a reduction, a reasonable line of attack would be to try to have the parser be more careful about when it performs a particular reduction. More specifically, let's suppose that the parser is allowed to look at the next token of input to determine whether it should shift or reduce. If we only allow the parser to reduce when it "makes sense" to do so (for some definition of "makes sense"), then we may be able to eliminate the conflict by having the automaton specifically choose to either shift or reduce in a particular step.

在单反(1)(简化LR(1)),解析器被允许决定是否应改变或降低时看超前的一个令牌。特别是,当解析器想尝试减少一些形式的&RARR的; W(对于非终结符号A和字符串u),它着眼于输入的下一个标记。如果该令牌可以合法地出现在一些推导非终结符A之后,解析器降低。否则,它没有。这里的直觉是,在某些情况下,它是没有意义的尝试减少,是因为考虑到我们目前为止看到的标记和即将到来的令牌,还有就是减少可能永远是正确的没有可能的方式。

In SLR(1) ("Simplified LR(1)"), the parser is allowed to look at one token of lookahead when deciding whether it should shift or reduce. In particular, when the parser wants to try reducing something of the form A → w (for nonterminal A and string w), it looks at the next token of input. If that token could legally appear after the nonterminal A in some derivation, the parser reduces. Otherwise, it does not. The intuition here is that in some cases it makes no sense to attempt a reduction, because given the tokens we've seen so far and the upcoming token, there is no possible way that the reduction could ever be correct.

LR(0)和单反之间唯一的区别(1)这是额外的能力来帮助确定出现冲突时采取何种行动。正因为如此,可以由用于LR(0)解析器解析任何语法可以通过单反(1)解析器解析。然而,SLR(1)解析器可以解析语法比LR更大数目(0)

The only difference between LR(0) and SLR(1) is this extra ability to help decide what action to take when there are conflicts. Because of this, any grammar that can be parsed by an LR(0) parser can be parsed by an SLR(1) parser. However, SLR(1) parsers can parse a larger number of grammars than LR(0).

在实践中,虽然,SLR(1)仍然是一个相当薄弱的解析方法。更常见的是,你会看到LALR(1)(先行LR(1))所使用的解析器。他们也通过试图解决冲突中的LR(0)分析器的工作,但它们用于解决冲突的规则都远远precise高于单反使用(1),和文法因此更多数量是LALR (1)比是SLR(1)。要成为一个更具体一点,SLR(1)语法分析器试图通过看语法结构,了解何时发生变化,当减少更多的信息来解决冲突。 LALR(1)语法分析器同时查看语法和LR(0)语法分析器来获得更具体的什么时候发生变化,当以减少信息。由于LALR(1)可以看看LR(0)分析器的结构,它可以更precisely确定当某些冲突是虚假的。 Linux的应用 YACC 野牛,默认情况下,产生LALR(1)语法分析器。

In practice, though, SLR(1) is still a fairly weak parsing method. More commonly, you will see LALR(1) ("Lookahead LR(1)") parsers being used. They too work by trying to resolve conflicts in an LR(0) parser, but the rules they use for resolving conflicts are far more precise than those used in SLR(1), and consequently a much larger number of grammars are LALR(1) than are SLR(1). To be a bit more specific, SLR(1) parsers try to resolve conflicts by looking at the structure of the grammar to learn more information about when to shift and when to reduce. LALR(1) parsers look at both the grammar and the LR(0) parser to get even more specific information about when to shift and when to reduce. Because LALR(1) can look at the structure of the LR(0) parser, it can more precisely identify when certain conflicts are spurious. The Linux utilities yacc and bison, by default, produce LALR(1) parsers.

从历史上看,LALR(1)是典型的构造,通过对更强大的LR的依赖(1)语法分析器不同的方法解析器,所以你会经常看到LALR(1)中描述的方式。要理解这一点,我们需要谈谈LR(1)语法分析器。在用于LR(0)分析器,分析器的工作原理通过跟踪它可能是在生产的中间。一旦发现,它达到了生产结束时,它知道要尽量减少。然而,解析器可能无法告诉它是否是在一条生产的结束和另一个的中间,这导致位移/减少冲突,或其中两种不同的生产已达到的端部(一个减少/减少冲突)。在LR(0),这立即导致一个冲突和解析器失败。在单反(1)或LALR(1),分析器然后进行转移或基于先行的下一个标记上减少的决定。

Historically, LALR(1) parsers were typically constructed through a different method that relied on the far more powerful LR(1) parser, so you will often see LALR(1) described that way. To understand this, we need to talk about LR(1) parsers. In an LR(0) parser, the parser works by keeping track of where it might be in the middle of a production. Once it has found that it's reached the end of a production, it knows to try to reduce. However, the parser might not be able to tell whether it's in at the end of one production and the middle of another, which leads to a shift/reduce conflict, or which of two different productions it has reached the end of (a reduce/reduce conflict). In LR(0), this immediately leads to a conflict and the parser fails. In SLR(1) or LALR(1), the parser then makes the decision to shift or reduce based on the next token of lookahead.

在一个LR(1)语法分析器,分析器跟踪更多的信息,因为它运行。除了跟踪什么是正在使用的生产解析器相信,它会跟踪可能会出现哪些可能的令牌,制作完毕后。因为解析器会跟踪这些信息在每一个步骤,而不是只当它需要做出的决定,LR(1)语法分析器基本功能更强大,precise比任何LR(0),SLR(1 ),或LALR(1)我们已经谈过到目前为止解析器。 LR(1)是一个非常强大的分析技术,它可以使用一些棘手的数学的,可以确定性地通过任何移来解析任何语言/ reduce解析器显示有一些语法,可以用解析LR(1)的自动机。 (请注意,这并不意味着所有的语法,可以确定性地解析是LR(1),这只能说是可以解析的语言确定性有一定的LR(1)语法)。但是,这种权力是有代价的,而产生的LR(1)语法分析器,可能需要这么多的信息来操作,它不可能在实践中使用。一个LR(1)语法分析器为一个真正的编程语言,例如,可能需要几十到几百正确操作的更多信息兆。出于这个原因,LR(1)一般不使用在实践中,和较弱解析器像LALR(1)或SLR(1)被用来代替

In an LR(1) parser, the parser keeps track of additional information as it operates. In addition to keeping track of what production the parser believes is being used, it keeps track of what possible tokens might appear after that production is completed. Because the parser keeps track of this information at each step, and not just when it needs to make the decision, the LR(1) parser is substantially more powerful and precise than any of the LR(0), SLR(1), or LALR(1) parsers we've talked about so far. LR(1) is an extremely powerful parsing technique, and it can be shown using some tricky math that any language that could be parsed deterministically by any shift/reduce parser has some grammar that could be parsed with an LR(1) automaton. (Note that this does not mean that all grammars that can be parsed deterministically are LR(1); this only says that a language that could be parsed deterministically has some LR(1) grammar). However, this power comes at a price, and a generated LR(1) parser may require so much information to operate that it can't possibly be used in practice. An LR(1) parser for a real programming language, for example, might require tens to hundreds of megabytes of additional information to operate correctly. For this reason, LR(1) isn't typically used in practice, and weaker parsers like LALR(1) or SLR(1) are used instead.

最近,称为GLR(0)(广义LR(0))已经得到普及一个新的解析算法。而不是试图解决出现在LR(0)分析器的冲突,GLR(0)分析器,而不是作品试图并行所有可能的选择。使用一些聪明的技巧,这可以做出许多语法运行效率很低。此外,GLR(0)可以解析的在任何上下文无关文法所有的,这是不能用的LR(k)的解析器对任意k被解析甚至语法。其他解析器能够做这个问题,以及对(例如,厄雷解析器或CYK解析器),虽然GLR(0)趋于更快在实践中

More recently, a new parsing algorithm called GLR(0) ("Generalized LR(0)") has gained popularity. Rather than trying to resolve the conflicts that appear in an LR(0) parser, the GLR(0) parser instead works by trying all possible options in parallel. Using some clever tricks, this can be made to run very efficiently for many grammars. Moreover, GLR(0) can parse any context-free grammar at all, even grammars that can't be parsed by an LR(k) parser for any k. Other parsers are capable of doing this as well (for example, the Earley parser or a CYK parser), though GLR(0) tends to be faster in practice.

如果您有兴趣了解更多,在这个夏天,我教的介绍编译过程中,花了不到2周谈论解析技术。如果你想获得一个更严格的介绍LR(0),SLR(1),和许多其他强大的解析技术,你可能会喜欢我的演讲幻灯片和家庭作业有关分析。所有的课程资料都可以的 在这里我的个人网站

If you're interested in learning more, over this summer I taught an introductory compilers course and spent just under two weeks talking about parsing techniques. If you'd like to get a more rigorous introduction to LR(0), SLR(1), and a host of other powerful parsing techniques, you might enjoy my lecture slides and homework assignments about parsing. All of the course materials are available here on my personal site.

希望这有助于!

这篇关于LR(0)和SLR分析之间的区别是什么?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆