LL和递归下降解析器之间的区别? [英] Difference between an LL and Recursive Descent parser?

查看:214
本文介绍了LL和递归下降解析器之间的区别?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我最近正在尝试自学解析器(针对语言/无上下文语法)的工作方式,除了一件事之外,大多数解析器似乎都有意义.我将注意力集中在 LL(k)语法上,针对这两个主要算法来说,它们是 递归下降解析器 (仅使用递归).

I've recently being trying to teach myself how parsers (for languages/context-free grammars) work, and most of it seems to be making sense, except for one thing. I'm focusing my attention in particular on LL(k) grammars, for which the two main algorithms seem to be the LL parser (using stack/parse table) and the Recursive Descent parser (simply using recursion).

据我所知,递归下降算法适用于所有LL(k)语法,甚至可能更多,而LL解析器适用于所有LL(k)语法.但是,递归下降解析器显然比LL解析器要简单得多(就像LL解析器比LR解析器一样简单.)

As far as I can see, the recursive descent algorithm works on all LL(k) grammars and possibly more, whereas an LL parser works on all LL(k) grammars. A recursive descent parser is clearly much simpler than an LL parser to implement, however (just as an LL one is simpler than an LR one).

所以我的问题是,使用两种算法时可能遇到的优点/问题是什么?既然它在相同的语法集上工作并且实施起来比较棘手,为什么为什么会选择LL而不是递归血统呢?

So my question is, what are the advantages/problems one might encounter when using either of the algorithms? Why might one ever pick LL over recursive descent, given that it works on the same set of grammars and is trickier to implement?

推荐答案

LL通常是一种比递归下降更有效的解析技术.实际上,在最坏的情况下,幼稚的递归下降解析器实际上将是 O(k ^ n)(其中 n 是输入大小).记忆等一些技术(产生 Packrat 解析器)可以改善并扩展解析器接受的语法类别,但始终存在空间折衷. LL解析器(据我所知)始终是线性时间.

LL is usually a more efficient parsing technique than recursive-descent. In fact, a naive recursive-descent parser will actually be O(k^n) (where n is the input size) in the worst case. Some techniques such as memoization (which yields a Packrat parser) can improve this as well as extend the class of grammars accepted by the parser, but there is always a space tradeoff. LL parsers are (to my knowledge) always linear time.

从另一方面来说,您的直觉是正确的,即递归下降解析器可以处理比LL更大的语法.递归下降可以处理任何LL(*)的语法(即 unlimited lookahead)以及少量的歧义语法.这是因为递归下降实际上是PEG的直接编码实现,即解析器表达语法.具体来说,析取运算符(a | b)不是可交换的,这意味着a | b不等于b | a.递归下降解析器将按顺序尝试每个替代方案.因此,如果a与输入匹配,即使b 将与输入匹配,它也会成功.这样就可以简单地通过正确排序析取符来处理经典的最长匹配"歧义,例如悬空的else问题.

On the flip side, you are correct in your intuition that recursive-descent parsers can handle a greater class of grammars than LL. Recursive-descent can handle any grammar which is LL(*) (that is, unlimited lookahead) as well as a small set of ambiguous grammars. This is because recursive-descent is actually a directly-encoded implementation of PEGs, or Parser Expression Grammar(s). Specifically, the disjunctive operator (a | b) is not commutative, meaning that a | b does not equal b | a. A recursive-descent parser will try each alternative in order. So if a matches the input, it will succede even if b would have matched the input. This allows classic "longest match" ambiguities like the dangling else problem to be handled simply by ordering disjunctions correctly.

话虽如此,使用递归下降实现LL(k)解析器以使其在线性时间内运行是可能的.这实际上是通过内联预测集来完成的,以便每个解析例程在恒定时间内确定给定输入的适当乘积.不幸的是,这种技术消除了整个语法类别的处理.一旦进入预测性分析,像悬空else这样的问题就不再那么容易解决了.

With all of that said, it is possible to implement an LL(k) parser using recursive-descent so that it runs in linear time. This is done by essentially inlining the predict sets so that each parse routine determines the appropriate production for a given input in constant time. Unfortunately, such a technique eliminates an entire class of grammars from being handled. Once we get into predictive parsing, problems like dangling else are no longer solvable with such ease.

至于为什么选择LL而不是递归下降,这主要是效率和可维护性的问题.递归下降解析器的实现明显更容易,但是它们通常更难维护,因为它们表示的语法不存在任何声明性形式.大多数非平凡的解析器用例都使用解析器生成器,例如ANTLR或Bison.有了这样的工具,算法是直接编码的递归下降还是表驱动的LL(k)都没关系.

As for why LL would be chosen over recursive-descent, it's mainly a question of efficiency and maintainability. Recursive-descent parsers are markedly easier to implement, but they're usually harder to maintain since the grammar they represent does not exist in any declarative form. Most non-trivial parser use-cases employ a parser generator such as ANTLR or Bison. With such tools, it really doesn't matter if the algorithm is directly-encoded recursive-descent or table-driven LL(k).

出于兴趣,还值得研究递归上升,一种以递归下降方式直接编码但可以处理任何LALR语法的解析算法.我还将深入研究解析器组合器,这是将递归下降解析器组合在一起的一种有效方式.

As a matter of interest, it is also worth looking into recursive-ascent, which is a parsing algorithm directly encoded after the fashion of recursive-descent, but capable of handling any LALR grammar. I would also dig into parser combinators, which are a functional way of composing recursive-descent parsers together.

这篇关于LL和递归下降解析器之间的区别?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆